Problems in Dynamic Programming - 1

Problems from the book - Dynamic Programming and Optimal Control.

By Vamshi Jandhyala in mathematics

February 1, 2021

Problems from the book Dynamic Programming and Optimal Control - Vol 1.

Problem 1.14

A farmer annually producing $x_k$ units of a certain crop stores $(1-u_k)x_k$ units of his production, where $0 \leq u_k \leq 1$, and invests the remaining $u_kx_k$ units, thus increasing the next year’s production to a level $x_{k+1}$ given by

$$ x_{k+1} = x_k + w_ku_kx_k, k = 0,1,\dots,N-1. $$

The scalars $w_k$ are independent random variables with identical probability distributions that do not depend either on $x_k$ or $u_k$. Furthermore, $\mathbb{E}{w_k}=\overline{w} > 0$. The problem is to find the optimal investment policy that maximizes the total expected product stored over $N$ years

$$ \begin{align*} \mathbb{E}_{w_k} \left\{x_N + \sum_{k=0}^{N-1}(1-u_k)x_k \right\} \end{align*} $$

Show the optimality of the following policy that consists of the constant functions:

a. if $\overline{w} > 1, \mu_0^*(x_0)= \cdots =\mu_{N-1}^*(x_{N-1})=1.$

b. if $0 < \overline{w} < 1/N, \mu_0^*(x_0)= \cdots =\mu_{N-1}^*(x_{N-1})=0.$

c. if $1/N \leq \overline{w} \leq 1$,

$$ \mu_0^*(x_0)= \cdots =\mu_{N-\overline{k}-1}^*(x_{N-\overline{k}-1})=1\\
\mu_{N-\overline{k}}^*(x_{N-\overline{k}})= \cdots = \mu^*(x_{N-1}) =0, $$

where $\overline{k}$ is such that $1/(\overline{k}+1) < \overline{w} < 1/\overline{k}$.

Solution

We want to maximize,

$$ J_{\pi}(x_0) = \mathbb{E}_{w_k} \left\{x_N + \sum_{k=0}^{N-1}(1-u_k)x_k \right\} $$

Tail subproblems of length $1$: Assume that production level at the beginning of period $N-1$ is $x_{N-1}$. Clearly, no matter what happened in the past, the farmer should maximize over $0 \leq u_{N-1} \leq 1$, the expected stored produce $\mathbb{E}{x_{N-1}(1-u_{N-1}) + x_N}$ which can be written as

$$ \begin{align*} J_{N-1}(x_{N-1}) &= \max_{\substack{0 \leq u_{N-1}\leq 1}} \mathbb{E}{x_{N-1}(1-u_{N-1}) + x_{N-1}(1+w_{N-1}u_{N-1})} \\
&= \max_{\substack{0 \leq u_{N-1}\leq 1}} { x_{N-1}\left(2 + u_{N-1}(\overline{w}-1)\right)} \\
&= \begin{cases} x_{N-1}(1 + \overline{w}), \text{if }\overline{w} > 1 \text{ and } \mu^*_{N-1}(x_{N-1}) = 1 \\
2x_{N-1}, \text{if } 0 < \overline{w} < 1/N \text{ and } \mu^*_{N-1}(x_{N-1}) = 0 \\
x_{N-1}(2 - \mu^*_{N-1}(x_{N-1})(1 -\overline{w})), \text{if } 1/N < \overline{w} < 1 \\
\end{cases} \end{align*} $$

Tail subproblems of length $2$: Assume that production level at the beginning of period $N-2$ is $x_{N-2}$. It is clear that the farmer should maximize the expected stored produce during the period $N-2$ + expected stored produce of period $N-1$ given that an optimal policy will be used in period $N-1$, which is equal to

$$ \begin{align*} J_{N-2}(x_{N-2}) &= \max_{\substack{0 \leq u_{N-1}\leq 1}} \mathbb{E}{x_{N-2}(1-u_{N-2}) + J_{N-1}(x_{N-1})} \\
&= \begin{cases} x_{N-2}(1 + \overline{w})^2, \text{if }\overline{w} > 1 \text{ and } \mu^*_{N-2}(x_{N-2}) = 1 \\
3x_{N-2}, \text{if } 0 < \overline{w} < 1/N \text{ and } \mu^*_{N-2}(x_{N-2}) = 0 \\
x_{N-2}(3 - \mu^*_{N-2}(x^*_{N-2})(1 - 2\overline{w})), \text{if } 1/N < \overline{w} < 1 \\
\end{cases} \end{align*} $$

Tail subproblems of length $N-k$: Similarly, at the beginning of period $k$ the production level is $x_k$. It is clear that the farmer should maximize the expected stored produce during the period $k$ + expected stored produce of periods $k+1,\dots,N-1$ given that an optimal policy will be used in these periods.

By denoting by $J_k(x_k)$ the optimal stored produce, we have

$$ \begin{align*} J_{k}(x_{k}) &= \max_{\substack{0 \leq u_{N-1}\leq 1}} \mathbb{E}{x_k(1-u_k) + J_{k+1}(x_{k+1})} \\
&= \begin{cases} x_k(1 + \overline{w})^{N-k}, \text{if }\overline{w} > 1 \text{ and } \mu^*_{k}(x_k) = 1 \\
(N-k+1)x_k, \text{if } 0 < \overline{w} < 1/N \text{ and } \mu^*_{k}(x_k) = 0 \\
x_k(N-k+1 - \mu^*_k(x^*_k)(1 - (N-k)\overline{w})), \text{if } 1/N < \overline{w} < 1 \\
\end{cases} \end{align*} $$

$$ \begin{align*} \mu^*_k(x_{k}) = 0 \end{align*} $$

if $1-(N-k)\overline{w} \leq 0$, i.e. $\overline{w} \leq 1/(N-k)$ and

$$ \begin{align*} \mu^*_k(x_{k}) = 1 \end{align*} $$

if $1-(N-k)\overline{w} > 0$, i.e. $ \overline{w} > 1/(N-k)$