
Dynamic Programming and Optimal Control: Chapter 3 Exercises


3.1 Solve the problem of Example 3.2.1 for the case where the cost function is (x(T))^2+\int_0^T(u(t))^2\,dt. Also, calculate the cost-to-go function J^*(t,x) and verify that it satisfies the HJB equation.
Solution. The scalar system is \dot x(t)=u(t) with the constraint |u(t)|\leq 1 for all t\in [0,T]. The Hamiltonian is H(x,u,p)=u^2+pu, and the adjoint equation \dot p(t)=-\nabla_xH=0 shows that p(t) is a constant, with terminal condition p(T)=\nabla h(x^*(T))=2x^*(T). Minimizing the Hamiltonian gives the constant control u^*=-p/2 when |p|\leq 2 (and u^*=-\mathrm{sgn}(p) otherwise). In the unconstrained case, x^*(T)=x(0)+u^*T together with p=2x^*(T) yields u^*=-\frac{x(0)}{1+T}, which is admissible whenever |x(0)|\leq 1+T. Applying the same argument from an initial condition (t,x) gives the cost-to-go J^*(t,x)=\frac{x^2}{1+T-t}\qquad\text{for }|x|\leq 1+T-t. To verify the HJB equation, compute \frac{\partial J^*}{\partial t}=\frac{x^2}{(1+T-t)^2} and \frac{\partial J^*}{\partial x}=\frac{2x}{1+T-t}, so that \min_{|u|\leq 1}\left[u^2+\frac{\partial J^*}{\partial t}+\frac{\partial J^*}{\partial x}u\right] is attained at u^*=-\frac{x}{1+T-t} with value \frac{x^2}{(1+T-t)^2}-\frac{x^2}{(1+T-t)^2}=0, as required. \qquad\Box
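As a numerical sanity check (a sketch only: the horizon T and the sample points below are arbitrary), one can verify by finite differences that the candidate cost-to-go J^*(t,x)=x^2/(1+T-t), valid for |x|\leq 1+T-t, makes the HJB residual vanish:

```python
import math

T = 2.0  # horizon (arbitrary illustrative value)

def J(t, x):
    """Candidate cost-to-go J*(t,x) = x^2 / (1 + T - t), valid for |x| <= 1 + T - t."""
    return x * x / (1.0 + T - t)

def hjb_residual(t, x, h=1e-5):
    """HJB residual min_{|u|<=1} [u^2 + dJ/dt + (dJ/dx) u], derivatives by central differences."""
    Jt = (J(t + h, x) - J(t - h, x)) / (2 * h)
    Jx = (J(t, x + h) - J(t, x - h)) / (2 * h)
    u = max(-1.0, min(1.0, -Jx / 2.0))  # unconstrained minimizer -Jx/2, clipped to [-1, 1]
    return u * u + Jt + Jx * u

# The residual should vanish wherever |x| <= 1 + T - t.
for (t, x) in [(0.0, 0.5), (1.0, -1.2), (1.5, 0.9)]:
    print(abs(hjb_residual(t, x)) < 1e-6)  # True in each case
```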

3.2 A young investor has earned in the stock market a large amount of money S and plans to spend it so as to maximize his enjoyment through the rest of his life without working. He estimates that he will live exactly T more years and that his capital x(t) should be reduced to zero at time T, i.e., x(T)=0. Also he models the evolution of his capital by the differential equation \frac{dx(t)}{dt}=\alpha x(t)-u(t), where x(0)=S is his initial capital, \alpha>0 is a given interest rate, and u(t)\geq 0 is his rate of expenditure. The total enjoyment he will obtain is given by \int_0^Te^{-\beta t}\sqrt{u(t)}\,dt, where \beta is some positive scalar, which serves to discount future enjoyment. Find the optimal \{u(t)\mid t\in[0,T]\}.
Solution. We have f(x,u)=\alpha x-u and g(t,x,u)=e^{-\beta t}\sqrt{u}, giving the Hamiltonian H(x,u,p)=e^{-\beta t}\sqrt{u}+p(\alpha x-u), and the adjoint equation is
\dot p(t)=-\alpha p(t), yielding p(t)=C_1e^{-\alpha t}\qquad\text{for some constant }C_1. Notice that here the terminal state x(T)=0 is given, so the condition p(T)=\nabla h(x^*(T))=0 is no longer required; C_1 is instead determined from the boundary conditions on x^*.
\qquad The optimal control is obtained by maximizing the Hamiltonian with respect to u. Setting \frac{\partial H}{\partial u}=\frac{e^{-\beta t}}{2\sqrt{u}}-C_1e^{-\alpha t}=0 gives \sqrt{u^*(t)}=\frac{e^{(\alpha-\beta)t}}{2C_1}, so that
u^*(t)=\arg\max_{u\geq 0}\left[e^{-\beta t}\sqrt{u}+C_1e^{-\alpha t}(\alpha x^*-u)\right]=\frac{e^{2(\alpha-\beta)t}}{4C_1^2}\qquad (3.2.1) Then by the differential equation of the system we get \dot{x}^*(t)=\alpha x^*(t)-\frac{e^{2(\alpha-\beta)t}}{4C_1^2} Solving this linear equation (assuming \alpha\neq 2\beta), we obtain
x^*(t)=C_2e^{\alpha t}+\frac{e^{2(\alpha-\beta)t}}{4C_1^2(2\beta-\alpha)}\qquad\text{for some constant }C_2 Together with the initial condition x^*(0)=S and the final condition x^*(T)=0, this determines the constants: 4C_1^2(2\beta-\alpha)=\frac{1-e^{(\alpha-2\beta)T}}{S},\qquad C_2=-\frac{e^{(\alpha-2\beta)T}}{4C_1^2(2\beta-\alpha)}. So u^*(t) in (3.2.1) gives the optimal control. \qquad\qquad\qquad\qquad\qquad\Box
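As a numerical sanity check (a sketch: the values of \alpha, \beta, T, S below are chosen arbitrarily, with \alpha<2\beta assumed), one can compute the constants from the boundary conditions and integrate \dot x=\alpha x-u^*(t) forward to confirm that the capital indeed hits zero at time T:

```python
import math

# Arbitrary illustrative parameters (not from the text); note alpha < 2*beta here.
alpha, beta, T, S = 0.05, 0.1, 10.0, 1.0

# Constants from the boundary conditions x(0)=S, x(T)=0:
#   4*C1^2*(2*beta - alpha) = (1 - e^{(alpha-2beta)T}) / S
D = (1.0 - math.exp((alpha - 2 * beta) * T)) / S  # = 4*C1^2*(2*beta - alpha)
four_C1_sq = D / (2 * beta - alpha)               # = 4*C1^2

def u_star(t):
    """Optimal expenditure rate u*(t) = e^{2(alpha-beta)t} / (4 C1^2), eq. (3.2.1)."""
    return math.exp(2 * (alpha - beta) * t) / four_C1_sq

def rhs(t, x):
    """Capital dynamics dx/dt = alpha*x - u*(t)."""
    return alpha * x - u_star(t)

# Classical RK4 integration of the closed-loop capital trajectory from x(0)=S.
n = 1000
dt = T / n
x, t = S, 0.0
for _ in range(n):
    k1 = rhs(t, x)
    k2 = rhs(t + dt / 2, x + dt / 2 * k1)
    k3 = rhs(t + dt / 2, x + dt / 2 * k2)
    k4 = rhs(t + dt, x + dt * k3)
    x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t += dt

print(x)  # capital at time T: should be ~0
```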

3.9 Use the Minimum Principle to solve the linear-quadratic problem of Example 3.2.2.
Solution. The n-dimensional linear-quadratic system is given by
\dot x(t)=Ax(t)+Bu(t) where A and B are given matrices, and the quadratic cost
x(T)'Q_Tx(T)+\int_0^T\left(x(t)'Qx(t)+u(t)'Ru(t)\right)dt where the matrices Q_T and Q are symmetric positive semidefinite, and the matrix R is symmetric positive definite.
\qquad The Hamiltonian here is H(x,u,p)=x'Qx+u'Ru+p'(Ax+Bu) and the adjoint equation \dot p(t)=-\nabla_xH gives
\dot p(t)=-\left(2Qx^*(t)+A'p(t)\right)\qquad (1) with the terminal condition p(T)=\nabla h(x^*(T))=2Q_Tx^*(T). The optimal control can be obtained by minimizing the Hamiltonian with respect to u, yielding
u^*(t)=\arg\min_{u}\left\{x^*(t)'Qx^*(t)+u'Ru+p'(Ax^*(t)+Bu)\right\} Since \nabla_u\{x^*(t)'Qx^*(t)+u'Ru+p'(Ax^*(t)+Bu)\}=2Ru+B'p, we get u^*(t)=-\frac{1}{2}R^{-1}B'p(t)\qquad(2) which, together with the system equation, leads to \dot x^*(t)=Ax^*(t)-\frac{1}{2}BR^{-1}B'p(t)\qquad (3) Equations (1) and (3) form a linear two-point boundary value problem in (x^*,p). It is solved by the ansatz p(t)=2K(t)x^*(t) for a symmetric matrix K(t): substituting into (1) and (3) and matching coefficients of x^*(t) gives the Riccati equation \dot K(t)=-K(t)A-A'K(t)+K(t)BR^{-1}B'K(t)-Q,\qquad K(T)=Q_T, and then (2) becomes the linear feedback u^*(t)=-R^{-1}B'K(t)x^*(t), in agreement with Example 3.2.2. \qquad\Box
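To make the Riccati step concrete, here is a minimal scalar sketch (the data A=0, B=Q=R=1, Q_T=0, T=1 are chosen purely for illustration). In this case the Riccati equation reduces to \dot K=K^2-1 with K(1)=0, whose closed-form solution is K(t)=\tanh(1-t), so backward integration should return K(0)=\tanh(1):

```python
import math

# Scalar illustrative data (assumed for this sketch, not from the text).
A, B, Q, R, QT, T = 0.0, 1.0, 1.0, 1.0, 0.0, 1.0

def riccati_rhs(K):
    """dK/dt = -K A - A' K + K B R^{-1} B' K - Q (scalar case)."""
    return -K * A - A * K + K * B * (1.0 / R) * B * K - Q

# Integrate backward from K(T) = Q_T to K(0) with classical RK4 (step -dt).
n = 1000
dt = T / n
K = QT
for _ in range(n):
    k1 = riccati_rhs(K)
    k2 = riccati_rhs(K - dt / 2 * k1)
    k3 = riccati_rhs(K - dt / 2 * k2)
    k4 = riccati_rhs(K - dt * k3)
    K -= dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(K)             # K(0) from backward integration
print(math.tanh(T))  # closed-form K(0) = tanh(T) for this data
```

The resulting feedback gain is then u^*(t) = -R^{-1}B'K(t)x^*(t).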

3.11 Use the discrete-time Minimum Principle to solve Exercise 1.14 of Chapter 1, assuming that each w_k is fixed at a known deterministic value.
Solution. Let w_k=\overline{w} for some fixed number \overline{w}>0. The system is characterized by x_{k+1}=f_k(x_k,u_k)=x_k+\overline{w}u_kx_k and the cost function becomes J(u)=x_N+\sum_{k=0}^{N-1}(1-u_k)x_k Then the Hamiltonian function can be written as H_k(x_k,u_k,p_{k+1})=(1-u_k)x_k+p_{k+1}(x_k+\overline{w}u_kx_k) By the discrete-time Minimum Principle, for k=0,1,\cdots,N-1, we have u_k^*=\arg\max_{u_k}H_k(x_k^*,u_k,p_{k+1})=\arg\max_{u_k}\left[(p_{k+1}\overline{w}-1)u_kx_k+(p_{k+1}+1)x_k\right]=\begin{cases} 1, & \text{ if }\; p_{k+1}\overline{w}>1\\ 0, & \text{ if }\; p_{k+1}\overline{w}\leq1 \end{cases}\qquad(3.11.1) using x_k^*>0. On the other hand, for k=0,1,\cdots,N-1, the adjoint equation reads p_k=\nabla_{x_k}H_k(x_k^*,u_k^*,p_{k+1})=(p_{k+1}\overline{w}-1)u_k^*+p_{k+1}+1\qquad(3.11.2) with the terminal condition p_N=\nabla g_N(x_N^*)=1.
Combining (3.11.1) with (3.11.2), we obtain the following implications:
p_{k+1}\overline{w}>1\;\Rightarrow\;u_k^*=1\;\Rightarrow\;p_k=(\overline{w}+1)p_{k+1}\qquad (3.11.3) p_{k+1}\overline{w}\leq1\;\Rightarrow\;u_k^*=0\;\Rightarrow\;p_k=p_{k+1}+1\;\;\qquad (3.11.4) So by induction, we can conclude the following optimal control results:
(1) If \overline{w}>1, u_0^*=\cdots=u_{N-1}^*=1.
(2) If 0<\overline{w}<1/N, u_0^*=\cdots=u_{N-1}^*=0.
(3) If 1/N\leq\overline{w}\leq 1, u_0^*=\cdots=u_{N-\bar{k}-1}^*=1,\quad u_{N-\bar{k}}^*=\cdots=u_{N-1}^*=0, where \bar{k} is such that 1/(\bar{k}+1)<\overline{w}\leq 1/\bar{k}. \qquad\qquad\qquad\qquad\qquad\qquad\qquad\Box
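The backward recursion (3.11.1)-(3.11.2) is easy to run directly; the following sketch (with arbitrary sample values of \overline{w} and N) reproduces the three cases above:

```python
def optimal_controls(w_bar, N):
    """Backward recursion (3.11.1)-(3.11.2): returns the list (u_0*, ..., u_{N-1}*)."""
    p = 1.0                       # terminal condition p_N = 1
    u = [0] * N
    for k in range(N - 1, -1, -1):
        if p * w_bar > 1:         # (3.11.1): invest
            u[k] = 1
            p = (w_bar + 1) * p   # (3.11.3)
        else:                     # do not invest
            u[k] = 0
            p = p + 1             # (3.11.4)
    return u

print(optimal_controls(2.0, 5))  # case (1), w_bar > 1:    [1, 1, 1, 1, 1]
print(optimal_controls(0.1, 5))  # case (2), w_bar < 1/N:  [0, 0, 0, 0, 0]
print(optimal_controls(0.3, 5))  # case (3), k_bar = 3:    [1, 1, 0, 0, 0]
```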

3.12 Use the discrete-time Minimum Principle to solve Exercise 1.15 of Chapter 1, assuming that each \gamma_k and \delta_k is fixed at a known deterministic value.
