Doklady of the Academy of Sciences of the USSR
1958, Volume 119, No. 6
MATHEMATICS
V. G. Boltyanskii
THE MAXIMUM PRINCIPLE IN THE THEORY OF OPTIMAL PROCESSES
(Presented by Academician P. S. Aleksandrov, 19 XII 1957)
The present note is connected with our joint work \((^1)\) on the theory of optimal processes. The maximum principle, stated by L. S. Pontryagin as a hypothesis (on the basis of R. V. Gamkrelidze’s investigation of second variations, see \((^1)\)), is here proved completely as a necessary condition for optimality. To obtain this result one has to carry out anew the constructions indicated in note \((^1)\), considering other “perturbations” (variations).
We consider the motion of a point \(x=(x^1,\ldots,x^n)\) in an \(n\)-dimensional phase space \(X\) according to the law:
\[ \dot{x}^i=f^i(x^1,\ldots,x^n,u)=f^i(x,u),\qquad i=1,\ldots,n, \tag{1} \]
where \(u\) is a control parameter, prescribed to vary within some domain \(U\). The “domain” \(U\) may be any topological space, for example an arbitrary closed set of the \(r\)-dimensional space of the variable \(u=(u^1,\ldots,u^r)\); the functions \(f^i\) are assumed to depend continuously on the pair of variables \(x,u\) and to be continuously differentiable with respect to \(x^1,\ldots,x^n\). If a control law is given, i.e., a variable point \(u(t)\in U\) is given, then system (1) uniquely determines the law of motion of the point. In doing so we shall impose on the control \(u(t)\) the condition of piecewise continuity.
As in \((^1)\), the problem is posed: to choose the control \(u(t)\) in such a way that the point passes from a given position \(x_0\in X\) to another given position \(x_1\in X\) in minimum time. The desired control \(u(t)\) and the corresponding trajectory \(x(t)\) are called optimal.
Let \(u(t)\), \(t_0\leq t\leq t_1\), be some control taking values in the domain \(U\). Choose certain instants of time \(\tau_1,\ldots,\tau_k\), where \(t_0<\tau_1<\cdots<\tau_k<t_1\), and let \(\delta t_{i,j}\), \(j=1,\ldots,s_i,\ i=1,\ldots,k\), be arbitrary nonnegative numbers, and \(v_{i,j}\) arbitrary points of the domain \(U\). Denote by \(I_{i,j}\) the interval
\[ \tau_i+\varepsilon(\delta t_{i,1}+\cdots+\delta t_{i,j-1}) \leq t< \tau_i+\varepsilon(\delta t_{i,1}+\cdots+\delta t_{i,j}) \]
and set
\[ \bar{u}(t)= \begin{cases} v_{i,j}, & \text{if } t\in I_{i,j},\\ u(t), & \text{if } t \text{ belongs to none of the intervals } I_{i,j}. \end{cases} \]
Here \(\varepsilon\) is a positive number on which the control \(\bar{u}(t)\) depends. In what follows \(\varepsilon\) will be regarded as a variable quantity of the first order of smallness; quantities of higher order of smallness will be discarded and denoted by dots. We shall say that the control \(\bar{u}(t)\) is obtained by a variation of the control \(u(t)\) near the instants of time \(\tau_1,\ldots,\tau_k\).
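The construction of \(\bar{u}(t)\) can be sketched in code as follows. This is only an illustration: the function name `varied_control` and its argument names are hypothetical, controls are represented as Python callables, and \(\varepsilon\) is assumed small enough that the intervals \(I_{i,j}\) attached to different \(\tau_i\) do not overlap and lie inside \([t_0,t_1]\).

```python
def varied_control(u, taus, dts, vs, eps):
    """Build u_bar from u (hypothetical helper, not part of the note).

    taus[i]   -- instant tau_i
    dts[i][j] -- nonnegative number delta t_{i,j}
    vs[i][j]  -- point v_{i,j} of U
    eps       -- small positive parameter epsilon
    """
    intervals = []
    for tau, dt_row, v_row in zip(taus, dts, vs):
        start = tau
        for dt, v in zip(dt_row, v_row):
            end = start + eps * dt          # right end of I_{i,j}
            intervals.append((start, end, v))
            start = end                     # I_{i,j+1} begins where I_{i,j} ends

    def u_bar(t):
        for a, b, v in intervals:
            if a <= t < b:                  # I_{i,j} is closed on the left, open on the right
                return v
        return u(t)                         # outside all I_{i,j} the control is unchanged

    return u_bar
```

For instance, `varied_control(lambda t: 0.0, [1.0], [[1.0, 2.0]], [[5.0, 6.0]], 0.1)` replaces the zero control by the value 5 on \([1,\,1.1)\) and by the value 6 on the adjacent interval of length 0.2.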
Let \(\bar{x}(t)\) denote the trajectory issuing from the same point \(x_0\) and corresponding to the trial control \(\bar{u}(t)\), and consider the vector
\[ \bar{x}(t_1-\varepsilon\delta t)-x(t_1), \tag{2} \]
where \(\delta t\) is a nonnegative number. We denote by \(\varepsilon\Delta x\) the part of the vector (2) that is linear with respect to \(\varepsilon\):
\[ \bar{x}(t_1-\varepsilon\delta t)=x(t_1)+\varepsilon\Delta x+\cdots . \tag{3} \]
If we now take all possible controls \(\bar{u}(t)\) obtained by varying the control \(u(t)\), and independently of this assign to \(\delta t\) all possible nonnegative values, then the vectors \(\Delta x\) issuing from the point \(x_1=x(t_1)\) fill out a certain cone with vertex at the point \(x_1\). We shall call it the cone of attainability at the point \(x_1\) and denote it by \(K\). By means of fairly simple arguments one proves the validity of the following proposition:
Lemma. The cone of attainability \(K\) is convex.
Next, the following theorem holds (which is a strengthening of the theorem, stated in note \((^1)\), that the dimension of the linear manifold \(P'\) constructed there does not exceed \(n-1\)):
Theorem 1. If the cone of attainability \(K\) fills the whole space \(X\) of the variables \(x^1,\ldots,x^n\), then the trajectory \(x(t)\) and the corresponding control \(u(t)\), \(t_0\leq t\leq t_1\), are not optimal.
Thus, if \(x(t), u(t)\) are optimal, then the convex cone \(K\) does not fill the whole space \(X\), and therefore is situated entirely in one half-space determined by some hyperplane passing through the point \(x_1\). Let
\[ a_\alpha(x^\alpha-x_1^\alpha)=0 \]
be the equation of such a hyperplane. We may assume here (changing, if necessary, the signs of all the numbers \(a_\alpha\)) that the whole cone \(K\) is situated in the negative half-space \(a_\alpha(x^\alpha-x_1^\alpha)\leq 0\).
If now we put (as in work \((^1)\)) \(a_\alpha\varphi^\alpha_\beta(t_1)=b_\beta,\ b_\beta\psi^\beta_\gamma(t)=\psi_\gamma(t)\), then we obtain \(\psi_\alpha(t_1)=a_\alpha\), and the equation of the negative half-space takes the form \(\psi_\alpha(t_1)(x^\alpha-x_1^\alpha)\leq 0\). In connection with this we obtain the following necessary condition for optimality:
\[ \psi_\alpha(t_1)\Delta x^\alpha\leq 0 \tag{4} \]
(where \(\Delta x\) is determined by equality (3)) for any trial control \(\bar{u}(t)\) and any \(\delta t\geq 0\).
This condition may be regarded as a generalization of conditions (6), (7) of work \((^1)\). Under the variations \(\delta u(t)\) considered in work \((^1)\), one also obtains a certain cone of attainability, which, however, may turn out to be smaller than the cone \(K\) constructed here. Accordingly, it may turn out that the numbers \(a_\alpha\) and the functions \(\psi_\alpha(t)\) satisfying all the conditions given in work \((^1)\) do not satisfy condition (4). In other words, condition (4) is somewhat stronger than conditions (6), (7) of work \((^1)\).
From (4) it follows easily that the inequality
\[ \psi_\alpha(t_1)f^\alpha(x(t_1),u(t_1))\geq 0 \]
holds, or, in other words, that the function
\[ H(x,\psi,u)=\psi_\alpha f^\alpha(x,u) \]
takes a nonnegative value at the point \((x(t_1),\psi(t_1),u(t_1))\). As L. S. Pontryagin observed, the function \(H(x,\psi,u)\) plays an essential role in the theory of optimal processes. In particular, relations (1) and (5) of work \((^1)\) may be rewritten in the form
\[ \dot{x}^{i}=\frac{\partial H}{\partial \psi_i},\qquad \dot{\psi}_i=-\frac{\partial H}{\partial x^i};\qquad i=1,\ldots,n. \tag{5} \]
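Written out for \(H=\psi_\alpha f^\alpha(x,u)\), relations (5) read as follows (a direct computation from the definition of \(H\), added for the reader's convenience):
\[ \dot{x}^{i}=f^{i}(x,u),\qquad \dot{\psi}_i=-\frac{\partial f^{\alpha}(x,u)}{\partial x^{i}}\,\psi_\alpha;\qquad i=1,\ldots,n. \]
Thus the first group of relations (5) coincides with system (1), while the second group is a linear homogeneous system for \(\psi(t)\) along the trajectory \(x(t)\), \(u(t)\).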
Theorem 2 (L. S. Pontryagin’s maximum principle). For the optimality of the trajectory \(x(t)\) and the control \(u(t)\), \(t_0\leq t\leq t_1\), it is necessary that there exist a variable vector \(\psi(t)\) such that (in addition to the validity of relations (5)) for every \(t\), \(t_0\leq t\leq t_1\), the condition
\[ H(x(t),\psi(t),u(t))=\max_{u\in U} H(x(t),\psi(t),u)\geq 0 \]
is satisfied.
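The pointwise maximization in Theorem 2 can be illustrated numerically. The sketch below is not taken from the note: the helper names `hamiltonian` and `maximizing_control` are hypothetical, the domain \(U\) is replaced by a finite grid of candidate values, and the example system \(\dot{x}^1=x^2,\ \dot{x}^2=u,\ |u|\leq 1\) is chosen only because for it \(H=\psi_1x^2+\psi_2u\) is maximized by \(u=\operatorname{sign}\psi_2\).

```python
def hamiltonian(f, x, psi, u):
    """H(x, psi, u) = psi_alpha * f^alpha(x, u), summed over alpha."""
    return sum(p * fi for p, fi in zip(psi, f(x, u)))


def maximizing_control(f, x, psi, candidates):
    """Return the candidate u that maximizes H(x, psi, u) over a finite sample of U."""
    return max(candidates, key=lambda u: hamiltonian(f, x, psi, u))


def double_integrator(x, u):
    """Assumed example system: x1' = x2, x2' = u."""
    return (x[1], u)


# Finite grid standing in for U = [-1, 1] (an assumption for this illustration).
candidates = [k / 10 - 1 for k in range(21)]

u_plus = maximizing_control(double_integrator, (0.5, -1.0), (-1.0, 0.3), candidates)
u_minus = maximizing_control(double_integrator, (0.5, -1.0), (-1.0, -0.3), candidates)
```

Here `u_plus` is the right endpoint of the grid and `u_minus` the left endpoint, in agreement with \(u=\operatorname{sign}\psi_2\). The theorem itself is only a necessary condition; the code merely evaluates the maximum for given \(x\) and \(\psi\).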
We give a sketch of the proof of this theorem. Choose the vector \(\psi(t)\) as indicated above. Suppose that for some \(\tau\), \(t_0<\tau<t_1\), and some \(v\in U\) we have
\[ H(x(\tau),\psi(\tau),u(\tau))<H(x(\tau),\psi(\tau),v). \]
We vary the control \(u(t)\), taking the single interval \(I_{1,1}\) of length \(\varepsilon\delta t_{1,1}\), beginning at the point \(\tau\), and set \(v_{1,1}=v\). Then we shall have: \(\bar{x}(t)=x(t)\) for \(t_0\leq t\leq \tau\), and
\[ \bar{x}(t)-x(t)=[f(x(\tau),v)-f(x(\tau),u(\tau))](t-\tau)+\cdots \]
for \(\tau\leq t\leq \tau+\varepsilon\delta t_{1,1}\).
In particular,
\[ \bar{x}(\tau+\varepsilon\delta t_{1,1})-x(\tau+\varepsilon\delta t_{1,1}) =[f(x(\tau),v)-f(x(\tau),u(\tau))]\varepsilon\delta t_{1,1}+\cdots, \]
whence it follows easily (by the definition of the function \(H\)) that
\[ \psi_\alpha(\tau+\varepsilon\delta t_{1,1}) [\bar{x}^{\alpha}(\tau+\varepsilon\delta t_{1,1})-x^{\alpha}(\tau+\varepsilon\delta t_{1,1})] =A\varepsilon+\cdots, \tag{6} \]
where \(A>0\). Since on the interval \(\tau+\varepsilon\delta t_{1,1}<t\leq t_1\) the control \(\bar{u}(t)\) coincides with \(u(t)\), on this interval we have: \(\bar{x}^{\alpha}(t)=x^{\alpha}(t)+\varepsilon\delta x^{\alpha}(t)+\cdots\), where
\[ \delta\dot{x}^{i}(t)= \frac{\partial f^{i}(x(t),u(t))}{\partial x^{\alpha}}\, \delta x^{\alpha}(t),\qquad i=1,\ldots,n. \]
Thus, for \(\tau+\varepsilon\delta t_{1,1}\leq t\leq t_1\), we obtain (see the second of relations (5)):
\[ \begin{aligned} \frac{d}{dt}\bigl(\psi_\alpha(t)\delta x^{\alpha}(t)\bigr) &=\dot{\psi}_\alpha(t)\delta x^{\alpha}(t)+\psi_\alpha(t)\delta\dot{x}^{\alpha}(t)\\ &=-\frac{\partial f^{\beta}}{\partial x^{\alpha}}\psi_\beta(t)\delta x^{\alpha}(t) +\psi_\alpha(t)\frac{\partial f^{\alpha}}{\partial x^{\beta}}\delta x^{\beta}(t)=0. \end{aligned} \]
Consequently, on the entire interval \(\tau+\varepsilon\delta t_{1,1}\leq t\leq t_1\) the relation
\[ \psi_\alpha(t)\delta x^{\alpha}(t) =\psi_\alpha(\tau+\varepsilon\delta t_{1,1})\delta x^{\alpha}(\tau+\varepsilon\delta t_{1,1})=A>0 \]
is valid (see (6)), and therefore
\[ \psi_\alpha(t_1)[\bar{x}^{\alpha}(t_1)-x^{\alpha}(t_1)] =\psi_\alpha(t_1)[\varepsilon\delta x^{\alpha}(t_1)+\cdots]=A\varepsilon+\cdots. \]
Thus the vector \(\Delta x=\delta x(t_1)\), obtained from formulas (2), (3) with \(\delta t=0\) (which is permissible), satisfies the condition \(\psi_\alpha(t_1)\Delta x^{\alpha}=A>0\). But this contradicts the necessary condition (4).
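The first-order computation used in this sketch of the proof can be checked on a simple example. The code below is an illustration under stated assumptions, not part of the note: the system \(\dot{x}^1=x^2,\ \dot{x}^2=u\), the reference control \(u(t)\equiv 0\) (which is not claimed to be optimal), the constants `c1`, `c2` fixing a solution of the adjoint system, and all function names are hypothetical choices. It compares \(\psi_\alpha(t_1)[\bar{x}^{\alpha}(t_1)-x^{\alpha}(t_1)]\) with \(A\varepsilon\), where \(A=H(x(\tau),\psi(\tau),v)-H(x(\tau),\psi(\tau),u(\tau))\) and \(\delta t_{1,1}=1\).

```python
def flow(x, u, h):
    """Exact solution of x1' = x2, x2' = u over a time span h with constant u."""
    x1, x2 = x
    return (x1 + x2 * h + 0.5 * u * h * h, x2 + u * h)


def endpoint(x0, segments):
    """Apply consecutive (control value, duration) segments starting from x0."""
    x = x0
    for u, h in segments:
        x = flow(x, u, h)
    return x


def psi(t, c1, c2):
    """Solution of the adjoint system psi1' = 0, psi2' = -psi1."""
    return (c1, c2 - c1 * t)


def needle_check(eps, x0=(1.0, 0.0), tau=0.5, t1=2.0, v=1.0, c1=-1.0, c2=0.0):
    """Return (psi(t1) . [x_bar(t1) - x(t1)], A * eps) for one interval of length eps."""
    # Reference control u(t) = 0 on [0, t1] (an arbitrary choice, not an optimal one).
    x_ref = endpoint(x0, [(0.0, t1)])
    # Varied control: value v on [tau, tau + eps), i.e. delta t_{1,1} = 1.
    x_var = endpoint(x0, [(0.0, tau), (v, eps), (0.0, t1 - tau - eps)])
    p1, p2 = psi(t1, c1, c2)
    lhs = p1 * (x_var[0] - x_ref[0]) + p2 * (x_var[1] - x_ref[1])
    # A = H(x(tau), psi(tau), v) - H(x(tau), psi(tau), u(tau)) = psi2(tau) * (v - 0).
    a = psi(tau, c1, c2)[1] * v
    return lhs, a * eps
```

With the default values \(A=\psi_2(\tau)=0.5>0\), and the two returned numbers differ only by a term of order \(\varepsilon^2\), as the dots in formula (6) indicate. The identity holds for any solution of the adjoint system; the contradiction in the proof arises only because \(\psi(t)\) there is chosen so that condition (4) holds.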
Closely connected with the maximum principle is the following important property of the function \(H\), from which, in particular, it follows that along an optimal trajectory \(H \ge 0\), i.e., the maximum, whose existence was shown above, is nonnegative.
Theorem 3. If \(x(t)\) is an optimal trajectory, \(u(t)\), \(t_0 \le t \le t_1\), is the corresponding optimal control, and \(\psi(t)\) is the variable vector whose existence is asserted in Theorem 2, then the quantity \(H\) retains along \(x(t)\), \(\psi(t)\), \(u(t)\) a constant (nonnegative) value:
\[ H\bigl(x(t),\psi(t),u(t)\bigr)=\mathrm{const}. \]
This relation is first proved for each interval of continuity of the control \(u(t)\). Then it is established that at every point of a “jump” (i.e., at a point of discontinuity of the control \(u(t)\)) the function \(H\) does not change its value.
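The constancy of \(H\), including at a jump of the control, can be observed on a standard textbook example that does not appear in the note and is assumed here: the system \(\dot{x}^1=x^2,\ \dot{x}^2=u,\ |u|\leq 1\), steered from \(x_0=(1,0)\) to \(x_1=(0,0)\) in time \(t_1=2\) by the control \(u=-1\) for \(t<1\) and \(u=+1\) for \(t\geq 1\). The adjoint vector \(\psi(t)=(-1,\ t-1)\) solves \(\dot{\psi}_1=0,\ \dot{\psi}_2=-\psi_1\) and satisfies \(u(t)=\operatorname{sign}\psi_2(t)\). All function names below are hypothetical.

```python
def u_opt(t):
    """Assumed optimal control: one jump at t = 1."""
    return -1.0 if t < 1.0 else 1.0


def x_opt(t):
    """Closed-form trajectory from (1, 0) under u_opt, for 0 <= t <= 2."""
    if t < 1.0:
        return (1.0 - 0.5 * t * t, -t)
    s = t - 1.0
    return (0.5 - s + 0.5 * s * s, -1.0 + s)


def psi_opt(t):
    """Adjoint vector psi(t) = (-1, t - 1)."""
    return (-1.0, t - 1.0)


def H(x, psi, u):
    """H = psi_1 * x^2 + psi_2 * u for the system x1' = x2, x2' = u."""
    return psi[0] * x[1] + psi[1] * u


# Sample H along the trajectory on a grid that includes the jump instant t = 1.
times = [k * 0.01 for k in range(201)]
values = [H(x_opt(t), psi_opt(t), u_opt(t)) for t in times]
```

Every entry of `values` equals 1 up to rounding, on both sides of the jump at \(t=1\), so in this example \(H\) is a positive constant along the whole trajectory.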
The results published here were obtained by me while working in the seminar on the theory of oscillations and automatic control directed by L. S. Pontryagin. L. S. Pontryagin pointed out to me one simplification in the proof of the maximum principle, thanks to which my proof became suitable for an arbitrary topological space \(U\) (the initial version of the proof contained a superfluous construction, in fact unused anywhere, which forced one to restrict oneself to the case when \(U\) is a closed domain of a vector space with a piecewise-smooth boundary and convex interior angles at the corner points).
V. A. Steklov Mathematical Institute, Academy of Sciences of the USSR
Received 18 XII 1957
REFERENCES CITED
- V. G. Boltyanskii, R. V. Gamkrelidze, L. S. Pontryagin, DAN, 110, No. 1, 7 (1956).