MATHEMATICS
V. G. Boltyanskii
Submitted 1961-01-01 | RussiaRxiv: ru-196101.66472 | Translated from Russian

SUFFICIENT CONDITIONS FOR OPTIMALITY

(Presented by Academician L. S. Pontryagin on 19 V 1961)

Theorems 1 and 2 of this note are, in essence, the well-known principle of dynamic programming (see \((^1)\)) applied to the time-optimal problem. As is known, the principle of dynamic programming is fully justified for difference equations but has not so far been rigorously justified for differential equations. The arguments usually given in its support (see, for example, \((^2)\), p. 80 ff.) require that the function \(\omega(x)\) considered below be continuously differentiable, a requirement that fails even in the simplest examples. Theorem 1 and its corollary give a rigorous justification of a somewhat refined principle of dynamic programming for the time-optimal problem. In Theorem 1 the principle appears as a sufficient condition for optimality (and not a necessary one, as is usually assumed), and it is proved in enough generality to cover all known examples and a number of new ones, in particular nonlinear ones.

In addition to the sufficient condition for optimality in the form of the principle of dynamic programming, a sufficient condition in the form of the maximum principle is formulated below (Theorem 3); it is harder to state but considerably more convenient in practical applications.

The sufficient conditions obtained here are important for the following reason. The maximum principle (\((^2)\), § 3) makes it possible in a number of cases to single out uniquely the trajectories that can be optimal. Are these trajectories in fact optimal? For linear systems this question is answered by the theorem on the existence of optimal controls (\((^2)\), § 19): since an optimal trajectory exists and the maximum principle singles out a unique candidate, that candidate is the (unique) optimal trajectory joining the two prescribed points. However, the existence theorem has been proved only for linear systems, and fundamental difficulties arise in attempting to prove it for nonlinear systems. Therefore, for nonlinear systems (even the simplest ones), a synthesis carried out on the basis of the maximum principle gives no certainty that the trajectories found are in fact optimal. A way out of this situation is indicated by Theorem 3 below, which, as a rule, makes it possible to assert that a synthesis based on the maximum principle does indeed lead to optimal trajectories.

We shall consider the motion of a controlled object in an \(n\)-dimensional phase space \(X\) of the variable \(x = (x^1, x^2, \ldots, x^n)\), described by the equations

\[ \dot{x}^i = f^i(x^1, \ldots, x^n, u), \qquad i = 1, \ldots, n. \tag{1} \]

We shall assume that some open set \(V \subset X\) and some topological space \(U\)—the domain of control—are given, and that the functions \(f^i\) and \(\partial f^i/\partial x^j\) are continuous on the direct product \(V \times U\). A piecewise-continuous function \(u(t)\) with values in \(U\), given on the interval \(t_0 \leq t \leq t_1\), will be called an admissible control with respect to the point \(x_0 \in V\) if, after substituting the function \(u = u(t)\) into system (1), the solution of this system with the initial condition \(x(t_0)=x_0\) is defined and lies in the domain \(V\) for \(t_0 \leq t \leq t_1\). If, moreover, this solution satisfies the relation \(x(t_1)=x_1\), where \(x_1 \in V\) is a given point, then we shall say that the control \(u(t)\), \(t_0 \leq t \leq t_1\), admissible with respect to the point \(x_0\), transfers the phase point from the position \(x_0\) to the position \(x_1\). The main problem considered in this note is the following:

In the domain \(V\), two points \(x_0, x_1\) are given; among all controls admissible with respect to \(x_0\) and transferring the phase point from the position \(x_0\) to the position \(x_1\), choose one that accomplishes the transition from \(x_0\) to \(x_1\) in the shortest time.

The control \(u(t)\) that gives a solution of this problem, and the corresponding trajectory \(x(t)\), will be called optimal in the domain \(V\).

We now introduce a concept important for what follows: that of a piecewise-smooth set. Let \(K\) be some bounded \(s\)-dimensional convex polyhedron \((s \leq n)\), situated in the vector space \(\Xi\) of the variable \(\xi=(\xi^1,\xi^2,\ldots,\xi^s)\) and considered together with its boundary (i.e., closed). Suppose that on some open set \(N\) of the space \(\Xi\), containing the polyhedron \(K\), a differentiable mapping \(\varphi:N \to X\) is given, possessing the property that the functional matrix \((\partial x^i/\partial \xi^j)\) has rank \(s\) at each point \(\xi \in K\), and that distinct points of the polyhedron \(K\) are mapped into distinct points of the space \(X\). In this case the image \(L=\varphi(K)\) of the polyhedron \(K\) will be called a curvilinear \(s\)-dimensional polyhedron in the space \(X\). Any set \(M \subset V\) representable as the union of a finite or countable number of curvilinear polyhedra of dimensions \(< n\), arranged in such a way that each closed bounded set lying in \(V\) intersects only a finite number of these polyhedra, will be called a piecewise-smooth set in \(V\). Clearly, a piecewise-smooth set in \(V\) contains no interior points.
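As an illustration (an example assumed here, not taken from the note), let \(n=2\), \(V=X\), and consider the curve \(M=\{x : x^1 = -\tfrac{1}{2}\,x^2\,|x^2|\}\). The curve is unbounded, so it is not a single curvilinear polyhedron, but it is the union of the countably many arcs

\[ L_k^{\pm}=\varphi_{\pm}([k,\,k+1]), \qquad \varphi_{\pm}(\xi)=\bigl(\mp\tfrac{1}{2}\xi^{2},\ \pm\xi\bigr), \qquad k=0,1,2,\ldots \]

Each segment \([k,k+1]\) is a bounded one-dimensional convex polyhedron, each mapping \(\varphi_{\pm}\) is differentiable on the whole line and one-to-one, and \(\partial x^2/\partial \xi=\pm 1 \neq 0\), so the functional matrix has rank \(s=1\). Every closed bounded set meets only a finite number of the arcs \(L_k^{\pm}\); hence \(M\) is a piecewise-smooth set in \(V\).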

Theorem 1. Suppose that some point \(a \in V\) is fixed and that in the domain \(V\) there is given a real continuous function \(\omega(x)\) having the following properties:

a) \(\omega(a)=0,\ \omega(x)<0\) for \(x \neq a\);

b) there exists a piecewise-smooth set \(M\) in \(V\) such that on the set \(V \setminus M\) the function \(\omega(x)\) is continuously differentiable with respect to \(x^1,x^2,\ldots,x^n\) and satisfies the condition

\[ \sup_{u \in U}\sum_{\alpha=1}^{n}\frac{\partial \omega(x)}{\partial x^\alpha} f^\alpha(x,u)=1 \qquad \text{for } x \in V \setminus M. \tag{2} \]

Then, for any point \(x_0 \in V\) and for any control admissible with respect to the point \(x_0\) and transferring (by virtue of system (1)) the phase point from the position \(x_0\) to the position \(a\), the time of transition from \(x_0\) to \(a\) is not less than \(-\omega(x_0)\).
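A minimal worked example may help here; it is assumed for illustration and is not taken from the note. Let \(n=1\), \(V=X\), \(U=[-1,1]\), \(f(x,u)=u\), \(a=0\), and \(\omega(x)=-|x|\). Property a) holds, and with \(M=\{0\}\) the function \(\omega\) is continuously differentiable on \(V \setminus M\), where

\[ \sup_{|u|\leq 1}\frac{d\omega(x)}{dx}\,u=\sup_{|u|\leq 1}\bigl(-\operatorname{sign}x\bigr)\,u=1 . \]

Theorem 1 then asserts that the transition time from \(x_0\) to \(a\) is not less than \(-\omega(x_0)=|x_0|\). In this case the bound also follows directly from \(|\dot{x}|\leq 1\), and the control \(u=-\operatorname{sign}x\) attains it. Note that \(\omega\) is not differentiable at the point \(a\) itself, so even here an argument requiring continuous differentiability of \(\omega\) on all of \(V\) would not apply.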

Theorem 2. Theorem 1 remains valid if the requirement of piecewise smoothness of the set \(M\) is replaced by the following requirements: the set \(M\) is closed in \(V\) and contains no interior points and, moreover, the function \(\omega(x)\) locally (in a neighborhood of each point \(x \in V\)) satisfies the Lipschitz condition.

Corollary. If, under the assumptions of Theorem 1 (or Theorem 2), for each point \(x_0 \in V\) there exists a control, admissible with respect to the point \(x_0\), which transfers the phase point from the position \(x_0\) to the position \(a\) in time \(-\omega(x_0)\), then all these controls are optimal in \(V\).
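Condition (2) and the hypothesis of the Corollary can be checked numerically on a standard example. The sketch below assumes the double integrator \(\dot{x}^1=x^2\), \(\dot{x}^2=u\), \(|u|\leq 1\), \(a=0\), together with the familiar closed form of its minimum time; the system, the formula for \(\omega\), and the names `omega`, `bellman_lhs`, `transfer` are all introduced here for illustration and do not come from the note. The code approximates the left-hand side of (2) by central differences at points off the curve \(x^1=-\tfrac{1}{2}x^2|x^2|\), and verifies that a two-piece constant control reaches \(a\) in time \(-\omega(x_0)\).

```python
import math

def omega(x1, x2):
    """Assumed closed form: minus the minimum time to the origin."""
    if x1 + 0.5 * x2 * abs(x2) < 0:      # below the switching curve: use symmetry
        x1, x2 = -x1, -x2
    s = max(x1 + 0.5 * x2 * x2, 0.0)
    return -(x2 + 2.0 * math.sqrt(s))

def bellman_lhs(x1, x2, h=1e-6):
    """Left-hand side of (2), with the gradient of omega taken by central differences."""
    d1 = (omega(x1 + h, x2) - omega(x1 - h, x2)) / (2 * h)
    d2 = (omega(x1, x2 + h) - omega(x1, x2 - h)) / (2 * h)
    # f = (x2, u) is affine in u, so the supremum over |u| <= 1 is d1*x2 + |d2|
    return d1 * x2 + abs(d2)

def transfer(x1, x2):
    """Apply u = -sigma for time t1, then u = +sigma for time t2.

    Returns the end point and the total time t1 + t2.
    """
    sigma = 1.0 if x1 + 0.5 * x2 * abs(x2) >= 0 else -1.0
    r = math.sqrt(max(sigma * x1 + 0.5 * x2 * x2, 0.0))
    t1, t2 = sigma * x2 + r, r
    for u, t in ((-sigma, t1), (sigma, t2)):
        # exact solution of the system for constant u
        x1, x2 = x1 + x2 * t + 0.5 * u * t * t, x2 + u * t
    return (x1, x2), t1 + t2

for p in [(1.0, 0.0), (0.0, 1.0), (-2.0, 0.5), (3.0, -1.0)]:
    end, T = transfer(*p)
    print(p, "lhs of (2):", round(bellman_lhs(*p), 6), "time:", T, "-omega:", -omega(*p))
```

In this example \(\omega\) fails to be locally Lipschitz near the curve (it grows like a square root of the distance from one side), so the example falls under Theorem 1, with \(M\) the curve itself, rather than under Theorem 2.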

Let us now suppose that piecewise-smooth sets \(P^0, P^1, P^2, \ldots, P^{n-1}\) are given:

\[ P^0 \subset P^1 \subset P^2 \subset \ldots \subset P^{n-1} \subset P^n = V \tag{3} \]

and a function \(v(x)\), defined in \(V\) and taking values in \(U\). We shall say that the sets (3) and the function \(v(x)\) realize a regular synthesis for system (1) in the domain \(V\), if the following conditions are satisfied:

A. The set \(P^0\) consists of points isolated in \(V\), which we shall call zero-dimensional cells. Each component of the set \(P^i \setminus P^{i-1}\) \((i = 1, 2, \ldots, n)\) is an \(i\)-dimensional smooth manifold in \(V\); these components we shall call \(i\)-dimensional cells. Every closed bounded set contained in \(V\) intersects only a finite number of cells. The function \(v(x)\), considered on each individual cell \(\tau\), is a continuous and continuously differentiable function of the point \(x \in \tau\). Moreover, this function can be extended to a continuously differentiable function defined in some neighborhood of the cell \(\tau\).

B. All cells are divided into cells of the first, second, and third kind. All \(n\)-dimensional cells are cells of the first kind. The only zero-dimensional cell of the first kind is the point \(a\).

C. If \(\tau\) is some \(i\)-dimensional cell of the first kind \((i > 0)\), then through every point of this cell there passes a unique trajectory of the system

\[ \dot{x}^i = f^i(x^1, \ldots, x^n, v(x)), \qquad i = 1, \ldots, n. \tag{4} \]

There exists an \((i-1)\)-dimensional cell \(\Pi(\tau)\) of the first or second kind such that every trajectory of system (4) moving along the cell \(\tau\) leaves \(\tau\) after a finite time, entering the cell \(\Pi(\tau)\) at a nonzero angle and approaching it with nonzero phase velocity.

D. If \(\tau\) is some \(i\)-dimensional cell of the second kind, then there exists an adjacent \((i+1)\)-dimensional cell \(\Sigma(\tau)\), which is a cell of the first kind, such that from any point of the cell \(\tau\) there issues a unique trajectory of system (4) going along the cell \(\Sigma(\tau)\).

E. The conditions listed above ensure the possibility of continuing the trajectories of system (4) from cell to cell through cells of the first and second kind: from the cell \(\tau\) to the cell \(\Pi(\tau)\), if the cell \(\Pi(\tau)\) is of the first kind, and from the cell \(\tau\) to the cell \(\Sigma(\Pi(\tau))\), if the cell \(\Pi(\tau)\) is of the second kind. It is required that each such trajectory pass through only a finite number of cells, i.e., that the “piercing” of cells of the second kind occur, for each trajectory, a finite number of times. In this case every trajectory ends at the point \(a\). It is also required that from each point belonging to a cell of the third kind there issue a unique trajectory of system (4) leading to the point \(a\). Thus, from each point \(x_0 \in V\) there leads a unique trajectory (of system (4)) to the point \(a\).

F. All trajectories indicated in E satisfy the maximum principle \((^2)\).

G. The time of motion along the trajectories indicated in E from any point \(x_0 \in V\) to the point \(a\) is a continuous function of the point \(x_0\).

Theorem 3. Suppose that the control domain \(U\) is a subset of the \(r\)-dimensional vector space of the variable \(u = (u^1, u^2, \ldots, u^r)\), and that the right-hand sides of system (1) are continuously differentiable with respect to \(u^1, u^2, \ldots, u^r\). If a regular synthesis for system (1) is realized in the domain \(V\), then all trajectories mentioned in E are optimal in the domain \(V\).
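The cell structure in conditions A to E can be made concrete on the classical double integrator \(\dot{x}^1=x^2\), \(\dot{x}^2=u\), \(|u|\leq 1\), \(a=0\). This example is an assumption made here for illustration and is not part of the note. In the sketch, \(P^0=\{a\}\), \(P^1\) is the curve \(x^1=-\tfrac{1}{2}x^2|x^2|\), and \(P^2=V\) is the plane; all cells are taken to be of the first kind, and the hypothetical helpers `cell`, `v`, `exit_time`, `flow`, `trajectory` follow a trajectory of system (4) from cell to cell. Conditions F and G are not checked by the code.

```python
import math

TOL = 1e-9  # tolerance for "on the switching curve" and "at the target"

def g(x):
    """Zero exactly on the curve x1 = -x2*|x2|/2."""
    return x[0] + 0.5 * x[1] * abs(x[1])

def cell(x):
    """Name of the cell containing x; the sign in the name is the value of v there."""
    if abs(x[0]) < TOL and abs(x[1]) < TOL:
        return "a"                                   # the zero-dimensional cell
    if abs(g(x)) < TOL:                              # one-dimensional cells
        return "curve+" if x[1] < 0 else "curve-"
    return "plane-" if g(x) > 0 else "plane+"        # two-dimensional cells

def v(x):
    """Synthesizing function v(x): constant on each cell."""
    return {"a": 0.0, "curve+": 1.0, "plane+": 1.0,
            "curve-": -1.0, "plane-": -1.0}[cell(x)]

def exit_time(x):
    """Time after which the trajectory of (4) through x leaves its cell."""
    u = v(x)
    if cell(x).startswith("curve"):
        return abs(x[1])                             # the branch leads into a
    return -u * x[1] + math.sqrt(-u * x[0] + 0.5 * x[1] ** 2)

def flow(x, u, t):
    """Exact solution of x1' = x2, x2' = u for constant u."""
    return (x[0] + x[1] * t + 0.5 * u * t * t, x[1] + u * t)

def trajectory(x):
    """Cells visited and total time along the trajectory of (4) from x to a."""
    visited, total = [cell(x)], 0.0
    while visited[-1] != "a" and len(visited) < 4:   # guard against rounding loops
        t = exit_time(x)
        x = flow(x, v(x), t)
        total += t
        visited.append(cell(x))
    return visited, total

print(trajectory((1.0, 0.0)))   # (['plane-', 'curve+', 'a'], 2.0)
```

Here \(\Pi(\tau)\) for a two-dimensional cell is the branch of the curve that its trajectories reach, and \(\Pi(\tau)=\{a\}\) for each branch, so every trajectory passes through at most three cells, in line with condition E.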

Theorems 1–3 extend readily to the problem of minimizing an integral functional (we have restricted ourselves to time-optimality only for simplicity). The proof of Theorem 1 is carried out by methods of the theory of smooth manifolds (see, for example, \((^3)\)). The proof of Theorem 3 is based on Theorem 1 and makes it possible to establish the connection between the maximum principle and the principle of dynamic programming.

V. A. Steklov Mathematical Institute
Academy of Sciences of the USSR

Received 17 V 1961

References

  1. R. Bellman, Dynamic Programming, IL, 1960.
  2. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mishchenko, The Mathematical Theory of Optimal Processes, Moscow, 1961.
  3. L. S. Pontryagin, Trudy Instituta im. V. A. Steklova AN SSSR, 45 (1955).
