ON A PROBLEM OF OPTIMAL DISCRETE CONTROL
Submitted 1964-01-01 | RussiaRxiv: ru-196401.53754 | Translated from Russian

CYBERNETICS AND CONTROL THEORY

A. I. PROPOI

(Presented by Academician B. N. Petrov, 14 V 1964)

For the discrete problem of optimizing the terminal state, necessary and sufficient conditions of optimality and a scheme for an algorithm for computing the optimal control are given. These conditions may be regarded as an extension of the maximum principle of L. S. Pontryagin \((^1)\) to problems of this kind. The idea of the proof is based on reducing the original multistage problem to a problem of mathematical programming and considering the problem dual to it. The dual problem, in turn, decomposes into one-stage problems connected with one another in a simple way. The present paper follows the work of L. V. Kantorovich \((^2)\), in which the effectiveness of the dual approach to the general problem of mathematical programming was first indicated.

Consider the multistage process

\[ x(k+1)=Ax(k)+\varphi(u(k))\quad (k=0,1,\ldots,N-1). \tag{1} \]

Here \(x(k)=\{x_1(k),\ldots,x_n(k)\}\) determines the state of the process at the \(k\)-th step; \(x(k)\) is an element of the state space \(X\); \(A\) is a nonsingular matrix of size \((n\times n)\); \(u(k)=\{u_1(k),\ldots,u_r(k)\}\) determines the control action at the \(k\)-th step; \(u(k)\) may take values from some fixed set \(U\)

\[ u(k)\in U\quad (k=0,1,\ldots,N-1); \tag{2} \]

\(\varphi(u)=\{\varphi_1(u),\ldots,\varphi_n(u)\}\) is a function given on the set \(U\). The control time is fixed and consists of \(N\) steps. The quality of the control is evaluated by a functional of the terminal state

\[ J=F(x(N)). \tag{3} \]

Problem 1. For a given initial state \(x(0)\), find an admissible (i.e., satisfying (2)) control \(\{u(0),\ldots,u(N-1)\}\) that transfers \(x(0)\) to a point of the space \(X\) at which the functional (3) attains an extremal (in what follows, for definiteness, minimal) value.

It is assumed here that the following conditions are satisfied:

  1. The set \(U\) is compact, and the function \(\varphi(u)\) is continuous on it. Consequently, the set \(\varphi(U)\) is also compact, where \(\varphi(U)\) is the image of the mapping \(x=\varphi(u)\) of \(U\) into \(X\).

  2. The set \(\varphi(U)\) is convex.

  3. The functional (3) is continuous and convex in \(X\) (or concave, in maximization problems).

Problems of this kind arise, for example, when the controlled object must be transferred to a prescribed state \(x^*\) in a fixed time. A suitably chosen norm of the error \(e(N)=x(N)-x^*\) then determines the quality of the control: the smaller the norm \(\|e(N)\|\), the better the control. Of practical interest are, for example, the following norms:

\[ \sum_{i=1}^{n} e_i^2(N);\qquad \sum_{i=1}^{n} |e_i(N)|;\qquad \max_i |e_i(N)|. \tag{4} \]

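Let us note here that each of the functions (4) is convex: a norm is convex by the triangle inequality and homogeneity, and the first expression, being the square of the Euclidean norm, is convex as well.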
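As a small illustration of (4), the three error measures can be computed directly from the terminal state. This is only a sketch; the function name `error_measures` and the sample vectors are assumptions made for the example and do not come from the paper.

```python
def error_measures(x_final, x_target):
    """Return the three terminal-error measures listed in (4).

    x_final plays the role of x(N), x_target the role of x*.
    """
    e = [xf - xt for xf, xt in zip(x_final, x_target)]
    sum_of_squares = sum(ei * ei for ei in e)      # sum_i e_i^2(N)
    sum_of_abs = sum(abs(ei) for ei in e)          # sum_i |e_i(N)|
    max_abs = max(abs(ei) for ei in e)             # max_i |e_i(N)|
    return sum_of_squares, sum_of_abs, max_abs


# Illustrative data: the error is e(N) = (3, -4).
measures = error_measures([4.0, -1.0], [1.0, 3.0])
```

For the sample error vector \((3,-4)\) the three measures are 25, 7 and 4 respectively.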

Moreover, if a method for solving such a problem is known, then, by solving the problem successively for increasing \(N\), one can find the smallest integer \(N^0\) for which the minimal error norm vanishes, \(\|e^0(N^0)\|=0\), and thereby obtain a solution of the time-optimal problem.

Introduce in the state space \(X\) the sets

\[ R_1(x):\{x'\mid x'=Ax+\varphi(u);\ u\in U\} \]

and further, by induction,

\[ R_k(x):\{x'\mid x'=Ax''+\varphi(u);\ x''\in R_{k-1}(x);\ u\in U\}. \]

The set \(R_k(x)\) is the set of all states in \(X\) that can be reached from the initial point \(x\) in \(k\) steps by means of an admissible control. Since the solution of system (1) has the form

\[ x(k)=A^k x(0)+\sum_{j=0}^{k-1} A^{k-1-j}\varphi(u(j))\qquad (k=1,2,\ldots), \tag{5} \]

the sets \(R_k(x)\) can also be constructed directly from (5). It is not difficult to see that the sets \(R_k(x)\) \((k=1,2,\ldots)\) are compact and convex for any \(x\).
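The agreement between the recursion (1) and the explicit formula (5) can be checked numerically. The following is a minimal sketch in plain Python; the matrix `A`, the function `phi`, the initial state and the control sequence are illustrative assumptions, not data from the paper.

```python
def mat_vec(M, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*Q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in P]


def mat_pow(M, k):
    """k-th power of a square matrix (k >= 0) by repeated multiplication."""
    n = len(M)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = mat_mul(R, M)
    return R


def step_by_step(A, phi, x0, controls):
    """Iterate x(k+1) = A x(k) + phi(u(k)), i.e. system (1)."""
    x = list(x0)
    for u in controls:
        x = [a + b for a, b in zip(mat_vec(A, x), phi(u))]
    return x


def closed_form(A, phi, x0, controls):
    """Evaluate formula (5) for k = len(controls)."""
    k = len(controls)
    x = mat_vec(mat_pow(A, k), x0)
    for j, u in enumerate(controls):
        term = mat_vec(mat_pow(A, k - 1 - j), phi(u))
        x = [a + b for a, b in zip(x, term)]
    return x


# Illustrative data (assumed for this example only).
A = [[1.0, 0.5], [0.0, 1.0]]


def phi(u):
    return [0.0, u]


x0 = [1.0, 0.0]
controls = [1.0, -1.0, 0.5]

x_rec = step_by_step(A, phi, x0, controls)
x_cf = closed_form(A, phi, x0, controls)
```

Both routes give the same terminal state, here \(x(3)=(1.5,\ 0.5)\).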

Now, instead of the original Problem 1, consider the following:

Problem 2. Among all elements \(x\in R_N(x(0))\), find one for which \(F(x)\) has the smallest value.

Since the set \(R_N(x(0))\) is compact and \(F(x)\) is a continuous functional on it, \(F(x)\) attains its lower bound on \(R_N(x(0))\). On the other hand, Problems 1 and 2 are obviously equivalent. Therefore the following is valid.

Theorem 1. If the set \(\varphi(U)\) is compact and \(F(x)\) is a continuous functional in \(X\), then an optimal control exists for any initial state \(x(0)\).

Let us note that the optimal control \(\{u^0(k)\}\) and the optimal final state \(x^0\) need not be unique, whereas the optimal value of the quality \(F(x^0)\) is always unique (for a prescribed initial state).

Define the adjoint (dual) system

\[ p(k)=A'p(k+1)\qquad (k=N-1,\ N-2,\ldots,2,1), \tag{6} \]

where \(p(k)=\{p_1(k),\ldots,p_n(k)\}\), and \(A'\) is the transposed matrix. “Time” in system (6) is directed backward, and the variables \(p(k)\) are completely determined by assigning the value \(p(N)\) at the final time. Introduce also the functions

\[ K(p(k+1),u(k))=\sum_{i=1}^{n} p_i(k+1)\varphi_i(u(k))\qquad (k=0,1,\ldots,N-1). \tag{7} \]

In what follows, for simplicity, we shall assume that \(F(x)\) is a differentiable function. In addition, for physical reasons one may assume that \(F(x)\) attains its minimum on the boundary of the set \(R_N(x(0))\) (this does not mean that all controls \(\{u(k)\}\) must take boundary values).

Theorem 2. In order that the control \(\{u^0(k)\}\) and the corresponding trajectory \(\{x^0(k)\}\) be optimal, it is necessary and sufficient that the functions (7) attain their maximum with respect to \(u(k)\in U\) on the optimal control,

\[ K\bigl(p^0(k+1),u^0(k)\bigr)=\max_{u(k)\in U} K\bigl(p^0(k+1),u(k)\bigr)\qquad (k=0,1,\ldots,N-1), \tag{8} \]

where the optimal values \(\{p^0(k)\}\) are determined from the system (6) with the boundary condition

\[ p^0(N)=-\operatorname{grad} F\bigl(x^0(N)\bigr). \tag{9} \]

For the proof, let us first consider an auxiliary problem, differing from Problem 1 in that the functional (3) is now linear and has the form \(J=(c,x(N))\), where \(c=\{c_1,\ldots,c_n\}\), and the parentheses denote the scalar product. Moreover, convexity of the set \(\varphi(U)\) is not required. A continuous analogue of the problem was considered by L. I. Rozonoer \((^3)\).

Lemma. In order that the control \(\{u^0(k)\}\) be an optimal solution of the auxiliary problem, it is necessary and sufficient that the conditions of Theorem 2 be satisfied with the boundary condition \(p^0(N)=-c\).

Thus, the variables \(\{p^0(k)\}\) in this problem do not depend on \(\{x(k)\}\) and can be determined in advance; the problem splits into \(N\) independent subproblems (8). The proof of the lemma is simple and is based on the obvious relation (the variables \(u(k)\) are independent!)

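\[ \min_{\{u(k)\in U\}}\sum_{k=0}^{N-1} c' A^{N-1-k}\varphi(u(k)) = \sum_{k=0}^{N-1}\min_{u(k)\in U} c' A^{N-1-k}\varphi(u(k)). \]

By (5), adding \(c'A^N x(0)\) to both sides of this equality turns the left-hand side into the minimum of \((c,x(N))\). Putting \(p'(k+1)=-c'A^{N-1-k}\) \((k=0,1,\ldots,N-1)\), so that \(p(N)=-c\) and \(p(k)=A'p(k+1)\), each minimum on the right-hand side becomes the maximum of \(K(p(k+1),u(k))\) taken with the opposite sign, and we obtain the assertion of the lemma. Now let us prove the theorem.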
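The decomposition asserted by the lemma can be checked on a small example. The sketch below assumes a finite control set `U` (finite sets are compact, and the lemma does not require convexity of \(\varphi(U)\)), a hypothetical function `phi`, and illustrative values of `A`, `c`, `x0` and `N`. It compares a brute-force minimization of \((c,x(N))\) over all control sequences with the stage-by-stage maximization of \(K(p(k+1),u(k))\), where \(p(N)=-c\) and \(p(k)=A'p(k+1)\).

```python
from itertools import product


def mat_vec(M, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def phi(u):
    """Hypothetical control map, chosen only for illustration."""
    return [u, u * u]


# Illustrative data (assumed for this example only).
U = [-1.0, 0.0, 1.0]
A = [[1.0, 1.0], [0.0, 1.0]]
A_t = [list(col) for col in zip(*A)]   # transposed matrix A'
c = [1.0, -2.0]
x0 = [0.0, 0.0]
N = 3


def cost(controls):
    """Linear terminal functional J = (c, x(N)) along system (1)."""
    x = list(x0)
    for u in controls:
        x = [a + b for a, b in zip(mat_vec(A, x), phi(u))]
    return dot(c, x)


# Brute force over all |U|^N control sequences.
best_brute = min(cost(seq) for seq in product(U, repeat=N))

# Stagewise solution: N independent one-stage problems (8).
p = [-ci for ci in c]                  # p(N) = -c
stagewise = [None] * N
for k in range(N - 1, -1, -1):
    # At this point p holds p(k+1); maximize K(p(k+1), u) over U.
    stagewise[k] = max(U, key=lambda u: dot(p, phi(u)))
    p = mat_vec(A_t, p)                # p(k) = A' p(k+1)
best_stagewise = cost(stagewise)
```

In this example both approaches give the value \(-6\), although the brute-force search examines 27 sequences while the stagewise search solves only three one-stage problems.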

Necessity. Let \(x^0(N)\) be an optimal final state. Since, by assumption, \(x^0(N)\) is a boundary point of the convex set \(R_N(x(0))\), there exists at this point a supporting hyperplane of the set \(R_N(x(0))\)

\[ (p^0(N),x(N))=(p^0(N),x^0(N)). \]

Since this hyperplane is tangent to the level surface \(F(x)=F(x^0(N))\), it follows that \(p^0(N)=\alpha \operatorname{grad} F(x^0(N))\), \(\alpha=\mathrm{const}\). Choosing \(\alpha=-1\) (since the system (6) is linear, the variables \(p(k)\) are determined up to a constant multiplier), by the definition of a supporting hyperplane we have \((p^0(N),x(N))\le (p^0(N),x^0(N))\) for all points of the set \(R_N(x(0))\). Consequently, the linear functional \((p^0(N),x(N))\) attains a maximum on the set \(R_N(x(0))\) at the point \(x^0(N)\). Now one may apply the lemma, and necessity is proved.

Sufficiency. Solve the auxiliary problem for an arbitrary \(p^0(N)\). As a result we obtain a boundary point \(x^0(N)\) of the set \(R_N(x(0))\). If, in this case, it turns out that (9) is satisfied, then the hyperplane \((p^0(N),x(N))=(p^0(N),x^0(N))\) is tangent to the surface \(F(x)=F(x^0(N))\) and therefore separates the convex sets \(R_N(x(0))\) and \(S^0\) (where the set \(S^0=\{x\mid F(x)<F(x^0(N))\}\) is convex because \(F\) is convex). Thus, for all points \(x\in R_N(x(0))\) we have \(F(x)\ge F(x^0(N))\). The theorem is proved.

The proof of the theorem is constructive and makes it possible to build an effective algorithm for computing the optimal control.

For this purpose we “reverse” time in the adjoint system by multiplying both sides of the equality (6) on the left by the matrix \(A^*=(A^{-1})'\) (the inverse matrix \(A^{-1}\) exists because \(A\), by assumption, is nonsingular):

\[ p(k+1)=A^*p(k)\qquad (k=1,2,\ldots,N-1). \tag{10} \]

Choose an arbitrary \(p(1)\) and, for \(p=p(1)\), solve the problem:

Problem 3. Find \(u\in U\) for which the function \(K(p,u)=\sum_{i=1}^{n}p_i\varphi_i(u)\) is maximal.

The solution of this problem for \(p=p(1)\) gives the zeroth approximation for \(u(0)\). Determining \(p(2)\) from (10) and solving Problem 3 for \(p=p(2)\), we find the zeroth approximation for \(u(1)\), etc., until the control \(\{u(0),\ldots,u(N-1)\}\), and hence the final state \(x(N)\), has been found. If in this case \(p(N)=\alpha\,\operatorname{grad} F(x(N))\), \(\alpha<0\), then the solution is completed; otherwise we set

\[ p^1(N)=p(N)-\varepsilon\,\operatorname{grad} F(x(N)),\quad \varepsilon>0. \tag{11} \]

(The vectors \(p(k)\) may be normalized.) Determining \(p^1(1)\) from the formula

\[ p^1(1)=p(1)-\varepsilon(A^{N-1})'\operatorname{grad}F(x(N)), \]

we repeat the process until equality (9) is achieved with the required accuracy.
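The iteration described above can be sketched in a few lines of Python. The example rests on several assumptions that are not part of the paper: \(U\) is the unit disc with \(\varphi(u)=u\), so that Problem 3 has the closed-form solution \(u=p/\|p\|\); the criterion is \(F(x)=\sum_i (x_i-x_i^*)^2\); the matrix `A` is a rotation by a right angle; and the names `solve_problem3` and `dual_iteration` are hypothetical. For convenience the adjoint variables are generated backward from \(p(N)\) by (6) rather than forward from \(p(1)\) by (10), which is equivalent, and \(p(N)\) is normalized after each correction (11).

```python
import math


def mat_vec(M, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    length = norm(v)
    return [c / length for c in v]


def solve_problem3(p):
    """Problem 3 for the assumed U = unit disc, phi(u) = u: argmax (p, u)."""
    return normalize(p)


def grad_F(x, x_star):
    """Gradient of the assumed criterion F(x) = sum_i (x_i - x*_i)^2."""
    return [2.0 * (a - b) for a, b in zip(x, x_star)]


def dual_iteration(A, x0, x_star, N, p_N, eps=0.05, tol=1e-12, max_iter=500):
    """Repeat: solve N Problems 3, test condition (9), correct p(N) by (11)."""
    A_t = [list(col) for col in zip(*A)]
    controls, x = [None] * N, list(x0)
    for _ in range(max_iter):
        p = list(p_N)
        for k in range(N - 1, -1, -1):
            controls[k] = solve_problem3(p)   # here p holds p(k+1)
            p = mat_vec(A_t, p)               # p(k) = A' p(k+1), system (6)
        x = list(x0)
        for u in controls:                    # forward pass through (1)
            x = [a + b for a, b in zip(mat_vec(A, x), u)]
        g = grad_F(x, x_star)
        if norm(g) == 0.0:                    # target reached exactly
            break
        # Condition (9) up to a positive factor: p(N) parallel to -grad F.
        cosine = -sum(a * b for a, b in zip(p_N, g)) / (norm(p_N) * norm(g))
        if cosine >= 1.0 - tol:
            break
        p_N = normalize([a - eps * b for a, b in zip(p_N, g)])   # step (11)
    return controls, x


# Illustrative data (assumed for this example only).
A = [[0.0, 1.0], [-1.0, 0.0]]
controls, x_final = dual_iteration(A, [0.0, 0.0], [10.0, 0.0], 4, [0.0, 1.0])
```

In this particular example the reachable set \(R_4(0)\) is a disc of radius 4 centred at the origin, so the iteration should settle at the point \((4,0)\) closest to the target \((10,0)\). The step size `eps` and the tolerance `tol` are tuning choices of the sketch; the paper does not prescribe them.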

In the general case one can guarantee only convergence of the process. However, if the quality criterion (3) is a quadratic or piecewise-linear function of the form (4), and the set \(\varphi(U)\) is a convex polyhedron, then, on the basis of methods of linear and quadratic programming, one can obtain algorithms that give the solution in a finite number of steps.

Thus, the proposed algorithm consists in the successive solution of \(N\) simple and uniform Problems 3 with \(r\) unknowns \(\{u_1,\ldots,u_r\}\), linked to one another by the relations (10), and in checking condition (9). If the dimension \(r\) of the vector \(u\) is small (\(r=1,2\)), then the solution of Problem 3 is trivial, and therefore the individual iterations of the algorithm in this case will be simple. If \(r\) is large, then methods of mathematical programming may be used to solve Problem 3.

Institute of Automation and Telemechanics

Received 12 V 1964

REFERENCES

  1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, Mathematical Theory of Optimal Processes, Moscow, 1961.
  2. L. V. Kantorovich, DAN, 28, No. 3, 212 (1940).
  3. L. I. Rozonoer, DAN, 127, No. 3, 520 (1959).
