MATHEMATICS
R. GABASOV, F. M. KIRILLOVA
A STATISTICAL PROBLEM OF OPTIMAL CONTROL OF A LINEAR SYSTEM
(Presented by Academician L. S. Pontryagin on 5 February 1965)
- Suppose that, in the \(n\)-dimensional strictly normed phase space \(V\), the quantity \(\|x(T)\|\) is to be minimized along trajectories of the equation
\[ \dot{x}=A(t)x+\alpha(t)b(t)u+r(t),\qquad x(t_0)=x_0,\qquad \|u\|\leq 1. \tag{1} \]
Under sufficiently general assumptions the solution \(x(t)\big|_{t=T}\) can be represented in the form
\[ x(c,T)=Su-c, \tag{2} \]
and the quantity \(\Delta(c,S)=\min_{\|u\|\leq 1}\|x(c,T)\|\) and the optimal control \(u^0=u^0(t)\) are determined from the conditions \((^2)\):
\[ \Delta(c,S)=\max_{\|g\|=1,\; g\in V^*}\{(g,c)-\|S^*g\|\},\qquad (S^*g^0,u^0)=\max_{\|u\|\leq 1}(S^*g^0,u). \tag{3} \]
We shall call the maximization (3) the problem \(\Delta(c,S)\). Denote the extremal element for the problem \(\Delta(c,S)\) and the corresponding optimal control by \(g^0=g^0(c,S)\) and \(u^0(c,S)\), respectively. Suppose that the function \(r(t)\) describes some random process and \(x_0\) is a random vector, so that the vector \(c\) in (2) is random.
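As an illustration of the duality in (3), the following minimal sketch compares the two sides numerically. It assumes a toy setting that is not taken from the paper: \(V=\mathbb{R}^2\) with the Euclidean norm, a diagonal matrix `S`, and a point `c` outside the attainable ellipse, so that the minimum over \(\|u\|\le 1\) is attained on \(\|u\|=1\). The names `primal`, `dual`, `unit`, and the grid size `N` are hypothetical.

```python
import math

# Assumed toy data (not from the paper): V = R^2 with the Euclidean norm,
# S a diagonal matrix, c a point outside the attainable ellipse S(unit ball).
S = ((2.0, 0.0), (0.0, 1.0))
c = (3.0, 2.0)
N = 20000  # number of grid angles on the unit circle


def unit(k):
    """k-th grid point on the unit circle."""
    t = 2.0 * math.pi * k / N
    return (math.cos(t), math.sin(t))


def primal():
    """min ||S u - c|| over ||u|| = 1 (the boundary suffices since c is outside)."""
    best = float("inf")
    for k in range(N):
        u = unit(k)
        x0 = S[0][0] * u[0] + S[0][1] * u[1] - c[0]
        x1 = S[1][0] * u[0] + S[1][1] * u[1] - c[1]
        best = min(best, math.hypot(x0, x1))
    return best


def dual():
    """max {(g, c) - ||S* g||} over ||g|| = 1, as in (3)."""
    best = -float("inf")
    for k in range(N):
        g = unit(k)
        s0 = S[0][0] * g[0] + S[1][0] * g[1]  # components of S* g (transpose of S)
        s1 = S[0][1] * g[0] + S[1][1] * g[1]
        best = max(best, g[0] * c[0] + g[1] * c[1] - math.hypot(s0, s1))
    return best


delta_min = primal()
delta_max = dual()
print(delta_min, delta_max)  # the two values should agree up to grid error
```

The grid search is used only because it is transparent; it is not a method proposed in the paper.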
Problem. Let \(c\) be a random vector taking the values \(c_i\) with probabilities \(p_i\), \(\|c_i\|<\infty\), \(p_i>0\),
\[ \sum_{i=1}^{m}p_i=1,\qquad m<\infty. \]
It is required to find a control \(u^0\) for which
\[ \delta=\sum_{i=1}^{m}p_i\|x(c_i,T)\|=\min_{\|u\|\leq 1}. \]
- Denote \(\sum_{i=1}^{m}p_i y_i,\; y_i\in Y,\) by \(My\). Following (3), we reduce the stated problem to maximization with respect to \(mn\) variables:
\[ \delta= \max_{\max_i \|f_i\|=1,\; f_i\in V^*} \{M(f,c)-\|S^*Mf\|\} = M(f^0,c)-\|S^*Mf^0\|. \tag{4} \]
We simplify problem (4) for the case when \(f_i^0=\hat{g}\), \(i=1,2,\ldots,m\). In order that this case occur, it is necessary and sufficient that
\[ g^0(c_i,S)=\hat{g},\qquad i=1,2,\ldots,m. \tag{5} \]
If conditions (5) are satisfied, then
\[ \delta=\max_{\|g\|=1}\{(g,Mc)-\|S^*g\|\},\qquad u^0=u^0(Mc,S). \]
Thus, in the special case (5), the statistical problem is equivalent to the deterministic problem \(\Delta(Mc,S)\).
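The special case (5) can be illustrated on a small example. The sketch below assumes, purely for illustration, that \(S\) is a multiple of the identity in the Euclidean plane and that all realizations \(c_i\) lie on one ray, so that every problem \(\Delta(c_i,S)\) has the same extremal element. The data and the names `expected_miss`, `delta`, `delta_mean` are hypothetical.

```python
import math

# Assumed toy data: S = s * identity in R^2 (attainable set is a disk of radius s),
# and all realizations c_i lie on one ray, so every problem Delta(c_i, S)
# has the same extremal element g = (1, 0), i.e. condition (5) holds.
s = 1.0
cs = [(2.0, 0.0), (3.0, 0.0), (5.0, 0.0)]
ps = [0.5, 0.3, 0.2]


def expected_miss(y):
    """sum_i p_i ||y - c_i|| for an attainable point y = S u."""
    return sum(p * math.hypot(y[0] - c[0], y[1] - c[1]) for c, p in zip(cs, ps))


# Brute-force delta over a polar grid of the disk ||y|| <= s.
delta = min(
    expected_miss((s * j / 50 * math.cos(2 * math.pi * k / 360),
                   s * j / 50 * math.sin(2 * math.pi * k / 360)))
    for j in range(51)
    for k in range(360)
)

# Deterministic problem Delta(Mc, S) = ||Mc|| - s for Mc outside the disk.
mc = (sum(p * c[0] for c, p in zip(cs, ps)), sum(p * c[1] for c, p in zip(cs, ps)))
delta_mean = math.hypot(mc[0], mc[1]) - s

print(delta, delta_mean)  # both values should coincide in this special case
```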
Suppose that conditions (5) do not hold. It is clear that \(\|M f^0\|<1\), and therefore there exists a vector \(\lambda\) such that
\[ \max_{\max_i \|f_i\|=1}\{M(f,c)-\|S^*Mf\|\} = M_c\|c-\lambda\|+(g,\lambda)-\|S^*g\|, \tag{6} \]
\[ Mf^0=g,\qquad (f_i^0,c_i-\lambda)=\|c_i-\lambda\|. \]
The vector \(\lambda\) satisfies relations (6) if and only if
\[ \min_x\{M_c\|c-x\|+(g,x)\} = M_c\|c-\lambda\|+(g,\lambda). \]
Hence it follows that
\[ \delta=\max_{\|g\|<1}\min_x\{M_c\|c-x\|+(g,x)-\|S^*g\|\} = \min_x\max_{\|g\|<1}\{M_c\|c-x\|+(g,x)-\|S^*g\|\}. \tag{7} \]
Thus, in the general case the problem has been reduced to a game, which always has a saddle point. We shall denote this point by the symbol \((f^0,x^0)\). Note that knowledge of one element of the point \((f^0,x^0)\) in problem (7) makes it possible to determine the other, since the relations
\[ (f^0,x^0)-\|S^*f^0\| = \max_{\|g\|\le \|f^0\|}\{(g,x^0)-\|S^*g\|\}, \]
\[ M_c\|c-x^0\|+(f^0,x^0) = \min_x\{M_c\|c-x\|+(f^0,x)\} \]
are valid.
Conversely, vectors \(f^0,x^0\) satisfying these relations are elements of a saddle point in the game (7).
In view of what has been said, the statistical problem is equivalent to the deterministic one \(\Delta(x^0,S)\). To find the point \((f^0,x^0)\), one may use the Arrow–Hurwicz method \((^3)\).
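The equality of the statistical problem and the game (7) can be checked by brute force in the simplest setting. The sketch below assumes a one-dimensional space \(V=\mathbb{R}\) with \(Su=su\), so that \(\|S^*g\|=s|g|\); it uses the closed constraint \(|g|\le 1\) and plain grid search rather than the Arrow–Hurwicz method. All data and names are hypothetical.

```python
# Assumed toy data: V = R, S u = s * u with |u| <= 1, so ||S* g|| = s * |g|.
s = 1.5
cs = [1.0, 2.0, 4.0]
ps = [0.2, 0.3, 0.5]


def mean_abs(x):
    """M_c |c - x| for the discrete distribution above."""
    return sum(p * abs(c - x) for c, p in zip(cs, ps))


# Direct statistical problem: minimize sum_i p_i |S u - c_i| over |u| <= 1.
delta_direct = min(mean_abs(s * k / 1000) for k in range(-1000, 1001))

# Game (7): min over x of max over |g| <= 1 of M_c|c - x| + g x - s|g|.
gs = [k / 100 for k in range(-100, 101)]
delta_game = min(
    mean_abs(x) + max(g * x - s * abs(g) for g in gs)
    for x in (k / 100 for k in range(-600, 601))
)

print(delta_direct, delta_game)  # the two values should agree
```

In this one-dimensional setting the inner maximum equals \(\max(|x|-s,0)\), that is, \(\Delta(x,S)\), so the game value is \(\min_x\{M_c|c-x|+\Delta(x,S)\}\).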
- Suppose
\[ \|x\|=M^{1/p}(x,x)^{p/2},\qquad p>1. \tag{8} \]
In the general case \(\|f^0\|<1\). Therefore \(\delta=\min_{\Delta(x,S)=0} M_c\|c-x\|\). Hence the point \(x^0\) lies on the boundary of the attainability region \(Q=\{x:\Delta(x,S)=0\}\) of system (1), while \(f^0\) separates the sets \(Q\) and \(R\), \(R=\{x:M_c\|c-x\|\le\delta\}\), if \(\Delta(Mc,S)\ge0\). Applying the reasoning used in deriving (3) (see \((^2)\)), for \(p=2\) we obtain
\[ \delta^2=D^2c+\Delta^2(Mc,S),\qquad D^2c=M(c,c)-(Mc,Mc). \tag{9} \]
As follows from (9), the optimal control \(u^0\) depends only on the mathematical expectation \(Mc\) of the random vector \(c\) (see, in this connection, work \((^4)\)). Using the interpretation of the vector \(f^0\) given above, one can indicate the characteristics of the random vector by which the optimal control in the problem of item 1 is determined. Denote by \(l_1,l_2,\ldots,l_n\) some basis of \(V\), with
\[ c_i=\sum_{j=1}^{n} a_j^i l_j. \]
The optimal control \(u^0\), for example for \(p=4\), depends on \(Mc\), \(Ma_k a_s\), \(Ma_k a_s a_m\), \(k,s,m=1,2,\ldots,n\).
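Formula (9) can be verified numerically on a small example. The sketch below assumes the Euclidean plane, \(S\) equal to a multiple of the identity (so that \(\Delta(Mc,S)\) is the distance from \(Mc\) to a disk), and two equally likely realizations. The data and variable names are hypothetical.

```python
import math

# Assumed toy data: Euclidean V = R^2, S = s * identity, two equally likely realizations.
s = 1.0
cs = [(4.0, 1.0), (4.0, -1.0)]
ps = [0.5, 0.5]

mc = tuple(sum(p * c[i] for c, p in zip(cs, ps)) for i in range(2))  # Mc
var = (sum(p * (c[0] ** 2 + c[1] ** 2) for c, p in zip(cs, ps))
       - (mc[0] ** 2 + mc[1] ** 2))                                  # D^2 c
miss = max(math.hypot(mc[0], mc[1]) - s, 0.0)                        # Delta(Mc, S)
delta_sq_formula = var + miss ** 2                                   # right side of (9)


def mean_sq(y):
    """M ||y - c||^2 for an attainable point y."""
    return sum(p * ((y[0] - c[0]) ** 2 + (y[1] - c[1]) ** 2) for c, p in zip(cs, ps))


# Left side of (9): brute-force minimum over a polar grid of the disk ||y|| <= s.
delta_sq_brute = min(
    mean_sq((s * j / 50 * math.cos(2 * math.pi * k / 360),
             s * j / 50 * math.sin(2 * math.pi * k / 360)))
    for j in range(51)
    for k in range(360)
)

print(delta_sq_brute, delta_sq_formula)  # both sides of (9) for the toy data
```

The agreement reflects the identity \(M\|y-c\|^2=\|y-Mc\|^2+D^2c\), which is why only \(Mc\) enters the optimal control for \(p=2\).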
- For an arbitrary specification of the norm of the space \(V\), it is, generally speaking, impossible to find characteristics of the random vector whose number would be smaller than the number of realizations: it is not difficult to construct an example in which the optimal control depends on the probability of each realization. The problem is considerably simplified in the one-dimensional case.
Denote by \(e\) (the mean) and \(d\) (the dispersion) quantities satisfying the relations
\[ \min_x M_c\|c-x\|=M_c\|c-e\|=d. \]
It follows from formulas (7) that \((f^0,e)>0\). Hence it is clear that for a one-dimensional space \(V\) the optimal control is completely determined by the mean \(e\). The quantities \(Mc\) and \(Dc\) of item 3 are the mean and the variance. In the general case one can show that \(\delta=d+q\Delta(e,S)\), \(q\in[0,1]\). From (7) one also obtains \(\delta=d+k\Delta(e,S)\), if \(\|g^0\|=k\), \(k=\max l\),
\[
\min_x \{M_c\|c-x\|+(g,x)\}=M_c\|c-e\|+(g,e),\quad \|g\|=l,\quad (g,e)>0.
\]
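For a one-dimensional space the quantities \(e\), \(d\), and the coefficient \(q\) can be computed directly. The sketch below assumes \(V=\mathbb{R}\) with attainable set \([-s,s]\) and a discrete distribution; under these assumptions \(e\) is a weighted median, and the optimal control is obtained by clamping \(e/s\) to \([-1,1]\), since a convex function on an interval is minimized at the admissible point nearest to its free minimizer. All data and names are hypothetical.

```python
# Assumed toy data: V = R, attainable set [-s, s], discrete random value c.
s = 1.5
cs = [1.0, 2.0, 4.0]
ps = [0.2, 0.2, 0.6]


def mean_abs(x):
    """M_c |c - x|."""
    return sum(p * abs(c - x) for c, p in zip(cs, ps))


# M_c|c - x| is convex and piecewise linear, so its minimum is attained at some c_i.
e = min(cs, key=mean_abs)          # the "mean" e (here a weighted median)
d = mean_abs(e)                    # the "dispersion" d

u0 = max(-1.0, min(1.0, e / s))    # clamp e / s to the admissible interval
delta = mean_abs(s * u0)           # value of the statistical problem
gap = max(abs(e) - s, 0.0)         # Delta(e, S) in one dimension
q = (delta - d) / gap if gap > 0 else 0.0

print(e, d, u0, delta, q)
```

For these data the computed `q` lies in \([0,1]\), in line with the relation \(\delta=d+q\Delta(e,S)\).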
- Suppose that the scalar function \(\alpha(t)\) in equation (1) describes a random process. We shall restrict ourselves to the case when the value of \(\alpha(t)\) can change only at the times \(t_k\), \(k=1,2,\ldots,l\), \(t_l=T\), and on each interval \([t_{k-1},t_k]\) the function \(\alpha(t)\) assumes one of the values \(\alpha_i\), \(i=1,2,\ldots,m\). Denote by \(p(i_1,i_2,\ldots,i_l)\) the probability that \(\alpha(t)\) takes the value \(\alpha_{i_1}\) on the first interval, \(\alpha_{i_2}\) on the second, and so on.
Problem. Find a control for which
\[
\delta=M\|x(T)\|=\min_{\|u\|\le 1}.
\]
Following the scheme presented above, in the general case we obtain
\[
\delta=
\max_{\{\|f_k\|<M\alpha(t_{k-1})\}_1^l}
\min_{\{x_k\}_1^l}
\left\{
\sum_{\{i_k=1\}_1^l}^{m}
p(i_1,i_2,\ldots,i_l)
\left\|c-\sum_{k=1}^{l}\alpha_{i_k}x_k\right\|
+
\sum_{k=1}^{l}\bigl[(f_k,x_k)-\|S_k^*f_k\|\bigr]
\right\}
=
\min_{\{x_k\}_1^l}
\max_{\{\|f_k\|<M\alpha(t_{k-1})\}_1^l}
\{\ldots\ldots\}.
\]
The optimal control \(u_k^0\) on the interval \([t_{k-1},t_k]\) is determined by the relation
\[
(S_k^*f_k^0,u_k^0)=\max_{\|u\|\le 1}(S_k^*f_k^0,u).
\]
Here \(S_k^*\), just like \(S\), is expressed in terms of \(A(t)\), \(b(t)\).
This result may be interpreted as follows. The optimal control on the intervals \([t_{k-1},t_k)\) satisfies the maximum principle for the deterministic system
\[
\dot x=A(t)x+b(t)u,
\]
i.e.
\[
(\psi,bu^0)=\max_{\|u\|\le 1}(\psi,bu),\qquad
\dot\psi=-A^*(t)\psi,
\]
and at the same time the trajectory \(\psi(t)\), generally speaking, has a discontinuity at the boundary of the interval \([t_{k-1},t_k)\). The explicit form of the relation between \(\psi(t_k-0)\) and \(\psi(t_k)\) remains an open question, as does the possibility of constructing a function \(m(t)\) such that the optimal control satisfies the maximum principle for the system
\[
\dot x=A(t)x+m(t)u.
\]
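The maximum condition on a single interval can be sketched as follows. The example assumes a scalar control with \(|u|\le 1\), constant \(A\) and \(b\), and a given initial value of \(\psi\); under these assumptions the maximizing control is \(u^0=\operatorname{sign}(b,\psi)\). The jumps of \(\psi\) at the times \(t_k\) are not modelled, and the function name `bang_bang_control` is hypothetical.

```python
def bang_bang_control(A, b, psi0, T, steps):
    """Euler-integrate psi' = -A^T psi on [0, T] and return u_k = sign((b, psi(t_k))).

    Assumes a scalar control |u| <= 1 and constant A, b (an illustration only).
    """
    n = len(psi0)
    psi = list(psi0)
    h = T / steps
    controls = []
    for _ in range(steps):
        switching = sum(b[i] * psi[i] for i in range(n))   # (psi, b)
        controls.append(1.0 if switching >= 0 else -1.0)   # maximizes (psi, b u)
        # (A^T psi)_i = sum_j A[j][i] * psi[j]
        dpsi = [-sum(A[j][i] * psi[j] for j in range(n)) for i in range(n)]
        psi = [psi[i] + h * dpsi[i] for i in range(n)]
    return controls


# Double integrator as assumed toy data: psi_2(t) = 0.5 - t changes sign once.
A = [[0.0, 1.0], [0.0, 0.0]]
b = [0.0, 1.0]
controls = bang_bang_control(A, b, psi0=[1.0, 0.5], T=1.0, steps=100)
switches = sum(1 for u, v in zip(controls, controls[1:]) if u != v)
print(controls[0], controls[-1], switches)
```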
The problem of determining the extremal elements \(f_k^0\) is simplified for a norm of the form (8), where \(p=2\). Let, for example, \(l=2\). Then
\[
\delta^2
=
M\bigl[\omega^{-1}(\alpha(t_0)\xi+\alpha(t_1)\eta-\omega)\bigr]^2(c,c)
+\max_{f_1,f_2}
\bigl[
\omega^{-1}(c,\xi f_1+\eta f_2)-\|S_1^*f_1\|-\|S_2^*f_2\|
\bigr]M^{-1}\gamma,
\]
where
\[
\omega=M\alpha^2(t_0)M\alpha^2(t_1)-M(\alpha(t_0),\alpha(t_1));
\]
\[
\xi\equiv M\alpha(t_0)M\alpha^2(t_1)-M\alpha(t_1)M(\alpha(t_0),\alpha(t_1));
\]
\[
\eta\equiv M\alpha(t_1)M\alpha^2(t_0)-M\alpha(t_0)M(\alpha(t_0),\alpha(t_1));
\]
\[
\gamma=(\alpha(t_1)f_1-\alpha(t_0)f_2,\ \alpha(t_1)f_1-\alpha(t_0)f_2)\omega^{-1}.
\]
The computation of the elements \(f_k^0\) can be simplified further if restrictions are imposed on the random process \(\alpha(t)\). Suppose that the values assumed by the random function \(\alpha(t)\) in different intervals are independent and, moreover, \(M\alpha(t)=0\), \(t\in[t_{k-1},t_k)\), \(k\ge 2\). Then, for example, in the case \(l=2\) we have
\[
\delta^2
=
D^2\alpha(t_0)M^{-1}\alpha^2(t_0)(c,c)
+
\max_{f_1}
\left\{
M\alpha(t_0)M^{-1}\alpha^2(t_0)(c,f_1)-\|S_1^*f_1\|
-D\alpha(t_1)\bigl[1-(f_1,f_1)M^{-1}\alpha^2(t_0)\bigr]^{1/2}\|S_2^*\|
\right\},
\]
\[
\|S_2^*\|=\min_{\|f_2\|=1}\|S_2^*f_2\|=\|S_2^*f_2^0\|.
\]
- The results presented above can be used in solving deterministic problems. If the operator \(S\) in (2) is represented in quasidiagonal form with \(k\) identical blocks, then the maximization problem (3) with respect to \(n\) variables (and hence the corresponding optimal-control problem) reduces to a game of type (7), in which the dimension of the vectors is \(n/k\). The statistical problem posed in Sec. 1 can also be interpreted as the following deterministic one.
Let there be given in the phase space \(V\) a set \(\omega\) consisting of \(m\) points \(c_1, c_2, \ldots, c_m\). It is required to transfer the trajectory of equation (1), in the time \(T - t_0\), from the point \(x(t_0) = x_0\) to such a point \(x = a\) that
\[ \min_{\|u\|\le 1} \sum_{i=1}^{m} p_i \|c_i - x(T)\| = \sum_{i=1}^{m} p_i \|c_i - a\|, \]
where the \(p_i\) are prescribed numbers, \(p_i > 0\).
Closely connected with the latter is another problem: to find a control \(u^0\) for which
\[ \sum_{i=1}^{m} p_i \|x(c_i,T)\| = \min_{\|u\|\le 1}. \]
Here \(x(c_i,t)\) is the trajectory of equation (1) for \(x(t_0) = c_i\).
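A minimal sketch of this simultaneous-transfer problem is given below for a scalar system \(\dot x=ax+u\), \(|u(t)|\le 1\), \(t_0=0\), which is an assumption made only for illustration. Here \(x(c_i,T)=e^{aT}c_i+y\), where the common shift \(y\) produced by the control ranges over \([-R,R]\), \(R=(e^{aT}-1)/a\), and any such \(y\) is realized by the constant control \(u=y/R\). The function name `simultaneous_transfer` and the data are hypothetical.

```python
import math


def simultaneous_transfer(a, T, starts, weights):
    """Minimize sum_i p_i |x(c_i, T)| for x' = a x + u, |u| <= 1, a != 0.

    Returns a constant control u and the attained value (illustration only).
    """
    phi = math.exp(a * T)                  # transition factor e^{aT}
    R = (phi - 1.0) / a                    # attainable shifts form [-R, R]

    def cost(y):
        return sum(p * abs(phi * c + y) for c, p in zip(starts, weights))

    # The cost is convex and piecewise linear in y with kinks at -phi * c_i,
    # so a free minimizer can be taken among the kinks (a weighted median).
    y_free = min((-phi * c for c in starts), key=cost)
    y = max(-R, min(R, y_free))            # nearest attainable shift
    return y / R, cost(y)


u_const, value = simultaneous_transfer(
    a=-1.0, T=math.log(2.0), starts=[2.0, 4.0, -1.0], weights=[0.5, 0.3, 0.2]
)
print(u_const, value)
```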
The problem of Sec. 5 can also be treated as the problem of hitting a neighborhood of some aggregate of points, or of simultaneously transferring the set \(\omega\) into a prescribed neighborhood of a point. In this case the admissible control may have different power.
Remarks. The investigation of the problem for a nonstrictly normed space \(V\) can be carried out by passing to the limit \((^5)\).
The problem of Sec. 1 was solved by introducing the space \(W\) with elements \(\{x(c_i,T)\}_1^m\), which is the direct sum of \(m\) spaces \(V\). If the vector \(c\) is specified by the probability function \(p(\sigma)\), \(\sigma \subset V\), then \(x(c,T)\) is an element of some function space \(W\). When \(W\) is reflexive \((^6)\), the results obtained admit a generalization. An analogous conclusion holds for the problem of Sec. 5.
Ural Polytechnic Institute named after S. M. Kirov
Received 3 XII 1964
REFERENCES
1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mishchenko, The Mathematical Theory of Optimal Processes, Moscow, 1961.
2. R. Gabasov, F. M. Kirillova, DAN, 156, No. 5 (1964).
3. K. J. Arrow, L. Hurwicz, H. Uzawa, Studies in Linear and Nonlinear Programming, Moscow, 1962.
4. J. Kramer, collected translations, Mechanics, No. 4 (80) (1963).
5. F. M. Kirillova, Applied Mathematics and Mechanics, 24, issue 2 (1960).
6. Fan Ji, collected translations, Linear Inequalities and Related Questions, Moscow, 1959.