UDC 518.731.343.1
MATHEMATICS
A. I. SUBBOTIN
DIFFERENTIAL GAMES WITH CONSTRAINTS ON PHASE STATES
(Presented by Academician N. N. Krasovskii on 8 I 1970)
A differential game is considered in which the payoff is the time until the phase point (x[t]) meets a certain set (\mathcal M). It is assumed that the motion of the controlled system must satisfy the constraint (x[t]\in \mathscr X). Formulations of game problems of this type are given, and the structure of optimal approximating strategies is described.
Let the motion of the system be described by the equation
[
dx/dt=f^{(1)}(t,x,u)+f^{(2)}(t,x,v),\qquad x(t_0)=x_0,
\tag{1}
]
where (x) is the phase vector of the system; (u) and (v) are the vectors of the players’ control actions, constrained by
[
u\in \mathcal U,\qquad v\in \mathcal V;
\tag{2}
]
the sets (\mathcal U) and (\mathcal V) are bounded and closed; the continuous functions (f^{(1)}, f^{(2)}) satisfy a Lipschitz condition with respect to (x).
We shall use the concepts and notation from paper ((^{1})), where a bibliography is also given. In particular, we shall use the following notation:
[
\mathcal F^{(1)}(t,x)=\operatorname{conv}\{f^{(1)}(t,x,u):u\in\mathcal U\},
]
[
\mathcal F^{(2)}(t,x)=\operatorname{conv}\{f^{(2)}(t,x,v):v\in\mathcal V\}.
]
Let a convex closed set (\mathscr X) be given in the phase space (\{x\}), such that the boundary (\Gamma) of this set is a smooth surface and (\mathscr X\setminus\Gamma) is an open set. Suppose that the constraint (x[t]\in\mathscr X) is imposed on the motion (x[t]); we shall treat it as a kind of unilateral (nonretaining) holonomic constraint ((^{2})). A controlled system whose motion satisfies the constraint (x[t]\in\mathscr X) will be called a nonfree controlled system. To describe its motion, we proceed as follows. Let (n=n(x)) be the vector function defined by the following conditions: if (x\in\Gamma), then (n(x)) is the unit normal to the surface (\Gamma) at the point (x), directed into (\mathscr X); if (x\notin\mathscr X), then (n(x)) is the inward unit normal to (\Gamma) at the point of (\Gamma) nearest to (x); if (x\in\mathscr X\setminus\Gamma), then (n(x)=0). Let
[
\mathcal N(t,x)=
\begin{cases}
\{\lambda n(x):\ \lambda\in[0,\,2R(t,x)]\}, & \text{if } x\in\mathscr X,\\
\{2R(t,x)\,n(x)\}, & \text{if } x\notin\mathscr X,
\end{cases}
]
where (R(t,x)) is the radius of the smallest closed Euclidean ball centered at the origin that contains the algebraic sum (\mathcal F^{(1)}(t,x)+\mathcal F^{(2)}(t,x)). Note that the set-valued function (\mathcal N=\mathcal N(t,x)) is upper semicontinuous with respect to inclusion, and each of the sets (\mathcal N(t,x)) is nonempty, convex, and closed.
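As an illustration only, the following Python sketch evaluates (n(x)), (R(t,x)), and the segment (\mathcal N(t,x)) under assumptions that are not part of the paper: (\mathscr X) is taken to be a closed ball centered at the origin, and the sets of values of (f^{(1)}) and (f^{(2)}) at the given point are finite lists. Since the Euclidean norm is convex, its maximum over the sum of the convex hulls is attained on sums of the listed points. The function names and the tolerance `tol` are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def inward_normal(x, radius=1.0, tol=1e-12):
    """n(x) for the assumed example X = closed ball of the given radius."""
    r = norm(x)
    if r < radius - tol:
        return [0.0] * len(x)          # interior point: n(x) = 0
    # On Gamma or outside X the nearest boundary point is radius * x / r,
    # and the inward unit normal there is -x / r.
    return [-c / r for c in x]

def bound_R(f1_values, f2_values):
    """R(t, x): largest norm over sums a + b of the listed velocity values."""
    return max(norm([a + b for a, b in zip(p, q)])
               for p in f1_values for q in f2_values)

def reaction_set(x, f1_values, f2_values, radius=1.0, tol=1e-12):
    """Return (n, lam_min, lam_max) with N(t, x) = {lam * n : lam_min <= lam <= lam_max}."""
    n = inward_normal(x, radius, tol)
    R = bound_R(f1_values, f2_values)
    if norm(x) <= radius + tol:
        return n, 0.0, 2.0 * R         # x in X: the whole segment [0, 2R] n(x)
    return n, 2.0 * R, 2.0 * R         # x outside X: the single vector 2R n(x)
```

For an interior point the returned normal is zero, so the whole segment collapses to the zero vector, in agreement with the definition of (\mathcal N(t,x)).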
Assuming that only normal reactions of the constraint arise when the point (x[t]) moves along (\Gamma), the motion of the nonfree controlled system (1) generated, for example, by an approximating strategy (U_a\div\mathcal U_\Delta(t,x)) ((^1)) and a program control of the second player (v=v(t)\in\mathcal V) can be described by the contingent equation (differential inclusion)
[
dx_\Delta[t]/dt \in f^{(1)}(t,x_\Delta[t],u_\Delta[t])+
f^{(2)}(t,x_\Delta[t],v(t))+\mathcal N(t,x_\Delta[t]),
]
[
u_\Delta[t]=u_\Delta[\tau_i]\in\mathcal U_\Delta(\tau_i,x_\Delta[\tau_i])
\quad \text{for } t\in[\tau_i,\tau_{i+1}).
]
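The contingent equation above can be explored numerically. The sketch below is a rough Euler scheme under assumptions that are not in the paper: the phase space is the plane, (\mathscr X) is the half-plane (x_1\le 1) with inward normal ((-1,0)), the dynamics are simply (f^{(1)}=u), (f^{(2)}=v), and on (\Gamma) the multiplier (\lambda) is selected so as to cancel the outward component of the velocity. The last choice is only one admissible selection from (\mathcal N(t,x)). The names `simulate_nonfree`, `u_rule`, and `v_prog` are hypothetical.

```python
def simulate_nonfree(x0, t0, t_end, delta, u_rule, v_prog, R, h=1e-3, tol=1e-9):
    """Euler sketch of dx/dt in u + v + N(t, x) for X = {x : x[0] <= 1}.

    u_rule(t, x) is sampled only at the partition times tau_i = t0 + i * delta,
    v_prog(t) is the second player's program control, R bounds |u + v|.
    """
    steps = round((t_end - t0) / h)
    per_interval = max(1, round(delta / h))
    x = [float(x0[0]), float(x0[1])]
    path = [(t0, tuple(x))]
    u = None
    for k in range(steps):
        t = t0 + k * h
        if k % per_interval == 0:
            u = u_rule(t, tuple(x))      # held fixed on [tau_i, tau_{i+1})
        v = v_prog(t)
        f = (u[0] + v[0], u[1] + v[1])
        if x[0] > 1.0 + tol:
            lam = 2.0 * R                # outside X: N = {2R n(x)}
        elif x[0] >= 1.0 - tol:
            # On Gamma (assumed selection): cancel the outward velocity f[0],
            # keeping lam inside the admissible segment [0, 2R].
            lam = min(2.0 * R, max(0.0, f[0]))
        else:
            lam = 0.0                    # interior of X: n(x) = 0
        # The inward normal is (-1, 0), so the reaction acts on x[0] only.
        x[0] += h * (f[0] - lam)
        x[1] += h * f[1]
        path.append((t + h, tuple(x)))
    return path
```

Because the scheme is discrete, the point may overshoot (\Gamma) by at most about (hR) before the reaction (2R\,n(x)) pushes it back; this is an artifact of the discretization, not of the model.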
The motion (x_\Delta[t]) ((x_\Delta[t_0]=x_0)) of the nonfree controlled system, generated by the strategies (U) and (V), will be denoted by the symbol
(x_\Delta^{(\mathrm n)}[t,t_0,x_0,U,V]).
The results of paper ((^1)) carry over to the game problem of bringing a nonfree controlled system to a given set (\mathcal M\subset\mathscr X). In the class of approximating strategies, this problem is formulated as follows. Let
(U_a \div \mathcal U_\Delta(t,x)) be some approximating strategy of the first player. We introduce for it the quality index
[
\gamma^0(U_a)=\sup_{\varepsilon>0}\left[\limsup_{\delta\to0}\left(\sup_{x_\Delta[t]}\vartheta^\varepsilon_{x_\Delta[t]}\right)\right],
\tag{3}
]
where (\vartheta^\varepsilon_{x_\Delta[t]}) is the moment when, for the first time,
(\rho(x_\Delta[t],\mathcal M)\le \varepsilon), (\rho(x,\mathcal M)) is the distance from (x) to (\mathcal M),
(\delta=\sup(\tau_{i+1}-\tau_i)), (i=0,1,2,\ldots),
(x_\Delta[t]=x_\Delta^{(\mathrm n)}[t,t_0,x_0,U_a,V_\tau]),
(V_\tau \div \mathcal F^{(2)}(t,x)) is the trivial strategy of the second player ((^1)).
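To make the inner quantities in (3) concrete, here is a minimal Python sketch that evaluates (\vartheta^\varepsilon_{x_\Delta[t]}) on a sampled motion and takes the worst case over a finite family of motions. It assumes that a motion is given as a list of `(t, x)` samples and that a distance function to (\mathcal M) is supplied by the caller; both the data layout and the function names are hypothetical. On sampled data the result only approximates the true first-hit time, and a finite family only bounds the supremum over all motions from below.

```python
import math

def first_hit_time(path, dist_to_M, eps):
    """First sampled time t with rho(x[t], M) <= eps, or math.inf if there is none."""
    for t, x in path:
        if dist_to_M(x) <= eps:
            return t
    return math.inf

def worst_case_time(paths, dist_to_M, eps):
    """Largest first-hit time over a finite family of sampled motions."""
    return max(first_hit_time(p, dist_to_M, eps) for p in paths)
```

A motion that never enters the (\varepsilon)-neighborhood of (\mathcal M) contributes an infinite time, so a single such motion makes the worst case infinite.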
Problem 1. It is required to construct a minimax strategy
(U_a^0 \div \mathcal U_\Delta^0(t,x)), for which
[
\gamma^0(U_a^0)=\lim_{\varepsilon\to0}\left(\inf_{\delta>0}\left[\inf_{U_a}\left(\sup_{x_\Delta[t]}\vartheta^\varepsilon_{x_\Delta[t]}\right)\right]\right);
\tag{4}
]
here (U_a) ranges over all possible approximating strategies of the first player, and
(x_\Delta[t]=x_\Delta^{(\mathrm n)}[t,t_0,x_0,U_a,V_\tau]).
We give some definitions which are analogues of the corresponding notions formulated in ((^1)) for free controlled systems.
Let a system of sets (\mathcal W(t)) ((t_0\le t\le\vartheta)) be given. We shall say that the system of sets (\mathcal W(t)) is strongly (u)-stable for the nonfree system if
(\mathcal W(t)\subset\mathscr X) for (t_0\le t\le\vartheta) and, whatever
(t_*\in[t_0,\vartheta]), (w_*\in\mathcal W(t_*)) and
(\delta\in(0,\vartheta-t_*]) may be, for any integrable function
(v(t)\in\mathcal V) there exists a motion
(x^{(\mathrm n)}[t,t_*,w_*,U_\tau,V_\Pi]) for which
(x^{(\mathrm n)}[t_*+\delta,t_*,w_*,U_\tau,V_\Pi]\in\mathcal W(t_*+\delta));
here (V_\Pi \div f^{(2)}(t,x,v(t))) is a program strategy of the second player,
(U_\tau \div \mathcal F^{(1)}(t,x)) is the trivial strategy of the first player ((^1)).
The notion of (u)-stability ((^1)) of a system of sets (\mathcal W(t)) carries over to the nonfree system in an analogous way. A strategy
(U_a^{(e)}\div\mathcal U_\Delta^{(e)}(t,x)), extremal to the system of sets (\mathcal W(t)), is defined here in the same way as in paper ((^1)).
Lemma 1. If (x_0\in\mathcal W(t_0)) and the system of sets (\mathcal W(t)) ((t_0\le t\le\vartheta)) either ((1^\circ)) is strongly (u)-stable for the nonfree system, or ((2^\circ)) is (u)-stable for the nonfree system and (\mathcal W(\vartheta)=\mathcal M), then the strategy
(U_a^{(e)}\div\mathcal U_\Delta^{(e)}(t,x)) extremal to the system of sets (\mathcal W(t)) ensures, in case ((1^\circ)), the condition
[
\limsup_{\delta\to0}\left[\sup_{x_\Delta[t]}\left(\max_{t_0\le t\le\vartheta}\rho(x_\Delta[t],\mathcal W(t))\right)\right]=0,
\tag{5}
]
and, in case ((2^\circ)), the condition
[
\limsup_{\delta\to0}\left[\sup_{x_\Delta[t]}\left(\min_{t_0\le t\le\vartheta}\rho(x_\Delta[t],\mathcal M)\right)\right]=0,
\tag{6}
]
where
(x_\Delta[t]=x_\Delta^{(\mathrm n)}[t,t_0,x_0,U_a^{(e)},V_\tau]).
We shall say that, from the position (\{t_*,x_*\}) ((t_0\le t_*\le\vartheta)), the nonfree system positionally absorbs the set (\mathcal M) by the moment (\vartheta) if
[
\sup_{\delta>0}\left(\sup_{V_a}\left[\inf_{x_\Delta[t]}\left(\min_{t_0\le t\le\vartheta}\rho(x_\Delta[t],\mathcal M)\right)\right]\right)=0,
]
where (x_\Delta[t]=x_\Delta^{(\mathrm{n})}[t,t_*,x_*,U_\tau,V_a]), and (V_a) are arbitrary approximating strategies of the second player.
By (\mathcal W^{(\mathrm{n})}(t,\vartheta)) we denote the set of all points (x) for which the nonfree system positionally absorbs (\mathcal M) by the moment (\vartheta) from the position (\{t,x\}).
Theorem 1. The system of sets (\mathcal W^{(\mathrm{n})}(t,\vartheta)) ((t_0\le t\le \vartheta)) is (u)-stable for the nonfree system. Let (\vartheta^0) be the smallest value of the parameter (\vartheta) for which (x_0\in\mathcal W^{(\mathrm{n})}(t_0,\vartheta)). Then the strategy
[
U_a^{(e)}\div \mathcal U_\Delta^{(e)}(t,x)
]
extremal to the system of sets (\mathcal W^{(\mathrm{n})}(t,\vartheta^0)) ((t_0\le t\le \vartheta^0)) is the desired minimax strategy of the first player.
In an analogous way, for the nonfree system under consideration one can introduce the concept of a maximin strategy of the second player
[
V_a^0\div \mathcal V_\Delta^0(t,x),
]
the concepts of (v)-stability and strong (v)-stability of systems of sets, and formulate a theorem on the maximin extremal strategy
[
V_a^0\div \mathcal V_\Delta^{(e)}(t,x)
]
and on the saddle point of the game under consideration, analogous to the same theorem for the free controlled system (1).
Let us consider another type of game problem with a constraint on the phase vector (x[t]). Suppose again that a set (\mathcal M) is given and that the aim of the first player is to bring the controlled system (1) to this set. At the same time, the first player seeks to arrange the meeting of the point (x[t]) with the set (\mathcal M) so that the condition (x[t]\in\mathcal H) is fulfilled, where (\mathcal H) is some closed set containing (\mathcal M). Thus, here the condition (x[t]\in\mathcal H) is not enforced by an externally prescribed constraint, as in Problem 1, but must be ensured by a suitable choice of the first player's control.
We shall call an approximating strategy (U_a) of the first player (\mathcal H)-admissible if the condition
[
\sup_{\varepsilon>0}\left(\limsup_{\delta\to0}\left[\sup_{x_\Delta[t]}\left(\max_{t_0\le t\le \vartheta^\varepsilon_{x_\Delta[t]}}\rho\bigl(x_\Delta[t],\mathcal H\bigr)\right)\right]\right)=0
]
is satisfied. Here (\vartheta^\varepsilon_{x_\Delta[t]}) is the first instant at which (\rho(x_\Delta[t],\mathcal M)\le\varepsilon), and (x_\Delta[t]=x_\Delta[t,t_0,x_0,U_a,V_\tau]) are now motions of the free system generated by the strategies (U_a) and (V_\tau).
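The innermost expression in the admissibility condition, the largest distance from the motion to (\mathcal H) up to the moment (\vartheta^\varepsilon_{x_\Delta[t]}), can be sketched in Python as follows. The sketch assumes that a motion is a list of `(t, x)` samples and that distance functions to (\mathcal M) and (\mathcal H) are supplied by the caller; the function name is hypothetical. It evaluates a single sampled motion and does not by itself verify admissibility, which also involves the supremum over motions and the passage to the limit in (\delta) and (\varepsilon).

```python
def max_violation_until_hit(path, dist_to_M, dist_to_H, eps):
    """Largest rho(x[t], H) over the samples up to and including the first
    sample with rho(x[t], M) <= eps (over the whole path if there is none)."""
    worst = 0.0
    for _, x in path:
        worst = max(worst, dist_to_H(x))
        if dist_to_M(x) <= eps:
            break                      # the eps-neighborhood of M is reached
    return worst
```

A value near zero for small (\varepsilon) and fine partitions is what the definition asks of every motion generated by an (\mathcal H)-admissible strategy.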
For (\mathcal H)-admissible strategies (U_a) we introduce the quality index (\gamma^0(U_a)), defined by equality (3), where
[
x_\Delta[t]=x_\Delta[t,t_0,x_0,U_a,V_\tau].
]
We shall call an (\mathcal H)-admissible strategy (U_a^0\div \mathcal U_\Delta^0(t,x)) minimax if equality (4) is satisfied, where the infimum on the right-hand side is taken over all possible (\mathcal H)-admissible approximating strategies (U_a).
Problem 2. Among the (\mathcal H)-admissible strategies
[
U_a\div \mathcal U_\Delta(t,x)
]
find a minimax strategy
[
U_a^0\div \mathcal U_\Delta^0(t,x).
]
A system of sets (\mathcal W(t)) ((t_0\le t\le \vartheta)) will be called (u)-stable in (\mathcal H) if (\mathcal W(t)\subset\mathcal H) for (t\in[t_0,\vartheta]) and, whatever (t_*\in[t_0,\vartheta]), (w_*\in\mathcal W(t_*)), and (\delta\in(0,\vartheta-t_*]) may be, for any integrable function (v(t)\in\mathcal V), among the motions
[
x[t,t_*,w_*,U_\tau,V_\Pi]\quad\bigl(\text{where } V_\Pi\div f^{(2)}(t,x,v(t))\bigr)
]
there is a motion (x[t]) satisfying either the condition
[
x[t_*+\delta]\in\mathcal W(t_*+\delta),
]
or the condition
[
x[t]\in\mathcal M\quad \text{for } t\in[t_*,t_*+\delta].
]
The concept of strong (u)-stability in (\mathcal H) is formulated in an analogous way (see the corresponding definition in ((^1))).
Lemma 2. If (x_0\in\mathcal W(t_0)), and the sets (\mathcal W(t)), (t_0\le t\le\vartheta), are: ((1^\circ)) strongly (u)-stable in (\mathcal H) and (\mathcal W(\vartheta)=\mathcal M), or ((2^\circ)) (u)-stable in (\mathcal H) and (\mathcal W(\vartheta)=\mathcal M), then the strategy
[
U_a^{(e)}\div \mathcal U_\Delta^{(e)}(t,x)
]
extremal to the system of sets (\mathcal W(t)) is (\mathcal H)-admissible and, in case ((1^\circ)), ensures condition (5), while in case ((2^\circ)) it ensures equality (6), where (x_\Delta[t]), (x_\Delta[t_0]=x_0), are motions of system (1) generated by the strategies
[
U_a^{(e)}\div \mathcal U_\Delta^{(e)}(t,x)
\quad\text{and}\quad
V_\tau\div \mathcal F^{(2)}(t,x).
]
We shall say that, from the position (\{t_*,x_*\}) ((t_0 \leq t_* \leq \vartheta)), the set (\mathcal M) is absorbed positionally in (\mathcal H) by the moment (\vartheta) if
[
\sup_{\delta>0}\left(\sup_{V_a}\left[\inf_{x_\Delta[t]}
\left(\min_{t_0\leq t\leq \vartheta}\rho\bigl(x_\Delta[t],\mathcal M\bigr)\right)\right]\right)=0;
]
here (x_\Delta[t]=x_\Delta[t,t_*,x_*,U_\tau,V_a]) are motions satisfying the condition (x_\Delta[t]\in\mathcal H) for all (t\in[t_*,t^*]), where (t^*) is the moment when (x_\Delta[t]\in\mathcal M) for the first time.
Let (\mathscr W(t,\vartheta\mid\mathcal H)) be the set of all points (x) for which the set (\mathcal M) is positionally absorbed in (\mathcal H) by the moment (\vartheta) from the position (\{t,x\}).
Theorem 2. The system of sets (\mathscr W(t,\vartheta\mid\mathcal H)) ((t_0\leq t\leq\vartheta)) is (u)-stable in (\mathcal H). Let (\vartheta^0) be the smallest value of the parameter (\vartheta) for which (x_0\in\mathscr W(t_0,\vartheta\mid\mathcal H)). Then the strategy (U_a^{(e)}\div \mathcal U_\Delta^{(e)}(t,x)) extremal to the system of sets (\mathscr W(t,\vartheta^0\mid\mathcal H)) is an (\mathcal H)-admissible minimax strategy of the first player (i.e., it is a solution of Problem 2).
The results of paper ((^1)) can also be extended to the problem of a maximin strategy (V_a^0) that must generate motions satisfying the phase constraint (x[t]\in\mathcal X), and to game problems with integral constraints on the resources of the players’ control actions. For differential games of the type considered in the present paper, a classification like the one described at the end of paper ((^1)) can be proposed.
The author thanks N. N. Krasovskii for discussion of the work and valuable advice.
Sverdlovsk Branch
of the V. A. Steklov Mathematical Institute
Academy of Sciences of the USSR
Received
23 XII 1969
REFERENCES
1. N. N. Krasovskii, A. I. Subbotin, DAN, 190, No. 3, 35 (1970).
2. F. R. Gantmakher, Lectures on Analytical Mechanics, Moscow, 1956.