MATHEMATICS
N. N. PETROV
ON THE EXISTENCE OF THE VALUE OF A PURSUIT GAME
(Presented by Academician N. N. Krasovskii on 7 VII 1969)
Consider the following differential game. The motion of the players \(P\) and \(E\) in the \(n\)-dimensional Euclidean space \(R_n\) is described by the equations \(P: dx/dt=f(x,u)\), \(E: dy/dt=g(y,v)\).
The functions \(f(x,u)\) and \(g(y,v)\) are defined and continuous on the sets \(R_n\times U\) and \(R_n\times V\), respectively, and satisfy a local Lipschitz condition in \(x\) and in \(y\), respectively, with a constant independent of \(u\) and \(v\). The compact sets \(U\subset R_r\) and \(V\subset R_s\) are the sets of values of the control functions \(u=u(t)\) and \(v=v(t)\). We shall assume that there exists a constant \(C>0\) such that the inequalities
\[
(x,f(x,u))\le C(\|x\|^2+1)=C((x,x)+1),
\]
\[
(y,g(y,v))\le C(\|y\|^2+1)=C((y,y)+1)
\]
hold.
Then the attainability sets from any point of the space in time \(T\) are bounded for each player.
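The boundedness can be checked by a standard Gronwall-type estimate. The following is only a sketch for player \(P\) (player \(E\) is treated in the same way), where \(x(t)\) is any trajectory generated by a measurable control and the differential inequality is understood for almost all \(t\):
\[
\frac{d}{dt}\bigl(\|x(t)\|^2+1\bigr)=2\,(x(t),f(x(t),u(t)))\le 2C\bigl(\|x(t)\|^2+1\bigr)
\quad\Longrightarrow\quad
\|x(t)\|^2+1\le\bigl(\|x_0\|^2+1\bigr)e^{2Ct},\qquad 0\le t\le T .
\]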
We shall say that a function \(u(t)\) (respectively, \(v(t)\)) belongs to the class \(M\) (respectively, \(N\)) if it is measurable and \(u(t)\in U\) (respectively, \(v(t)\in V\)).
Let \((x_0,y_0)\) be the initial positions of the players at the time \(t=0\). Both players know each other’s capabilities (i.e., the functions \(f,g\) and the sets \(U,V\)) and, moreover, at each time \(t\ge 0\) receive information about the quantities \((t,x(t),y(t))\), on the basis of which they choose their control. We now proceed to describe the class of admissible strategies.
Definition. Let \(\sigma=\{t_1<t_2<\ldots<t_n<\ldots\}\) be a sequence of nonnegative numbers with no finite accumulation points*. Denote by \(\Sigma\) the set of all such \(\sigma\). An admissible strategy \(\mathcal U\) of player \(P\) will mean a pair \(\{\sigma,U_\sigma\}\), where \(\sigma\in\Sigma\): \(\sigma=\{t_0=0,t_1,t_2,\ldots,t_n,\ldots\}\), and \(U_\sigma\) is a family of mappings \(U_\sigma^{(k)}\), \(k=0,1,\ldots,n,\ldots\), each assigning to the quantities \((t_k,x(t_k),y(t_k))\) a function \(u=u_k(t)\in M\), defined for \(t\in[t_k,t_{k+1})\). Such a strategy will be called piecewise-programmed. We shall assume that the admissible strategies \(\mathcal V\) of player \(E\) are also piecewise-programmed strategies.
The pair \((\mathcal U,\mathcal V)\), as usual, will be called a situation. In view of our assumptions, in every situation \((\mathcal U,\mathcal V)\) there are defined trajectories of motion \(x=x(t)\), \(y=y(t)\) of the players \(P\) and \(E\) for \(t\in[0,\infty)\), and, consequently, the value of the payoff function \(K(\mathcal U,\mathcal V)=F(x(t),y(t))\) is defined, where \(F\) is some functional defined for any pair of solutions \((x(t),y(t))\) given on \([0,\infty)\). We shall assume that player \(P\) seeks to decrease the quantity \(K(\mathcal U,\mathcal V)\), and player \(E\) to increase it.
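How a situation generates trajectories can be illustrated numerically. The sketch below is not from the paper: it assumes scalar states, explicit Euler integration, and partition nodes that fall on the integration grid. The names `simulate`, `chase`, and `flee`, and the example dynamics \(dx/dt=u\), \(dy/dt=v\), are hypothetical choices made for illustration.

```python
import math
from typing import Callable

Program = Callable[[float], float]               # open-loop control on [t_k, t_{k+1})
Rule = Callable[[float, float, float], Program]  # (t_k, x(t_k), y(t_k)) -> program


def simulate(f, g, x0, y0, sigma_p, rule_p, sigma_e, rule_e, T, steps=1000):
    """Euler integration of one situation (scalar states, for brevity).

    sigma_p and sigma_e are the players' partitions; each must start at 0.
    """
    dt = T / steps
    x, y = x0, y0
    kp = ke = 0                  # index of each player's next partition node
    u_prog = v_prog = None
    ts, xs, ys = [0.0], [x0], [y0]
    for i in range(steps):
        t = i * dt
        # At a node of its own partition a player observes (t, x, y)
        # and commits to a program until its next node.
        while kp < len(sigma_p) and sigma_p[kp] <= t + 1e-12:
            u_prog = rule_p(sigma_p[kp], x, y)
            kp += 1
        while ke < len(sigma_e) and sigma_e[ke] <= t + 1e-12:
            v_prog = rule_e(sigma_e[ke], x, y)
            ke += 1
        x, y = x + dt * f(x, u_prog(t)), y + dt * g(y, v_prog(t))
        ts.append((i + 1) * dt)
        xs.append(x)
        ys.append(y)
    return ts, xs, ys


def chase(t_k, x, y):
    u = math.copysign(1.0, y - x)         # head toward E; assumes U = [-1, 1]
    return lambda t: u


def flee(t_k, x, y):
    v = 0.5 * math.copysign(1.0, y - x)   # run away from P; assumes V = [-0.5, 0.5]
    return lambda t: v


ts, xs, ys = simulate(lambda x, u: u, lambda y, v: v, 0.0, 1.0,
                      [0.0, 0.25, 0.5, 0.75], chase,
                      [0.0, 0.5], flee, T=1.0)
print(abs(xs[-1] - ys[-1]))   # terminal distance in this situation
```

In this example the distance shrinks at rate \(0.5\) from an initial value of \(1\), so the terminal distance at \(T=1\) is about \(0.5\). The two players use different partitions, which the definition of a piecewise-programmed strategy allows.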
A situation \((\mathcal U_\varepsilon^*,\mathcal V_\varepsilon^*)\) is called an \(\varepsilon\)-equilibrium situation if, for any admissible strategies \(\mathcal U\) and \(\mathcal V\), the inequality
\[
K(\mathcal U_\varepsilon^*,\mathcal V)-\varepsilon
\le K(\mathcal U_\varepsilon^*,\mathcal V_\varepsilon^*)
\le K(\mathcal U,\mathcal V_\varepsilon^*)+\varepsilon
\]
holds.
The strategies \(\mathcal U_\varepsilon^*\) and \(\mathcal V_\varepsilon^*\) are called \(\varepsilon\)-optimal, and the limit
\[
\lim_{\varepsilon\to 0+} K(\mathcal U_\varepsilon^*,\mathcal V_\varepsilon^*)
\]
is called the value of the game.
* That is, \(t_n\to\infty\).
In the present paper we shall consider pursuit games of the following three types.
I. \(F(x(t), y(t))=\|x(T)-y(T)\|\).
II. \(F(x(t), y(t))=\min\limits_{t\in[0,T]}\|x(t)-y(t)\|\).
The duration of the game \(T\) in both cases is prescribed in advance and known to both players.
III. \(F(x(t),y(t))=\min\limits_{t\in\Delta} t\), where \(\Delta\) is the set of all \(t\) for which
\[
\|x(t)-y(t)\|=l .
\]
The number \(l\), the capture radius, is prescribed in advance and known to both players. If in some situation \(l\)-capture does not occur (that is, \(\Delta\) is empty), we shall assume that \(F(x(t),y(t))=\infty\).
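The three payoff functionals can be evaluated on sampled trajectories as follows. This is a minimal sketch with hypothetical function names. It assumes the trajectories are given as lists of points at the sample times `ts`, and for type III it reports the first sampled time at which the distance is at most \(l\), which is a discrete stand-in for the first time the distance equals \(l\) when the initial distance is at least \(l\).

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def _distances(xs: Sequence[Point], ys: Sequence[Point]):
    return [math.dist(x, y) for x, y in zip(xs, ys)]


def payoff_terminal(ts, xs, ys, T):
    """Type I: distance at the last sample with t <= T."""
    i = max(k for k, t in enumerate(ts) if t <= T)
    return math.dist(xs[i], ys[i])


def payoff_min_distance(ts, xs, ys, T):
    """Type II: least sampled distance on [0, T]."""
    return min(d for t, d in zip(ts, _distances(xs, ys)) if t <= T)


def payoff_capture_time(ts, xs, ys, l):
    """Type III: first sampled time with distance <= l, else infinity."""
    for t, d in zip(ts, _distances(xs, ys)):
        if d <= l:
            return t
    return math.inf


# Toy data: P moves along the first axis, E stays at (1, 0).
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
xs = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0)]
ys = [(1.0, 0.0)] * 5
print(payoff_terminal(ts, xs, ys, 2.0),
      payoff_min_distance(ts, xs, ys, 2.0),
      payoff_capture_time(ts, xs, ys, 0.5))
```

On a coarse grid the type III payoff can overestimate the true capture time, so the sampling step should be small relative to \(l\).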
Let us first consider games with bounded time \(T\). A game of type I will be denoted by \(\Gamma(x_0,y_0,T)\), and its value by \(V(x_0,y_0,T)\). Let us introduce the auxiliary game \(\underline{\Gamma}(x_0,y_0,\sigma)\), which differs from \(\Gamma(x_0,y_0,T)\) only in the players’ information state and in the class of admissible strategies. Let \(\sigma\in\Sigma:\ t_0=0<t_1<t_2<\cdots<t_m<T=t_{m+1}\) (the \(t_i\) greater than \(T\) are of no interest to us). In the game \(\underline{\Gamma}(x_0,y_0,\sigma)\), both players at each time \(t_k\) receive information about the quantities \((t_k,x(t_k),y(t_k))\), and player \(P\), in addition, learns the control \(v=v_k(t)\) chosen by player \(E\) on the interval \([t_k,t_{k+1})\).
An admissible strategy \(\mathcal V\) for player \(E\) in the game \(\underline{\Gamma}(x_0,y_0,\sigma)\) will mean a family of mappings \(V_\sigma^{(k)}\), \(k=0,1,\ldots,m\), assigning to the quantities \((t_k,x(t_k),y(t_k))\) a function \(v=v_k(t)\in N\), defined for \(t\in[t_k,t_{k+1})\).
An admissible strategy \(\mathcal U\) for player \(P\) in the game \(\underline{\Gamma}(x_0,y_0,\sigma)\) will mean a family of mappings \(U_\sigma^{(k)}\), \(k=0,1,\ldots,m\), assigning to the quantities \((t_k,x(t_k),y(t_k))\) and \(v=v_k(t)\) a function \(u=u_k(t)\in M\), defined for \(t\in[t_k,t_{k+1})\).
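The game \(\overline{\Gamma}(x_0,y_0,\sigma)\) is defined analogously, with the sole difference that in it player \(P\) is discriminated: player \(E\), in addition, learns the control \(u=u_k(t)\) chosen by player \(P\) on the interval \([t_k,t_{k+1})\).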
Then the following theorems hold:
Theorem 1. The values of the games \(\underline{\Gamma}(x_0,y_0,\sigma)\) and \(\overline{\Gamma}(x_0,y_0,\sigma)\) exist and satisfy a Lipschitz condition with respect to the initial positions, uniformly in \(\sigma\in\Sigma\). If the vectograms of the systems are convex, then in both games there exists a saddle-point situation.
Let \(\underline V(x_0,y_0,\sigma)\) and \(\overline V(x_0,y_0,\sigma)\) be the values of the games \(\underline{\Gamma}(x_0,y_0,\sigma)\) and \(\overline{\Gamma}(x_0,y_0,\sigma)\), respectively.
Theorem 2. The value of the game \(\Gamma(x_0,y_0,T)\) exists and is equal to
\[
\sup_{\sigma\in\Sigma} \underline V(x_0,y_0,\sigma)
=
\inf_{\sigma\in\Sigma}\overline V(x_0,y_0,\sigma),
\]
and satisfies a Lipschitz condition with respect to \((x_0,y_0,T)\).
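The squeeze between the values of the two auxiliary games can be observed on a toy example. The sketch below is not the paper's construction. It assumes scalar dynamics \(dz/dt=u-v\) for \(z=x-y\), finite control sets, a uniform partition with step \(h=T/m\), and the type I payoff \(|z(T)|\). The function `lower_upper` and the sets `U2`, `V2` are hypothetical. The lower game lets \(P\) reply to the control of \(E\) (max over \(v\), then min over \(u\)); the upper game reverses the order.

```python
from functools import lru_cache

U2 = (-2, -1, 0, 1, 2)   # twice P's control values: u in {-1, -0.5, 0, 0.5, 1}
V2 = (-1, 0, 1)          # twice E's control values: v in {-0.5, 0, 0.5}


def lower_upper(j0, m, T=1.0):
    """Lower and upper values for z = x - y, dz/dt = u - v, payoff |z(T)|.

    The partition is uniform with step h = T/m, and z is stored as an
    integer j with z = j * h / 2, so the recursion is exact.
    """
    @lru_cache(maxsize=None)
    def lower(k, j):      # E commits to v first, P replies knowing v
        if k == m:
            return abs(j)
        return max(min(lower(k + 1, j + u - v) for u in U2) for v in V2)

    @lru_cache(maxsize=None)
    def upper(k, j):      # P commits to u first, E replies knowing u
        if k == m:
            return abs(j)
        return min(max(upper(k + 1, j + u - v) for v in V2) for u in U2)

    scale = T / m / 2     # converts the integer state back to a distance
    return lower(0, j0) * scale, upper(0, j0) * scale


for m in (2, 4, 8):
    print(m, lower_upper(0, m))
```

Starting from \(z_0=0\), the lower value in this toy game is \(0\) for every \(m\), while the upper value equals \(h/2\) and shrinks as the partition is refined, which is the kind of behavior the equality in Theorem 2 describes.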
Analogous theorems hold in the case
\[
F(x(t),y(t))=\min_{t\in[0,T]}\|x(t)-y(t)\|,
\]
but their proofs are considerably more complicated. Let \(V_{\min}(x_0,y_0,T)\) be the value of this game, defined for \(T\in[0,\infty)\). It is easy to show that \(V_{\min}(x_0,y_0,T)\) is nonincreasing in \(T\) on \([0,\infty)\); since it is also bounded below by zero, there exists
\[
\lim_{T\to\infty} V_{\min}(x_0,y_0,T)=V_0(x_0,y_0)\ge 0 .
\]
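One way to see the monotonicity in \(T\) is the following sketch, which assumes that the value is written as \(\inf_{\mathcal U}\sup_{\mathcal V}K=\sup_{\mathcal V}\inf_{\mathcal U}K\), as the existence of \(\varepsilon\)-equilibrium situations for every \(\varepsilon>0\) permits. For \(T_1\le T_2\) and any pair of trajectories,
\[
\min_{t\in[0,T_2]}\|x(t)-y(t)\|\le\min_{t\in[0,T_1]}\|x(t)-y(t)\| ,
\]
so the payoff in every situation \((\mathcal U,\mathcal V)\) does not increase when \(T_1\) is replaced by \(T_2\), and the inequality is preserved on passing to \(\inf\sup\).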
Let us now consider game III, which we shall denote by \(\gamma(x_0,y_0,l)\). It is natural to assume that \(0\le l\le \|x_0-y_0\|\). Then the following is true:
Theorem 3. For almost all \(l\in[V_0(x_0,y_0),\|x_0-y_0\|]\), the value of the game \(\gamma(x_0,y_0,l)\) exists.
In conclusion we shall make several remarks.
1°. Since the attainability sets of players \(P\) and \(E\) over a time not exceeding \(T\) may fail to be closed, a 0-equilibrium situation in games I–III may fail to exist (and, likewise, the value of the game \(\gamma(x_0, y_0, l)\) need not exist for every \(l \in [V_0(x_0,y_0), \|x_0-y_0\|]\)).
2°. It is not difficult to show that the trajectories of the players \(P\) and \(E\) obtained as a result of applying \(\varepsilon\)-optimal strategies may be chosen from among trajectories corresponding to piecewise-constant controls.
3°. It follows from the results of this paper that the pursuit problem with bounded time is posed correctly (the value of the game exists, is unique, and depends continuously on the initial data). On the other hand, the value of game III may fail to be a continuous function of the initial positions even in the case \(g(y,v)\equiv 0\). In paper \((^6)\), necessary and sufficient conditions were obtained for the continuity of the value of the game (with respect to \(x_0\)) under the following assumptions:
- \(n=2,\quad U=\{u_1,\ldots,u_m\},\quad f(x,u)\) is holomorphic in a neighborhood of the initial point for each \(u=u_i\).
- \(g(y,v)\equiv 0,\quad y_0=0\).
- \(l=0\).
4°. Some results devoted to pursuit games with bounded time are contained in papers \((^{7-10})\). A proof of the existence of the value of the game in some special cases may be found in papers \((^{1-5})\).
Leningrad State University named after A. A. Zhdanov
Received 10 IV 1969
REFERENCES
\(^1\) N. N. Krasovskii, A. I. Subbotin, Differential Equations, 4, No. 12, 2159 (1968).
\(^2\) N. B. Gusyatnikov, M. S. Nikolskii, DAN, 184, No. 3, 518 (1969).
\(^3\) B. A. Shokhet, L. A. Petrosyan, Lithuanian Mathematical Collection, 8, 20 (1968).
\(^4\) B. N. Pshenichnyi, DAN, 184, No. 2, 285 (1969).
\(^5\) C. Ryll-Nardzewski, Ann. Math. Stud., Adv. in Game Theory, No. 52, 113 (1964).
\(^6\) N. N. Petrov, Differential Equations, 5, No. 5 (1969).
\(^7\) W. H. Fleming, J. Math. Anal. and Appl., No. 3, 102 (1961).
\(^8\) L. A. Petrosyan, Vestnik LGU, No. 1, 42 (1968).
\(^9\) N. N. Krasovskii, Abstracts of the First All-Union Conference on Game Theory, Yerevan, 1968, p. 125.
\(^{10}\) N. N. Krasovskii, Differential Equations, 5, No. 3, 407 (1969).