UDC 518-9

MATHEMATICS

Submitted 1968-01-01 | RussiaRxiv: ru-196801.28635 | Translated from Russian

L. A. PETROSYAN

A MAPPING ON A FAMILY OF DIFFERENTIAL PURSUIT GAMES

(Presented by Academician Yu. V. Linnik on 6 III 1967)

In works devoted to differential games (see (1–6)), pursuit games without restrictions on the admissible movements of the players have mainly been considered. Such restrictions complicate the game so much that the traditional technique for solving “in the small” becomes difficult to apply. In the present work a method is proposed for solving a game “in the small,” using the game-theoretic specificity of the problem and the specificity of “geometric” restrictions on the possible movements of the players.

Before proceeding to the method itself, let us describe the class of pursuit games to which it may be applied. In doing so we shall use the following notation: \(P\) is the pursuer, \(E\) is the pursued, \(\varphi=(\varphi_1,\ldots,\varphi_l)\) is the pursuer’s strategy, \(\psi=(\psi_1,\ldots,\psi_k)\) is the strategy of the pursued, \(\rho(x,y)\) is the Euclidean distance between the points \(x,y\) of the space \(R^n\), \(x(t)\) is the trajectory of the pursuer, and \(y(t)\) is the trajectory of the pursued.

Definition of the game.
1. The game is antagonistic.
2. The kinematic equations have the form
\[ \dot{x}_i=f_i(x,\varphi),\qquad i=1,\ldots,n,\qquad \varphi\in\Phi; \tag{1} \]
\[ \dot{y}_j=g_j(y,\psi),\qquad j=1,\ldots,n,\qquad \psi\in\Psi, \tag{2} \]
where \(f_i\) and \(g_j\) are smooth functions.
3. During the course of the game the player \(P\) cannot leave some convex closed set \(A\subset R^n\), and the player \(E\) cannot leave some convex closed set \(B\subset R^n\), with \(A\supset B\).
4. The duration of the game is bounded by the number \(T\ge 0\).
5. The aim of \(P\) is to minimize the quantity \(\rho(x(T),y(T))\) at the moment when the game ends (the player \(E\) pursues the opposite aim).
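The five items above can be sketched numerically. The following is only an illustrative sketch, not part of the original paper: it integrates equations (1) and (2) with an explicit Euler scheme and returns the terminal payoff \(\rho(x(T),y(T))\). The names `simulate` and `payoff`, the feedback form of the strategies, and the step count are assumptions made for illustration, and the state constraints \(A\) and \(B\) are deliberately not enforced.

```python
import math

def simulate(f, g, phi, psi, x0, y0, T, steps=1000):
    """Euler integration of x' = f(x, phi), y' = g(y, psi) on [0, T].

    phi and psi are hypothetical feedback strategies (t, x, y) -> control.
    The constraint sets A and B of item 3 are NOT enforced in this sketch.
    """
    dt = T / steps
    x, y = list(x0), list(y0)
    for k in range(steps):
        t = k * dt
        a = phi(t, x, y)          # control chosen by the pursuer P
        b = psi(t, x, y)          # control chosen by the pursued E
        dx, dy = f(x, a), g(y, b)
        x = [xi + dt * di for xi, di in zip(x, dx)]
        y = [yi + dt * di for yi, di in zip(y, dy)]
    return x, y

def payoff(x_T, y_T):
    """Terminal payoff of item 5: Euclidean distance at the end of the game."""
    return math.dist(x_T, y_T)
```

For simple motion one may take `f = lambda x, a: a` and `g = lambda y, b: b`, so that the controls are the velocity vectors themselves.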

The game under consideration depends on the following parameters: the initial position of \(P\), the initial position of \(E\), and the duration \(T\); therefore we shall denote it by \(\Gamma(x,y,T)\).

Definition 1. The set of points \(C_P^T(x)\) into which the player \(P\) can get at the time \(T\) from the initial position \(x\), using his strategies \(\varphi\in\Phi\), is called the action set of the pursuer \(P\) (a generalization of the notion of the attainability set, see (4), to the case when the player \(P\) can move only in the set \(A\)). In an analogous way the action set of the pursued \(E\), \(C_E^T(y)\), is defined.
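As a rough illustration of Definition 1, the sketch below samples points of an action set for simple motion in the plane under a state constraint. It is an assumption-laden example, not the paper's construction: the motion is taken to have speed at most \(v\), controls are piecewise constant, the set \(A\) is supplied as a hypothetical membership test `in_A`, and trajectories that leave \(A\) are simply discarded.

```python
import math
import random

def sample_action_set(x0, v, T, in_A, n_paths=2000, n_pieces=4, substeps=20, seed=0):
    """Sample endpoints of admissible trajectories for simple motion with speed <= v.

    Each trajectory uses n_pieces random constant velocities; a trajectory is
    rejected as soon as it leaves the set A (tested by the predicate in_A).
    The returned points all belong to the action set, but only approximate it.
    """
    rng = random.Random(seed)
    dt = T / (n_pieces * substeps)
    endpoints = []
    for _ in range(n_paths):
        p = list(x0)
        admissible = True
        for _ in range(n_pieces):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            speed = rng.uniform(0.0, v)
            for _ in range(substeps):
                p[0] += dt * speed * math.cos(angle)
                p[1] += dt * speed * math.sin(angle)
                if not in_A(p):
                    admissible = False
                    break
            if not admissible:
                break
        if admissible:
            endpoints.append((p[0], p[1]))
    return endpoints
```

With `in_A = lambda p: p[1] >= 0` (a half-plane) every sampled endpoint lies in the intersection of the disc of radius \(vT\) about the starting point with the half-plane, which is what one expects for a convex \(A\).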

We shall assume in what follows that the classes of strategies of the players are such that the action sets of the players are always closed sets.

For any pair of points \(x\in A,\ y\in B\), define the function \(\hat{\rho}_T(x,y)\) as follows:
\[ \hat{\rho}_T(x,y)=\max_{\eta\in C_E^T(y)}\ \min_{\xi\in C_P^T(x)}\rho(\xi,\eta), \tag{3} \]
where \(\rho(\xi,\eta)\) is the Euclidean metric.
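Formula (3) can be checked on the simplest case. The sketch below assumes unconstrained simple motion in the plane with maximal speeds \(v\) (for \(P\)) and \(u\) (for \(E\)), so that both action sets are discs of radii \(vT\) and \(uT\). Under that assumption the max-min reduces to \(\max\{0,\ \rho(x,y)+uT-vT\}\). The function names are hypothetical.

```python
import math

def rho_hat_simple(x, y, v, u, T):
    """Closed form of (3) when the action sets are discs of radii v*T and u*T."""
    return max(0.0, math.dist(x, y) + u * T - v * T)

def rho_hat_bruteforce(x, y, v, u, T, n=720):
    """Evaluate (3) directly: maximize over eta the distance from eta to C_P.

    The distance to a convex set is a convex function of eta, so its maximum
    over the disc C_E is attained on the boundary circle; only boundary points
    are sampled here.
    """
    best = 0.0
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        eta = (y[0] + u * T * math.cos(angle), y[1] + u * T * math.sin(angle))
        # inner minimum: distance from eta to the disc C_P of radius v*T about x
        d = max(0.0, math.dist(eta, x) - v * T)
        best = max(best, d)
    return best
```

The maximizing point \(\eta\) lies on the ray from \(x\) through \(y\), at distance \(uT\) beyond \(y\).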

Definition 2. Let \(\hat{\rho}_{T}(x,y)=\rho(\xi,\eta)\), where the points \(\xi\in C_P^T(x)\) and \(\eta\in C_E^T(y)\) attain the max-min in (3). Any trajectory \(x^{*}(t)\) joining the points \(x\) and \(\xi\) (so that \(x^{*}(0)=x,\ x^{*}(T)=\xi\)) is called a conditionally optimal trajectory of the pursuer \(P\). Any trajectory \(y^{*}(t)\) joining the points \(y\) and \(\eta\) (so that \(y^{*}(0)=y,\ y^{*}(T)=\eta\)) is called a conditionally optimal trajectory of the pursued \(E\).

Definition 3. Define a mapping \(\mathfrak{M}[\Gamma(x,y,T)]\) of the family of pursuit games \(\Gamma(x,y,T)\) into the set \(B\) as follows:

\[ \mathfrak{M}[\Gamma(x,y,T)] = \Bigl\{M\in C_E^T(y):\ \hat{\rho}_{T}(x,y)=\min_{\xi\in C_P^T(x)}\rho(\xi,M)\Bigr\}. \]

A point \(M\in B\) is called the center of pursuit in the game \(\Gamma(x,y,T)\).

Let \(x^{*}(t), y^{*}(t)\) be conditionally optimal trajectories in the game \(\Gamma(x,y,T)\), and let
\[ \mathfrak{M}(t)=\{M(t):\ M(t)=\mathfrak{M}[\Gamma(x^{*}(t),y^{*}(t),T-t)]\}. \]
Suppose that the following conditions hold:

A. The set \(\mathfrak{M}(t)\) consists of one element for all \(0\le t\le T\) (the mapping \(\mathfrak{M}\) is single-valued along any pair of conditionally optimal trajectories).

B. \(M(t_{1})=M(t_{2})\) for all \(t_{1},t_{2}\in[0,T]\) (the mapping \(\mathfrak{M}\) is invariant along any pair of conditionally optimal trajectories).

Then the point \(M=\mathfrak{M}[\Gamma(x,y,T)]\) is called the unique invariant center of pursuit.
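Conditions A and B can be verified directly in the simplest setting. The sketch below assumes unconstrained simple motion in the plane with speeds \(v\) and \(u\) and \(\hat{\rho}_T(x,y)>0\); then the center of pursuit is the point at distance \(uT\) beyond \(y\) on the ray from \(x\) through \(y\), and both conditionally optimal trajectories are straight lines along that ray. The helper names are hypothetical.

```python
import math

def center_of_pursuit(x, y, u, tau):
    """Center of pursuit for unconstrained simple motion with remaining time tau.

    Assumes x != y; the center is y shifted by u*tau along the direction x -> y.
    """
    d = math.dist(x, y)
    e = ((y[0] - x[0]) / d, (y[1] - x[1]) / d)
    return (y[0] + u * tau * e[0], y[1] + u * tau * e[1])

def centers_along_optimal_paths(x, y, v, u, T, steps=10):
    """Recompute the center M(t) in the games Gamma(x*(t), y*(t), T - t)."""
    d = math.dist(x, y)
    e = ((y[0] - x[0]) / d, (y[1] - x[1]) / d)
    centers = []
    for k in range(steps + 1):
        t = T * k / steps
        x_t = (x[0] + v * t * e[0], x[1] + v * t * e[1])   # conditionally optimal x*(t)
        y_t = (y[0] + u * t * e[0], y[1] + u * t * e[1])   # conditionally optimal y*(t)
        centers.append(center_of_pursuit(x_t, y_t, u, T - t))
    return centers
```

In this setting every recomputed center coincides with the initial one, which is exactly condition B.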

Theorem. Suppose that in the game \(\Gamma(x,y,T)\) there exists a unique invariant center of pursuit and \(\hat{\rho}_{T}(x,y)>0\); then there exists an equilibrium situation in pure strategies, the conditionally optimal trajectories are optimal, the optimal strategies of the players \(P\) and \(E\) consist at each instant of choosing the direction of the conditionally optimal trajectory, and the value of the game is equal to \(\hat{\rho}_{T}(x,y)\).

Proof. Let \(\varphi^{*},\psi^{*}\) be the strategies of the players \(P\) and \(E\) that choose at each instant the direction of the conditionally optimal trajectories. In the situation \((\varphi^{*},\psi^{*})\) the payoff function is exactly equal to \(\hat{\rho}_{T}(x,y)\), so that, to prove the theorem, it is sufficient to prove the inequality

\[ K(x,y;\varphi,\psi^{*})\ge \hat{\rho}_{T}(x,y)\ge K(x,y;\varphi^{*},\psi) \tag{4} \]

for all strategies \(\varphi\) and \(\psi\) of the players \(P\) and \(E\).

The left-hand part of inequality (4) is obvious, since the point \(M\) is the point in the set \(C_{E}^{T}(y)\) farthest from the set \(C_{P}^{T}(x)\), and when the player \(P\) deviates from the conditionally optimal trajectory his set of actions contracts, remaining in \(C_{P}^{T}(x)\), and the distance to the point \(M\) can only increase.

Let us prove the right-hand part of inequality (4). For this purpose consider the auxiliary game \(\Gamma_{\Delta}(x,y,T)\), in which \(P\), at each instant of time, has complete information about the segment of the trajectory of \(E\) for a time \(\Delta t>0\) ahead. This means, in particular, that at each instant of time he knows the point \(y(t+\Delta t)\) at which player \(E\) will arrive after time \(\Delta t\). Suppose now that player \(E\) deviates from his conditionally optimal trajectory and, for some time \(\delta>\Delta t\), chooses a control \(\psi\) different from \(\psi^{*}\). Since the center of pursuit is unique and \(\hat{\rho}_{T}(x,y)>0\), the point \(M'\), farthest in the set \(C_{E}^{T-\Delta t}(y(\Delta t))\) from the set \(C_{P}^{T}(x)\), is situated strictly closer to the set \(C_{P}^{T}(x)\) than the point \(M\). Aiming from the point \(x\) at the point \(M'\) (during the time \(\Delta t\)), the pursuer \(P\) can always guarantee approach to \(E\) to within a distance not greater than \(\hat{\rho}_{T-\Delta t}(x(\Delta t),y(\Delta t))\) (such aiming is indeed feasible owing to complete information about the segment of the trajectory of player \(E\) for the time \(\Delta t > 0\) ahead). However,

\[ \hat{\rho}_{T-\Delta t}(x(\Delta t), y(\Delta t)) < \hat{\rho}_{T}(x, y), \tag{5} \]

since the point \(M'\) (which is the center of pursuit in the game \(\Gamma(x(\Delta t), y(\Delta t), T-\Delta t)\)) lies strictly closer to the set \(C_P^T(x)\) than the point \(M\) (by the definition and uniqueness of the center of pursuit). Inequality (5) is satisfied for all \(\Delta t > 0\). Letting \(\Delta t\) tend to zero, if we assume that in the limiting game \(P\) has information about the choice of player \(E\) at each instant of time, then arguments similar to those given above show that inequality (5) remains valid if at the initial instant of time the choice of player \(E\) does not coincide with the direction of the conditionally optimal trajectory. Thus, we may write

\[ \hat{\rho}_{T-0}(x(+0), y(+0)) < \hat{\rho}_{T}(x, y) \tag{6} \]

(the game \(\Gamma(x(+0), y(+0), T-0)\) is a pursuit game in which the choices of the players \(P, E\) at the initial instant of time are fixed). From (6) we obtain

\[ \hat{\rho}_{T-\delta}(x(\delta), y(\delta)) < \hat{\rho}_{T}(x, y), \tag{7} \]

for the game in which player \(P\) has information only about the choice of the control \(\psi\) at each instant of time.

Under the assumption of discrimination (player \(P\) at each instant of time has information about the choice of player \(E\) at that instant), the right-hand part of inequality (4) follows immediately from (7), since player \(P\) can always guarantee approach to within a distance not greater than \(\hat{\rho}_{T-\delta}(x(\delta), y(\delta))\) if the choice of \(E\) does not coincide with \(\psi^*\) on the time interval \([0,\delta]\). The theorem is proved.
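The saddle-point inequality (4) can be illustrated numerically in the simplest case. The sketch below is not the proof above but a discrete-time check under stated assumptions: unconstrained simple motion in the plane, speeds \(v\) for \(P\) and \(u\) for \(E\), and \(P\) aiming at every step at the current center of pursuit (which, in this setting, lies on the ray from \(P\) through \(E\)). The function `play` and the evader's open-loop heading `evader_angle` are hypothetical names.

```python
import math

def play(x, y, v, u, T, evader_angle, steps=2000):
    """Terminal distance when P aims at the current center of pursuit.

    E moves with speed u in the direction evader_angle(t) (radians).
    For unconstrained simple motion the center of pursuit lies on the ray
    from x through y, so aiming at it means moving along that ray.
    """
    dt = T / steps
    x, y = list(x), list(y)
    for k in range(steps):
        t = k * dt
        d = math.dist(x, y)
        if d <= v * dt:
            x = list(y)                      # P reaches E's position within this step
        else:
            e = ((y[0] - x[0]) / d, (y[1] - x[1]) / d)
            x[0] += v * dt * e[0]
            x[1] += v * dt * e[1]
        a = evader_angle(t)
        y[0] += u * dt * math.cos(a)
        y[1] += u * dt * math.sin(a)
    return math.dist(x, y)
```

With \(x=(0,0)\), \(y=(4,0)\), \(v=1.5\), \(u=1\), \(T=2\) one has \(\hat{\rho}_T(x,y)=3\). The heading `lambda t: 0.0` reproduces the conditionally optimal trajectory of \(E\) and yields a terminal distance of about 3, while any other constant heading yields a smaller one, in line with the right-hand part of (4).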

Corollary. If in the game \(\Gamma(x,y,T)\) the assumption of discrimination is discarded, then the assertion of the theorem remains valid under the additional assumption that the value of the game in pure strategies exists.

Indeed, in this case knowledge of the choice of player \(E\) does not decrease the value of the game (the fundamental principle of game theory), and, consequently, the value of the game coincides with that in the game with discrimination, i.e., is equal to \(\hat{\rho}_{T}(x,y)\) and is realized on conditionally optimal trajectories.

Definition 4. The set of points \(x, y, T\) for which the mapping \(\mathfrak{M}[\Gamma(x,y,T)]\) is not single-valued is called a singular surface, and the corresponding games \(\Gamma(x,y,T)\) are called singular games. A singular surface is called a dispersion surface if, for any pair of conditionally optimal trajectories, the following holds: from the fact that at some instant of time \(0 \leq t \leq T\)

\[ [x^*(t), y^*(t)] \in DS \]

(the dispersion surface), it follows that \(t=0\) (i.e., in an equilibrium situation the optimal trajectories do not intersect the dispersion surface).

The theorem formulated above gives a simple method for solving the game in regions of space bounded by singular surfaces (i.e., “locally”). Examples of pursuit games in which an invariant center of pursuit exists are: simple pursuit (see (2)); simple pursuit in a half-plane (see (4)), where the only singular surface is a dispersion surface; the dynamic pursuit game in the presence of friction forces (see (5, 6)), from initial positions not belonging to a singular surface; the homicidal chauffeur game (see (2)), in the region bounded by the barrier (see (2)) and the equivocal surface; etc.

Let us note that the method proposed here for solving the game “locally” actually works in a broader region than Isaacs’s method for solving the game “locally” (see (2)), which is based on the solution of the fundamental first-order partial differential equation. In particular, the universal surface that is a singular surface of this equation (see the homicidal chauffeur game (2)) is, in our case, not a singular surface.

The simplest example of a game in which there exists a singular surface distinct from the dispersion surface is the game “lion and man” (see (7)). The kinematic equations have the form

\[ \begin{aligned} \dot{x}_i &= \varphi_i, \qquad \varphi_1^2+\varphi_2^2=v^2, \qquad i=1,2;\\ \dot{y}_j &= \psi_j, \qquad \psi_1^2+\psi_2^2=u^2, \qquad j=1,2. \end{aligned} \]

The game takes place in a disc \(C\) of radius \(R\). The duration of the game is \(T=R/v\). It is easy to show that the positions in which \(P\) is at the center of the disc and \(E\) is at any point of its boundary belong to a singular surface distinct from the dispersion surface.
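One way to see why such a position is singular is sketched below. It is an interpretation under assumptions, not the paper's argument: \(R\) is taken to be the radius of the disc \(C\), the action sets are taken to be the discs of radii \(vT\) and \(uT\) intersected with \(C\), and \(P\) starts at the center. Since \(vT=R\), the action set of \(P\) is all of \(C\), so every point of the action set of \(E\) is at distance zero from it and the maximum in (3) is attained at many points. The code does not check that the surface is distinct from a dispersion surface.

```python
import math

def dist_to_disc(p, center, radius):
    """Euclidean distance from the point p to a closed disc."""
    return max(0.0, math.dist(p, center) - radius)

def maximizers_lion_and_man(R, v, u, n=360, tol=1e-9):
    """Count sampled maximizers of (3) with P at the center and E on the boundary.

    Returns (rho_hat, number_of_maximizers, number_of_sampled_points).
    """
    T = R / v
    x = (0.0, 0.0)            # P at the center of the disc C
    y = (R, 0.0)              # E on the boundary of C
    points = []
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        for frac in (0.5, 1.0):
            q = (y[0] + frac * u * T * math.cos(angle),
                 y[1] + frac * u * T * math.sin(angle))
            if math.dist(q, x) <= R:         # keep only points of C_E inside C
                points.append(q)
    # C_P is the disc of radius v*T = R about the center, i.e. the whole of C
    dists = [dist_to_disc(q, x, v * T) for q in points]
    rho_hat = max(dists)
    n_max = sum(1 for d in dists if d >= rho_hat - tol)
    return rho_hat, n_max, len(points)
```

In this sampled picture \(\hat{\rho}_T\) vanishes and every sampled point of the action set of \(E\) is a maximizer, so the mapping \(\mathfrak{M}\) is not single-valued there.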

Leningrad State University
named after A. A. Zhdanov

Received 2 III 1967

CITED LITERATURE

1. L. S. Pontryagin, UMN, 21, issue 4, 130 (1966).
2. R. Isaacs, Differential Games, N. Y., 1965.
3. N. N. Krasovskii, PMM, 27, issue 2 (1963).
4. L. A. Petrosyan, Dokl. AN ArmSSR, 42, issue 5 (1966).
5. L. A. Petrosyan, N. V. Murzov, DAN, 172, No. 6 (1967).
6. L. A. Petrosyan, Dokl. AN ArmSSR, 42, issue 6 (1966).
7. J. Littlewood, Mathematical Miscellany, Moscow, 1962.
