Regularization of a certain pursuit problem
V. E. Tret'yakov
Submitted 1967-01-01 | RussiaRxiv: ru-196701.87226 | Translated from Russian

Abstract

The problem of the minimax time $T$ until encounter with respect to a subset of selected coordinates is considered for two linear controllable objects described by identical equations:

\begin{gather}
\dot{y}=Ay+Bu,\quad\dot{z}=Az+Bv,\notag\\
y_{i_k}(\tau+T^0)=z_{i_k}(\tau+T^0),\quad T^0=\min_u\max_vT_{u,v}\tag{1}\label{1}\\
(i=1,\dots,n;\ k=1,\dots,m\le n).\notag
\end{gather}

It is assumed that the control actions are constrained by integral conditions of the form:

\begin{equation}
\int_\tau^\infty|u(t)|^2\,dt\le\mu^2(\tau),\quad\int_\tau^\infty|v(t)|^2\,dt\le\nu^2(\tau).\tag{2}\label{2}
\end{equation}

Under constraints \eqref{2}, problem \eqref{1} has no solution. It is proven that the problem can be regularized: by slightly increasing the pursuer's resource $\mu(\tau)$, an optimal $\mathcal{R}$-strategy is constructed in effective form, and this strategy ensures a result arbitrarily close to $\inf_u\sup_v T$.

Full Text

Preamble

This work continues the investigation of differential games and control problems under uncertainty, building upon the foundations established in [1–6]. We consider the motion of two controlled objects, $y(t)$ and $z(t)$, whose dynamics are governed by the following systems of differential equations:
$$ \dot{y} = Ay + Bu \quad (1.1) $$
$$ \dot{z} = Az + Bv \quad (1.2) $$
where $y$ and $z$ are $n$-dimensional state vectors, and $u = \{u_j\}$ and $v = \{v_j\}$ are the control vectors of the pursuer and the evader, respectively. The control constraints are defined by the integral norms (see [3], p. 209):
$$ \int_{\tau}^{\vartheta} |u(t)|^2 dt \le \mu^2(\tau), \quad \int_{\tau}^{\vartheta} |v(t)|^2 dt \le \nu^2(\tau) \quad (1.3) $$
where $\mu(\tau)$ and $\nu(\tau)$ represent the available resource capacities at time $\tau$. The objective of the pursuer $y$ is to achieve a state $y(\vartheta) = z(\vartheta)$ at some terminal time $\vartheta$, while the evader $z$ attempts to prevent this encounter or maximize the distance at the terminal time.
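
The integral constraint (1.3) can be checked numerically for a given control. The following Python sketch is only an illustration: the function names `energy` and `remaining_resource` are hypothetical, and the depletion rule $\mu^2(t) = \mu^2(\tau) - \int_\tau^t |u(s)|^2\,ds$ used in it is an assumption suggested by the phrase "available resource capacities", not a formula taken from the text.

```python
import math

def energy(u, t0, t1, steps=1000):
    """Approximate the integral of |u(t)|^2 over [t0, t1] (composite Simpson rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (t1 - t0) / steps

    def sq(t):
        # squared Euclidean norm of the control vector at time t
        return sum(c * c for c in u(t))

    total = sq(t0) + sq(t1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * sq(t0 + i * h)
    return total * h / 3.0

def remaining_resource(mu, u, t0, t1):
    """Resource left at t1 under the assumed depletion rule, or None if mu^2 is exceeded."""
    left = mu * mu - energy(u, t0, t1)
    return math.sqrt(left) if left >= 0.0 else None
```

For example, the constant control $u(t) = (1, 0)$ on $[0, 1]$ spends one unit of energy, so a pursuer with $\mu = 2$ is left with $\sqrt{3}$, while a pursuer with $\mu = 0.5$ violates (1.3).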

Let the current state at time $\tau$ be given by $y(\tau)$ and $z(\tau)$. We define the miss at the terminal time $\vartheta$ as the vector $x(\vartheta) = y(\vartheta) - z(\vartheta)$. The optimal strategies $u[t]$ and $v[t]$ are sought in the class of feedback controls:
$$ u[t] = u[y(t), z(t), \mu(t), \nu(t)], \quad v[t] = v[y(t), z(t), \mu(t), \nu(t)] \quad (1.4) $$
The game is considered over the interval $[\tau, \vartheta]$. Following the methods in [5], we assume the existence of a value function $G(y, z, \mu, \nu, \vartheta)$ that satisfies the corresponding Hamilton-Jacobi-Isaacs equations.

1. Optimal Pursuit Strategies

For the linear systems (1.1) and (1.2), the predicted terminal states at time $\vartheta$ can be expressed using the fundamental transition matrices $X(t)$ and $Z(t)$. Let $s = \text{rank } K > m$, where $K$ is the controllability matrix. The pursuer's strategy is determined by the resource constraint (1.3) and the requirement to minimize the norm of the terminal difference $|x(\vartheta)|$. We define the auxiliary function:
$$ H(m, \vartheta - t) = X(\vartheta - t)B \quad (1.8) $$
The optimal control $u^0(t)$ that minimizes the energy functional while reaching the target is given by:
$$ u^0(t) = H^T(m, \vartheta - t) D^{-1}(\tau, \vartheta) x(\tau) \quad (1.9) $$
where $D(\tau, \vartheta)$ is the Gramian matrix:
$$ D(\tau, \vartheta) = \int_{\tau}^{\vartheta} H(m, \vartheta - t) H^T(m, \vartheta - t) dt \quad (1.10) $$
The condition for successful capture at time $\vartheta_0$ is defined by the equality of the effective resources:
$$ \mu^2(\tau) - \nu^2(\tau) = x^T(\tau) D^{-1}(\tau, \vartheta_0) x(\tau) \quad (1.17) $$
where $x(\tau) = y(\tau) - z(\tau)$ represents the current discrepancy in the predicted terminal positions. If such a $\vartheta_0$ exists, the pursuer can guarantee capture regardless of the evader's actions, provided the evader's total resource does not exceed $\nu(\tau)$.
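
As an illustration of (1.9), (1.10), and (1.17), the following Python sketch treats a single coordinate of a double integrator with $\tau = 0$ and $v \equiv 0$. This special case is an assumption of the sketch: there $H(\vartheta - t) = \vartheta - t$, the Gramian is $D = \vartheta^3/3$, and the predicted terminal miss is $p = x_1 + x_3\vartheta$. The names `gramian` and `min_energy_pursuit` are hypothetical. The minus sign in the control is also an assumed convention, chosen so that the pursuer cancels the miss $x = y - z$.

```python
def gramian(T, steps=2000):
    """Simpson approximation of D = integral_0^T (T - t)^2 dt (closed form: T^3 / 3)."""
    h = T / steps
    total = T * T  # endpoint values: (T - 0)^2 + (T - T)^2
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * (T - i * h) ** 2
    return total * h / 3.0

def min_energy_pursuit(x1, x3, T, steps=20000):
    """Simulate the miss dynamics x1' = x3, x3' = u with v = 0.

    Returns (position miss at time T, energy spent by the pursuer).
    """
    D = gramian(T)
    p = x1 + x3 * T  # predicted terminal miss if u = v = 0
    h = T / steps
    pos, vel, spent = x1, x3, 0.0
    for i in range(steps):
        t = (i + 0.5) * h                 # midpoint of the current step
        u = -(T - t) * p / D              # assumed sign convention for x = y - z
        pos += vel * h + 0.5 * u * h * h  # exact update for piecewise-constant u
        vel += u * h
        spent += u * u * h
    return pos, spent
```

With $x_1 = 1$, $x_3 = 0$, and $\vartheta = 1$, the simulated miss at the terminal time is close to zero and the energy spent is close to $p^2/D = 3$, which is the right-hand side of (1.17) in this scalar case.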

2. Numerical Example and Simulation

Consider a specific case where the dynamics are defined by second-order integrators:
$$ \dot{y}_1 = y_3, \quad \dot{y}_3 = u_1; \quad \dot{y}_2 = y_4, \quad \dot{y}_4 = u_2 \quad (2.1) $$
$$ \dot{z}_1 = z_3, \quad \dot{z}_3 = v_1; \quad \dot{z}_2 = z_4, \quad \dot{z}_4 = v_2 \quad (2.2) $$
The initial conditions at $t=0$ are set as $z_i(0) = 0$, $y_1(0) = y_{10}$, and $y_3(0) = y_{30}$. The evader's strategy $v(t)$ is assumed to be zero for $t > t^*$. Under these conditions, the optimal pursuit time $T^0$ is found by solving equation (2.4), which is cubic in $T$:
$$ \xi^2 T^3 - 3(x_1 + x_3 T)^2 - 3(x_2 + x_4 T)^2 = 0 \quad (2.4) $$
where $\xi^2 = \mu^2 - \nu^2$. Numerical results for a specific set of parameters (2.11) show that the pursuer successfully reduces the distance to zero at $T^0 = 0.5$. The behavior of the value function and the switching surfaces are illustrated in [FIGURE: 1].
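
Equation (2.4) is (1.17) written for the double integrators (2.1) and (2.2): the Gramian of each position coordinate is $T^3/3$, and the predicted position misses are $x_1 + x_3 T$ and $x_2 + x_4 T$. A minimal Python sketch for finding the smallest positive root is given below. The function name `capture_time`, the search horizon `t_max`, and the scan-then-bisect scheme are assumptions of this sketch and are not taken from the text. A pair of roots lying closer together than the scan step may be missed.

```python
def capture_time(x, xi2, t_max=100.0, scan=10000, tol=1e-12):
    """Smallest positive root T of xi2*T^3 - 3(x1+x3*T)^2 - 3(x2+x4*T)^2 = 0.

    Returns None if no sign change is found on (0, t_max].
    """
    x1, x2, x3, x4 = x
    if x1 == 0.0 and x2 == 0.0:
        return 0.0  # the positions already coincide

    def f(T):
        return xi2 * T ** 3 - 3.0 * (x1 + x3 * T) ** 2 - 3.0 * (x2 + x4 * T) ** 2

    step = t_max / scan
    lo = 0.0  # f(0) = -3(x1^2 + x2^2) < 0 here
    for i in range(1, scan + 1):
        hi = i * step
        if f(hi) >= 0.0:
            # f(lo) < 0 <= f(hi): refine the bracket by bisection
            for _ in range(200):
                mid = 0.5 * (lo + hi)
                if f(mid) >= 0.0:
                    hi = mid
                else:
                    lo = mid
                if hi - lo < tol:
                    break
            return hi
        lo = hi
    return None
```

For $x = (1, 0, 0, 0)$ and $\xi^2 = 3$ the equation reduces to $3T^3 - 3 = 0$, so the routine returns $T \approx 1$. For $\xi^2 = 0$ the left-hand side stays negative and the routine returns `None`, which corresponds to the case where no capture time exists.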

3. Stability and Approximation

In practical applications, the exact values of the resources $\mu$ and $\nu$ may be known only with some error $\epsilon(\tau)$. We introduce a modified strategy $u_\epsilon[t]$ that accounts for these perturbations. Let $\delta = \mu - \nu - \epsilon$. We define the regularized control law:
$$ u_\epsilon[t] = R[\epsilon, \xi] u^0[t] \quad (4.11) $$
where $R[\epsilon, \xi]$ is a smoothing operator that prevents singularities as the resources are depleted. As shown in (4.19), when $\epsilon \to 0$, the regularized strategy converges to the optimal strategy $u^0$. The stability of this approach is verified by constructing a Lyapunov-like function $V(\epsilon, \xi)$ and demonstrating that $dV/dt < 0$ along the trajectories of the system.

4. Conclusion

The proposed feedback control laws (1.19) and (1.21) provide an effective mechanism for pursuit in linear differential games with integral constraints. The inclusion of a regularization parameter $\epsilon$ ensures the robustness of the control system against measurement noise and estimation errors in the resource levels. Further research will focus on extending these results to systems with non-linear dynamics and state constraints.

References

  1. Pontryagin, L. S. On the theory of differential games. Uspekhi Mat. Nauk, 21(4):219–274, 1966.
  2. Isaacs, R. Differential Games. New York: Wiley, 1965.
  3. Krasovskii, N. N. On the problem of pursuit in the case of linear systems. PMM, 30(2):209–225, 1966.
  4. Krasovskii, N. N. and Osipov, Yu. S. On the theory of differential games. Doklady Akad. Nauk SSSR, 173(2):285–287, 1967.
  5. Pshenichnyi, B. N. On the pursuit problem. Kibernetika, (4):3–13, 1965.
  6. Kurzhanskii, A. B. On the approximation of optimal controls in a pursuit problem. Differentsial'nye Uravneniya, 2(5):587–599, 1966.
