UDC 518.9
MATHEMATICS
Submitted 1969-01-01 | RussiaRxiv: ru-196901.98522 | Translated from Russian

Yu. A. FLEROV

MULTILEVEL DYNAMIC GAMES

(Presented by Academician A. A. Dorodnitsyn, 30 XII 1968)

Clarifying the fundamental possibilities for controlling complex systems is an urgent problem. Examples of complex systems abound in physiology, automata theory, planning problems, and economic regulation (within a country or a large branch of industry). The most important feature of such systems is their multilevel character and their evident decomposition into more or less autonomous subsystems. In mathematical formalization, this feature is already reflected in decomposition methods in linear programming \((^1)\), two-level game models of planning (the “Hungarian method” \((^2)\)), the interaction of automata \((^3)\), and automata with variable structure \((^4)\).

Multilevel dynamic games (m.d.g.) provide a general concept that makes it possible to formalize the autonomy of subsystems and the hierarchical structure of the entire system.

  1. The tree of an m.d.g. is a finite tree \(D\) with a distinguished vertex \(O\). The vertices of \(D\) are called positions. For positions \(k\) and \(l\) of \(D\), we write \(l < k\) if \(l \ne k\) and \(k\) lies on the non-self-intersecting path connecting \(O\) with \(l\). The rank \(R(k)\) of a position \(k\) is the number of positions \(l\) such that \(k \le l\): \(R(k)=|\{l \in D:\ k \le l\}|\). The set \(U_r=\{k \in D: R(k)=r\}\) of positions of one rank is the level of rank \(r\), or simply the \(r\)-th level. A position \(l \in D\) is subordinate to a position \(k \in D\) if \(l < k\) and there is no position \(m \in D\) such that \(l < m < k\).
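The definitions of rank, level, and subordination can be checked on a small example. The following Python sketch assumes a hypothetical five-position tree stored as a map from each position to the position directly above it; the names `PARENT`, `rank`, `levels`, and `subordinates` are illustrative and do not come from the paper. The rank is computed as the number of positions on the path from \(k\) up to \(O\), inclusive, so that \(R(O)=1\).

```python
from collections import defaultdict

# Hypothetical tree: each position maps to the position directly above it.
# "O" is the distinguished vertex and has no position above it.
PARENT = {"O": None, "a": "O", "b": "O", "c": "a", "d": "a"}


def rank(k, parent=PARENT):
    """R(k): number of positions l with k <= l (k itself and all positions above it)."""
    r = 0
    while k is not None:
        r += 1
        k = parent[k]
    return r


def levels(parent=PARENT):
    """U_r: positions grouped by rank r."""
    out = defaultdict(set)
    for k in parent:
        out[rank(k, parent)].add(k)
    return dict(out)


def subordinates(k, parent=PARENT):
    """Positions l < k with no position strictly between l and k."""
    return {l for l, p in parent.items() if p == k}
```

In this example the levels are \(U_1=\{O\}\), \(U_2=\{a,b\}\), \(U_3=\{c,d\}\), and the positions subordinate to \(a\) are \(c\) and \(d\).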

Let us describe the structure of a position \(k \in D\). With each position we associate a game \(\Gamma^{(k)}=\langle I^{(k)};\ \{\gamma_i\}^{(k)},\ i=1,2,\ldots,n_k\rangle\), specified by the set of players \(I^{(k)}\) and the set of possible states of the game \(\{\gamma_i\}^{(k)}\). To a state
\[ \gamma_i=\langle \{X_j^{(k)}(i),\ j \in I^{(k)}\};\ \{h_j^{(k)}(i),\ j \in I^{(k)}\}\rangle \]
of the game \(\Gamma^{(k)}\) there correspond the set of pure strategies \(X_j^{(k)}(i)\) of each player \(j \in I^{(k)}\) and his payoff function \(h_j^{(k)}(i)\), which maps the set of possible situations in pure strategies
\[ \bar{x}=(x_{j_1},\ldots,x_{j_{r_k}})\in \prod_{j\in I^{(k)}} X_j^{(k)}(i) \]
into the set of real numbers.
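As an illustration of this structure, one state \(\gamma_i\) of a finite game can be represented as a record holding the strategy sets and payoff functions of the players. The sketch below is only one possible encoding; the class `State`, the player labels, and the matching-pennies payoffs are assumptions introduced for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple

Situation = Tuple[str, ...]  # one pure strategy per player, in player order


@dataclass
class State:
    """One state gamma_i of the game at a position (hypothetical encoding)."""
    strategies: Dict[str, List[str]]                 # X_j(i) for each player j
    payoff: Dict[str, Callable[[Situation], float]]  # h_j(i) for each player j

    def situations(self) -> List[Situation]:
        """All situations in pure strategies: the product of the sets X_j(i)."""
        return list(product(*self.strategies.values()))


# Hypothetical two-player state: matching pennies.
pennies = State(
    strategies={"p1": ["H", "T"], "p2": ["H", "T"]},
    payoff={
        "p1": lambda x: 1.0 if x[0] == x[1] else -1.0,
        "p2": lambda x: -1.0 if x[0] == x[1] else 1.0,
    },
)
```

Calling `pennies.situations()` enumerates the four situations \(\bar{x}\), and each payoff function maps a situation to a real number.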

  2. An m.d.g. proceeds in discrete time \((t=0,1,2,\ldots)\), and at each moment games are played in all positions of one level. Suppose that at time \(t\) games take place in the positions of the level of rank \(r\), which is not the level of maximal rank. Then the pure strategies chosen by the players of each position determine their payoffs and, in addition, the probabilities of transition to new states of the games in the positions of rank \(r+1\) subordinate to the given position. For each position these probabilities are given by a function \(P_{kl}(\bar{x}^{(k)};\gamma_i^{(l)})\), defined for all positions \(l\) of level \(r+1\) subordinate to a position \(k\) of level \(r\), which for fixed \(\bar{x}^{(k)}\) is a probability distribution on the set of states \(\{\gamma_i\}^{(l)} \cup \{\gamma^{(0)}\}\). Transition of a game to the state \(\gamma^{(0)}\) means its termination. We require that the infimum, over all variables and all positions, of the probability of termination of the game be greater than \(\alpha\), \(\alpha>0\).
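The forward transition can be sketched in code as follows. The function `transition_law` stands in for a hypothetical \(P_{kl}\); its state names, probabilities, and the parameter `stop` are assumptions made for illustration. Since the terminal state always receives probability `stop = 0.1`, the requirement above is met for any \(\alpha < 0.1\).

```python
import random

TERMINATED = "gamma0"  # stands for the terminal state gamma^(0)


def transition_law(situation, stop=0.1):
    """Hypothetical P_kl: for a fixed situation in position k, a probability
    distribution over the states of a subordinate position l."""
    # Assumption for illustration: matching choices make state "g1" more likely.
    p1 = 0.6 if situation[0] == situation[1] else 0.3
    return {"g1": p1, "g2": 1.0 - stop - p1, TERMINATED: stop}


def sample_state(dist, rng):
    """Draw one state from a {state: probability} distribution."""
    states = list(dist)
    return rng.choices(states, weights=[dist[s] for s in states], k=1)[0]
```

A call such as `sample_state(transition_law(("H", "H")), random.Random(0))` returns either a new state of the subordinate game or the terminal state.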

A multilevel dynamic game begins at the level of rank 1 (position \(O\)) and continues in the manner described above until the level of maximal rank is reached (the forward course of the play). Then the reverse course of the play begins: positions of maximal rank determine new states of the games of the higher positions by means of a function \(O(\bar{x}^{(k_1)}, \bar{x}^{(k_2)}, \ldots, \bar{x}^{(k_j)}, \psi_l^t)\), which depends on the state of the game in the given higher position \(l\) and on the situations \(\{\bar{x}^{(k)}\}\) in all positions subordinate to \(l\), and which determines a probability distribution on the set of possible states of the game of position \(l\). The play ends when position \(O\) is reached. A multilevel dynamic game consists of infinitely repeated plays and terminates when the game terminates in at least one position. A player’s payoff accumulates over the course of the game.
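The termination requirement is what keeps the accumulated payoff finite. As a sketch, assume that the tree has at least two levels, so that every play contains at least one forward transition; that each player receives at most one stage payoff per play; and that all payoff functions are bounded in absolute value by a constant \(M\), as is the case when all states are finite games. The symbols \(M\), \(n\), and \(H_j\) (the accumulated payoff of player \(j\)) are introduced here for illustration only. Since every forward transition terminates the game with probability at least \(\alpha\), whatever the history,
\[ \Pr\{\text{at least } n \text{ plays are completed}\} \le (1-\alpha)^n, \qquad \mathbf{E}\,|H_j| \le \sum_{n=0}^{\infty} M(1-\alpha)^n = \frac{M}{\alpha}. \]
For example, with \(M=1\) and \(\alpha=0.1\) the expected accumulated payoff of each player does not exceed 10 in absolute value.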

Of greatest interest are the following three classes: 1) cyclic games: the forward course of the play coincides with that described earlier, but after positions of maximal rank are reached, all positions that have no subordinates determine a new state of position \(O\), and the play ends; 2) simultaneous games: there is neither a forward nor a reverse course of the play; the games of all positions take place simultaneously; 3) sequential games: after the level of maximal rank is reached, a new state of the games of the higher level is determined, and so on until position \(O\) is reached. Multilevel dynamic games generalize the construction of stochastic games proposed by Shapley \((^5)\).

  3. Let us define the set of admissible strategies of players in a multilevel dynamic game. For simplicity we restrict ourselves to the case in which all states of all positions are finite games. A pure strategy of the player of position \(k\) is a function \(F_{kt}(\gamma_0, x_0, \gamma_1, x_1, \ldots, \gamma_{t-1}, x_{t-1}, \gamma_t)\), where \(\gamma_t\) is the state of the game at step \(t\); \(x_t\) is the player’s choice at time \(t\); \(F_{kt} \in \{0,1\}\); \(\sum_k F_{kt} = 1\); \(F_{kt} = 1\) if the player chooses pure strategy \(k\) at time \(t\). A mixed strategy is a probability distribution on the set of pure strategies. A behavior strategy \((^6)\) is characterized by the property that the player performs an independent randomization of his actions at each step instead of making a random choice of a pure strategy.

A behavior strategy is defined by a set of functions
\[ F_{kt}(\gamma_0, x_0, \gamma_1, x_1, \ldots, \gamma_{t-1}, x_{t-1}, \gamma_t), \quad F_{kt} \ge 0,\quad \sum_k F_{kt}=1. \]
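A behavior strategy of this form can be sketched as a function from the observed history to a probability distribution over the current actions. In the Python sketch below, the history encoding, the action labels `"H"` and `"T"`, and the particular rule inside `behavior_strategy` are assumptions chosen only to show the shape of the object; `is_distribution` checks the conditions \(F_{kt} \ge 0\), \(\sum_k F_{kt}=1\).

```python
import random


def behavior_strategy(history):
    """Hypothetical F_t(gamma_0, x_0, ..., gamma_t): distribution over actions.

    `history` is the list [gamma_0, x_0, gamma_1, x_1, ..., gamma_t]: states sit
    at even indices, the player's own past choices at odd indices.
    """
    past_choices = history[1::2]
    # Assumption for illustration: lean toward the action chosen more often so far.
    bias = 0.1 * (past_choices.count("H") - past_choices.count("T"))
    p_h = min(0.9, max(0.1, 0.5 + bias))
    return {"H": p_h, "T": 1.0 - p_h}


def is_distribution(dist, tol=1e-12):
    """Check F >= 0 and sum F = 1."""
    return all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) <= tol


def act(strategy, history, rng):
    """Independent randomization at the current step."""
    dist = strategy(history)
    actions = list(dist)
    return rng.choices(actions, weights=[dist[a] for a in actions], k=1)[0]
```

A pure strategy is the special case in which every returned distribution puts probability 1 on a single action.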

A multilevel dynamic game is a game with perfect recall \((^{6,7})\), and Aumann’s theorem \((^7)\) on the equivalence of strategies is applicable to such games.

Theorem 1. In a multilevel dynamic game, every mixed strategy has an equivalent behavior strategy.

Within the class of behavior strategies it is natural to single out stationary behavior strategies: behavior strategies that depend only on the current state of the game in which the player participates. That is, at any moment of time, in one and the same state, the player uses the same probability distribution over his actions.
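In code, stationarity means that the strategy ignores everything in the history except its last entry. The sketch below assumes a history encoded as a list whose last element is the current state; `make_stationary`, the state names, and the table of distributions are hypothetical.

```python
def make_stationary(table):
    """Build a behavior strategy that depends only on the current state.

    `table` maps each state to a fixed distribution over actions.
    """
    def strategy(history):
        current_state = history[-1]  # everything before the last entry is ignored
        return table[current_state]
    return strategy


# Hypothetical stationary strategy over two states.
sigma = make_stationary({
    "g1": {"H": 0.5, "T": 0.5},
    "g2": {"H": 1.0, "T": 0.0},
})
```

Two histories that end in the same state, such as `["g1", "H", "g2"]` and `["g2"]`, yield the same distribution under `sigma`.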

Theorem 2. In simultaneous and cyclic multilevel dynamic games, every mixed strategy has an equivalent stationary behavior strategy.

  4. Let us examine the existence of equilibrium points \((^{8-10})\) in multilevel dynamic games.

Theorem 3. Cyclic and simultaneous multilevel dynamic games in which the states of the games of each position are finite games or infinite games with a compact strategy space for each player and a continuous payoff function have an equilibrium point in the class of stationary behavior strategies of the players.

The theorem is proved by reducing the multilevel dynamic game to a stochastic game with the set of players \(\bar{I} = \bigcup_{k \in D} I^{(k)}\) and by using the results of \((^{7,8})\) on the existence of equilibrium points in stochastic \(n\)-person games.

Theorem 4. In a sequential multilevel dynamic game there exists an equilibrium point in the class of stationary behavior strategies of the players.

The proof of this theorem consists in constructing the positional form of the multilevel dynamic game and the corresponding stochastic finitary form \((^{11})\), after which Isbell’s theorem \((^{11})\) on the existence of equilibrium situations in stationary strategies is used.

Let us note that the proofs of Theorems 3 and 4 are nonconstructive.

Computing Center
Academy of Sciences of the USSR
Moscow

Received 30 XII 1968

CITED LITERATURE

\(^{1}\) E. G. Golshtein, D. B. Yudin, New Directions in Linear Programming, Moscow, 1966.
\(^{2}\) J. Kornai, Th. Lipták, Econometrica, 33, No. 1, 141 (1965).
\(^{3}\) V. I. Bryzgalov, I. I. Pyatetskii-Shapiro, M. L. Shik, DAN, 160, No. 5, 1039 (1965).
\(^{4}\) G. A. Agasandyan, in the collection Investigations in the Theory of Self-Tuning Systems, Transactions of the Computing Center of the Academy of Sciences of the USSR, Moscow, 1967, p. 138.
\(^{5}\) L. S. Shapley, Proc. Nat. Acad. Sci. U.S.A., 39, No. 10, 1095 (1953).
\(^{6}\) H. W. Kuhn, in the collection Positional Games, Moscow, 1967, p. 13.
\(^{7}\) R. J. Aumann, ibid., p. 251.
\(^{8}\) J. Nash, in the collection Matrix Games, Moscow, 1961, p. 105.
\(^{9}\) A. M. Fink, J. Sci. Hiroshima Univ., Ser. A-1, 28, 89 (1964).
\(^{10}\) M. Takahashi, ibid., p. 95.
\(^{11}\) J. R. Isbell, in the collection Positional Games, Moscow, 1967, p. 132.
