MATHEMATICS

I. V. ROMANOVSKII

Submitted 1964-01-01 | RussiaRxiv: ru-196401.02532 | Translated from Russian

ASYMPTOTIC BEHAVIOR OF DYNAMIC PROGRAMMING PROCESSES WITH A CONTINUOUS SET OF STATES

(Presented by Academician V. I. Smirnov on February 13, 1964)

1°. The results set forth in the present note continue the investigations carried out for the case of a finite number of states \((^1)\). They concern the following dynamic programming problem.

Consider a process controlled at discrete moments of time \(t = 0, 1, \ldots, T - 1\), whose state is characterized by a point of a convex closed set \(D\) in \(r\)-dimensional Euclidean space. At each step of the process a transition is possible from any state \(\mathbf{x}\) to any state \(\mathbf{y}\). Under such a transition we receive an income \(K(\mathbf{x}, \mathbf{y}) \geq 0\). At the end of the process, at the point \(\mathbf{x}\), we receive an additional income \(\chi(\mathbf{x})\).

It is required, knowing the initial state of the process \(\mathbf{x}_0\) and the duration of control \(T\), to find a sequence of states \(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_T\) which would maximize the total income
\[ \sum_{i=0}^{T-1} K(\mathbf{x}_i, \mathbf{x}_{i+1}) + \chi(\mathbf{x}_T). \]

We shall assume the functions \(K(\mathbf{x}, \mathbf{y})\) and \(\chi(\mathbf{x})\) to be continuous.

Denoting the desired maximum for \(\mathbf{x}_0 = \mathbf{x}\) and \(T = t\) by \(f_t(\mathbf{x})\), we obtain in the usual way the following recurrence relation:
\[ f_t(\mathbf{x}) = \max_{\mathbf{y}} [K(\mathbf{x}, \mathbf{y}) + f_{t-1}(\mathbf{y})], \quad t \geq 1; \quad f_0(\mathbf{x}) = \chi(\mathbf{x}). \tag{1} \]

Theorem 1. There exist constants \(\lambda\) and \(L\) such that
\[ |f_t(\mathbf{x}) - t\lambda| \leq L \tag{2} \]
for all \(t\) and \(\mathbf{x}\).
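Recurrence (1) and the bound (2) can be tried out numerically once \(D\) is replaced by a finite grid. The following is only a minimal sketch under assumptions that are not part of the note: the grid, the function name `value_iteration`, and the sample income \(K(x, y) = 2 + x - y - (x - 1/2)^2 - (y - 1/2)^2\) on \(D = [0, 1]\) with \(\chi = 0\) are chosen purely for illustration.

```python
import numpy as np

def value_iteration(K, chi, T):
    """Return [f_0, ..., f_T] for recurrence (1) on a finite grid.

    K[i, j] is the income of the transition grid[i] -> grid[j];
    chi[j] is the terminal income at grid[j].
    """
    f = [np.asarray(chi, dtype=float)]
    for _ in range(T):
        # f_t(x) = max_y [K(x, y) + f_{t-1}(y)]
        f.append(np.max(K + f[-1][None, :], axis=1))
    return f

# Hypothetical example (an assumption, not taken from the note).
grid = np.linspace(0.0, 1.0, 21)
X, Y = np.meshgrid(grid, grid, indexing="ij")
K = 2.0 + X - Y - (X - 0.5) ** 2 - (Y - 0.5) ** 2
chi = np.zeros_like(grid)

f = value_iteration(K, chi, T=50)
lam = 2.0  # for this K the best one-step cycle sits at x = 1/2 with income 2
# Largest value of |f_t(x) - t * lambda| over all t <= 50 and all grid points.
spread = max(np.max(np.abs(f_t - t * lam)) for t, f_t in enumerate(f))
print(spread)
```

For this particular income the spread stays at about 0.5 however long the horizon is, which is the behavior that (2) describes; a different \(K\) or a coarser grid would give a different constant \(L\).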

Put
\[ K_1(\mathbf{x}, \mathbf{y}) = K(\mathbf{x}, \mathbf{y}), \]
\[ K_n(\mathbf{x}, \mathbf{y}) = \max_{\mathbf{z}} [K_1(\mathbf{x}, \mathbf{z}) + K_{n-1}(\mathbf{z}, \mathbf{y})]; \tag{3} \]
\[ a_n = \frac{1}{n} \max_{\mathbf{x}} K_n(\mathbf{x}, \mathbf{x}). \tag{4} \]

Theorem 2.
\[ \lambda = \sup_n a_n = \lim_{n \to \infty} a_n. \tag{5} \]

The quantity \(a_n\) is equal to the maximum average income on a cycle of length \(n\) (cf. \((^1)\)), and Theorem 2 says that \(\lambda\) is equal to the least upper bound of the average incomes on cycles.
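For a finite set of states, definitions (3) and (4) reduce to max-plus matrix powers, so the quantities \(a_n\) can be computed directly. The sketch below is an illustration under assumptions: the 3-state income matrix and the helper names `maxplus_product` and `cycle_means` are hypothetical and do not come from the note.

```python
import numpy as np

def maxplus_product(A, B):
    """(A * B)[i, j] = max_k (A[i, k] + B[k, j]), the finite analogue of (3)."""
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def cycle_means(K, n_max):
    """Return [a_1, ..., a_{n_max}] from (4) for a finite income matrix K."""
    means, K_n = [], K.copy()
    for n in range(1, n_max + 1):
        # a_n = (1/n) * max_x K_n(x, x)
        means.append(float(np.max(np.diag(K_n))) / n)
        K_n = maxplus_product(K, K_n)  # K_{n+1} from K_1 and K_n
    return means

# Hypothetical 3-state example in which the best cycle has length 2.
K = np.array([[0.0, 4.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 0.5]])
a = cycle_means(K, n_max=6)
print(a)
print(max(a))
```

In the finite case the supremum in (5) is always attained for some \(n\) not exceeding the number of states, because a best cycle can be taken to be elementary. The continuous case treated in the note differs exactly in that this attainment may fail.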

The role of the problem dual to the maximum-cycle problem is played by the following problem:

Find a function \(z(\mathbf{x})\) and a number \(u\) satisfying the condition
\[ z(\mathbf{x}) - z(\mathbf{y}) + u \geq K(\mathbf{x}, \mathbf{y}) \quad \text{for all } \mathbf{x} \text{ and } \mathbf{y}. \tag{6} \]

Theorem 3. There exist \(z(\mathbf{x})\), \(u\) for which the quantity \(u\) attains its minimum value, and this minimum is equal to \(\lambda\).
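One half of Theorem 3 can be checked directly by the standard telescoping argument, which the note does not spell out. Take any pair \(z(\mathbf{x})\), \(u\) satisfying (6) and any cycle \(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n = \mathbf{x}_0\). Summing (6) along the cycle, the terms containing \(z\) cancel:
\[ \sum_{i=0}^{n-1} K(\mathbf{x}_i, \mathbf{x}_{i+1}) \leq \sum_{i=0}^{n-1} \bigl[z(\mathbf{x}_i) - z(\mathbf{x}_{i+1}) + u\bigr] = z(\mathbf{x}_0) - z(\mathbf{x}_n) + nu = nu. \]
Hence \(u \geq a_n\) for every \(n\), and therefore \(u \geq \lambda\) by (5). The substance of Theorem 3 is that this lower bound is attained.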

2°. Simple examples show that even in very simple cases the supremum in expression (5) may fail to be attained. It is of interest to single out one special case in which the selection of optimal cycles (i.e., cycles on which the average income is equal to \(\lambda\)) can be carried out rather simply: the case in which the function \(K(\mathbf{x}, \mathbf{y})\) is concave.

Theorem 4. If the function \(K(\mathbf{x}, \mathbf{y})\) is concave jointly in the variables \(\mathbf{x}\) and \(\mathbf{y}\), then \(\lambda=a_1\).
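One way to see why concavity forces the one-step cycle to be optimal is an averaging argument. It is offered here as a sketch and is not necessarily the author's proof. For a cycle \(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n = \mathbf{x}_0\) in \(D\), put \(\hat{\mathbf{x}} = \frac{1}{n}\sum_{i=0}^{n-1}\mathbf{x}_i\) (the symbol \(\hat{\mathbf{x}}\) is introduced only for this remark). Since \(D\) is convex, \(\hat{\mathbf{x}} \in D\), and since the cycle closes, the average of the points \(\mathbf{x}_{i+1}\) is also \(\hat{\mathbf{x}}\). Joint concavity then gives
\[ \frac{1}{n}\sum_{i=0}^{n-1} K(\mathbf{x}_i, \mathbf{x}_{i+1}) \leq K\Bigl(\frac{1}{n}\sum_{i=0}^{n-1}\mathbf{x}_i,\ \frac{1}{n}\sum_{i=0}^{n-1}\mathbf{x}_{i+1}\Bigr) = K(\hat{\mathbf{x}}, \hat{\mathbf{x}}) \leq \max_{\mathbf{x}} K(\mathbf{x}, \mathbf{x}). \]
Thus the average income on any cycle of length \(n\) does not exceed the best one-step income from (4), and the supremum in (5) is already reached at \(n = 1\).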

Corollary 1. If \(K(\mathbf{x}, \mathbf{y})\) is strictly concave, then the unique optimal cycle is the one-step cycle.

Corollary 2. If, for some \(n\), the function \(K_n(\mathbf{x}, \mathbf{y})\) is concave, then \(\lambda=a_n\).

Corollary 3. If, for some \(n\), the function \(K_n(\mathbf{x}, \mathbf{y})\) is concave, then the set of states that can enter into optimal cycles coincides with the set of states at which the maximum of the function \(K_n(\mathbf{x}, \mathbf{x})\) is attained and, consequently, is a closed convex set.

3°. Theorem 1 establishes that, roughly speaking, the average income per step of an optimal finite path tends, as the length of the path increases, to the maximal average income on a cycle. Here it is natural to expect results analogous to Theorem 2 of \((^1)\) on the approach of the trajectory itself to an optimal cycle. We shall consider the simplest case: the deviation of the trajectory from the unique one-step optimal cycle. We introduce a measure of the deviation of the states of the process from the optimum, possessing natural properties, and show that the sum of the deviations of the states, taken over the entire duration of the process, is bounded above by a quantity independent of the duration of the process.

Let \(K(\mathbf{x}, \mathbf{y})\) be strictly concave jointly in the variables \(\mathbf{x}\) and \(\mathbf{y}\). Let the maximum of the function \(K(\mathbf{x}, \mathbf{x})\) be attained at the point \(\overline{\mathbf{x}}\in D\). This point will be the optimal state of the process. Through the point \((\overline{\mathbf{x}},\overline{\mathbf{x}}, K(\overline{\mathbf{x}},\overline{\mathbf{x}}))\) of \((2r+1)\)-dimensional space draw a supporting hyperplane
\[ z=a_0+\mathbf{a}_1\mathbf{x}+\mathbf{a}_2\mathbf{y} \]
to the set \(\{z\leq K(\mathbf{x},\mathbf{y})\}\). We note that it can be chosen so that \(\mathbf{a}_2=-\mathbf{a}_1\). In what follows it will be assumed that this has been done. Obviously,
\[ a_0+\mathbf{a}_1\mathbf{x}-\mathbf{a}_1\mathbf{y}>K(\mathbf{x},\mathbf{y}) \quad \text{for }(\mathbf{x},\mathbf{y})\ne(\overline{\mathbf{x}},\overline{\mathbf{x}}); \]
\[ a_0+\mathbf{a}_1\overline{\mathbf{x}}-\mathbf{a}_1\overline{\mathbf{x}} =K(\overline{\mathbf{x}},\overline{\mathbf{x}})=\lambda. \]

As a measure of the deviation of the state \(\mathbf{x}\) from \(\overline{\mathbf{x}}\) we take
\[ l(\mathbf{x})=\min_{\mathbf{y}}\bigl[a_0+\mathbf{a}_1\mathbf{x}-\mathbf{a}_1\mathbf{y}-K(\mathbf{x},\mathbf{y})\bigr]. \tag{7} \]

Obviously, from \(l(\mathbf{x}_n)\to 0\) as \(n\to\infty\) it follows that \(\mathbf{x}_n\to\overline{\mathbf{x}}\).

Theorem 5. If the function \(K(\mathbf{x},\mathbf{y})\) is strictly concave jointly in the variables and \(\overline{\mathbf{x}}\) is the optimal state, then the total deviation of the states of a process of length \(T\), under optimal control \(\mathbf{x}_0^T,\mathbf{x}_1^T,\ldots,\mathbf{x}_T^T\), from the state \(\overline{\mathbf{x}}\), equal to
\[ \sum_{i=0}^{T} l(\mathbf{x}_i^T), \]
is bounded above by a constant \(K\) depending only on \(K(\mathbf{x},\mathbf{y})\) and \(\chi(\mathbf{x})\).
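Theorem 5 can be illustrated on a grid by recovering an optimal trajectory from recurrence (1) and summing the deviations along it. This is a sketch under assumptions that are not part of the note: the income \(K(x, y) = 2 + x - y - (x - 1/2)^2 - (y - 1/2)^2\) on \(D = [0, 1]\), the zero terminal income, the grid, and the helper name `optimal_path` are all hypothetical. For this income \(\overline{x} = 1/2\), and the supporting hyperplane has \(a_0 = 2\), \(a_1 = 1\), \(a_2 = -1\).

```python
import numpy as np

def optimal_path(K, chi, T, start):
    """Backward recursion (1) with argmax bookkeeping, then a forward pass.

    Returns the grid indices x_0, ..., x_T of one optimal trajectory.
    """
    f = np.asarray(chi, dtype=float)
    policy = []  # policy[s][i]: best successor of state i with s + 1 steps left
    for _ in range(T):
        total = K + f[None, :]
        policy.append(np.argmax(total, axis=1))
        f = np.max(total, axis=1)
    path = [start]
    for best_next in reversed(policy):  # most steps remaining comes first
        path.append(int(best_next[path[-1]]))
    return path

# Hypothetical strictly concave example (an assumption, not from the note).
grid = np.linspace(0.0, 1.0, 21)
X, Y = np.meshgrid(grid, grid, indexing="ij")
K = 2.0 + X - Y - (X - 0.5) ** 2 - (Y - 0.5) ** 2
chi = np.zeros_like(grid)

# Deviation measure: minimum over y of (supporting plane minus K).
deviation = np.min(2.0 + X - Y - K, axis=1)

totals = {}
for T in (5, 20, 80):
    path = optimal_path(K, chi, T, start=0)
    totals[T] = float(sum(deviation[i] for i in path))
print(totals)
```

In this example the trajectory moves to \(x = 1/2\) at once and leaves it only on the last step, so the total deviation is the same for all three horizons (about 0.5). Other incomes would give other constants, but the theorem asserts that the total does not grow with \(T\).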

A more general result can be obtained in the following form:

Theorem 6. Let the function \(K_N(\mathbf{x},\mathbf{y})\) be concave for some \(N=N_0\), and let \(\overline{\mathbf{x}}\) be the set of those \(\mathbf{x}\) for which \(K_{N_0}(\mathbf{x},\mathbf{x})\) attains its maximum value. The total deviation of the states of a process of length \(T\), under optimal control \(\mathbf{x}_0^T,\mathbf{x}_1^T,\ldots,\mathbf{x}_T^T\), from the set \(\overline{\mathbf{x}}\), equal to
\[ \sum_{i=0}^{T} l(\mathbf{x}_i^T), \]
where
\[ l(\mathbf{x})=\min_{\mathbf{y}}\bigl[a_0+\mathbf{a}_1\mathbf{x}-\mathbf{a}_1\mathbf{y}-K_{N_0}(\mathbf{x},\mathbf{y})\bigr] \quad (l(\mathbf{x})=0 \text{ for } \mathbf{x}\in\overline{\mathbf{x}}), \]

is bounded above by a constant \(K\), depending only on \(K(\mathbf{x}, \mathbf{y})\) and \(\chi(\mathbf{x})\).

4°. We shall now indicate one interesting application of the results presented.

Consider the well-known closed model of developing production with constant technology (\((^2)\), see also \((^3)\)).

Let the state of the economic system be specified by a nonnegative vector \(\mathbf{x}=(x_1,\ldots,x_n)\) of quantities of products \(G_1,\ldots,G_n\). These products are processed by means of technological processes. Each process is specified by a pair of nonnegative vectors \((\mathbf{x},\mathbf{y})\), where \(\mathbf{x}=(x_1,\ldots,x_n)\) is the input vector, and \(\mathbf{y}=(y_1,\ldots,y_n)\) is the output vector. Following \((^3)\), we make the assumption that the set of processes satisfies the following conditions:

If \((\mathbf{x},\mathbf{y})\) is a process, then from \(\mathbf{x}=0\) it follows that \(\mathbf{y}=0\).
The set \(Z\) of all technological processes is a closed convex cone.

We shall enlarge the set \(Z\) to \(\widetilde{Z}\), adding to it all processes of the form \((\mathbf{x},\widetilde{\mathbf{y}})\), where the vector inequalities \(0\leq \widetilde{\mathbf{y}}\leq \mathbf{y}\) hold and \((\mathbf{x},\mathbf{y})\in Z\). Note that under such an extension of the set of technological processes it remains a closed convex cone.

In this model, under the assumption that \(Z\) is a polyhedral cone (we shall not need this assumption), von Neumann studied the maximal growth rate and its connection with various value indicators under an infinite continuation of the production process.

Another use of this model is possible \((^4)\), connected with a finite time of development. Here the problem is posed, for example, as follows.

An initial state of the system \(\mathbf{x}_0\) and the total number of steps in the model \(T\) are given. Also given are nonnegative values of all products at time \(T\), \(\mathbf{c}=(c_1,\ldots,c_n)\). It is required to find such a control of the system that, in the final state of the system, the sum of the values of the products \(\mathbf{c}\mathbf{x}_T\) be maximal.

In view of the homogeneity of the process, we may normalize its states by requiring, for example, that the state of the process at each time be specified by a nonnegative vector \(\vec{\xi}=(\xi_1,\ldots,\xi_n)\), where \(\sum_i \xi_i=1\), and by remembering the “scale” of the state, i.e., the number \(\rho\) by which the vector \(\vec{\xi}\) must be multiplied in order to obtain the true state of the process \(\mathbf{x}\).

Then to every admissible process \((\mathbf{x},\mathbf{y})\in\widetilde{Z}\) we may assign a number \(K(\mathbf{x},\mathbf{y})\), equal to the logarithm of the increase of the scale of the state, \(K=\ln\bigl(\sum_i y_i/\sum_i x_i\bigr)\), and then introduce the function \(K(\vec{\xi},\vec{\eta})\), putting

\[ K(\vec{\xi},\vec{\eta})=\ln \max \sum_i y_i, \]

where the maximum is taken over all such \(\mathbf{y}\) that \((\vec{\xi},\mathbf{y})\in\widetilde{Z}\) and \(\vec{\eta}=\mathbf{y}/\sum y_i\). It is easy to see that this function will be concave.

Further, it is obvious that if we denote by \(f_T(\mathbf{x})\) the logarithm of the maximal value of the products at step \(T\) for the initial state of the process \(\mathbf{x}\), then, using the principle of optimality, we obtain the recurrence relations

\[ f_T(\vec{\xi})=\max_{\vec{\eta}}\,[K(\vec{\xi},\vec{\eta})+f_{T-1}(\vec{\eta})], \]

\[ f_0(\vec{\xi})=\chi(\vec{\xi})=\ln \sum_i c_i\xi_i. \tag{8} \]
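The link between (8) and recurrence (1) can be made explicit by splitting the terminal value into scale and structure. The notation here is introduced only for this remark: write \(\mathbf{x}_t=\rho_t\vec{\xi}_t\), where \(\rho_t\) is the scale at time \(t\) and \(\xi_{t,i}\) are the components of \(\vec{\xi}_t\). Assuming that at each step the largest output compatible with the chosen structure is taken, so that \(\ln(\rho_{t+1}/\rho_t)=K(\vec{\xi}_t,\vec{\xi}_{t+1})\), we get
\[ \ln \mathbf{c}\mathbf{x}_T = \ln\rho_0 + \sum_{t=0}^{T-1}\ln\frac{\rho_{t+1}}{\rho_t} + \ln\sum_i c_i\xi_{T,i} = \ln\rho_0 + \sum_{t=0}^{T-1}K(\vec{\xi}_t,\vec{\xi}_{t+1}) + \chi(\vec{\xi}_T). \]
The term \(\ln\rho_0\) does not depend on the control, so maximizing the terminal value amounts to maximizing a total income of the form considered in 1°, with the states replaced by the normalized structures \(\vec{\xi}_t\).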

The use of the theorems stated above leads to the following assertions*:

I. The maximal growth in von Neumann’s problem is realized on a one-step cycle, and, consequently, there exists a state in which the maximal uniform growth is attained.

II. Under optimal control in the dynamic problem of production planning with an objective function depending only on the terminal state, the structure of the process \(\vec{\xi}_t\) at the intermediate steps of the process approaches, as \(T \to \infty\), the best structure from von Neumann’s problem. Moreover, there exists a constant \(K\), independent of \(T\), such that the sum of appropriately measured deviations of \(\vec{\xi}_t\) from this optimal structure does not exceed \(K\).

III. The average rate of growth in this dynamic problem is asymptotically independent of the vector \(\mathbf{c}\) and approaches the maximal rate of growth in von Neumann’s problem.

Let us note in conclusion that we have made essential use of the closedness of the model. In an open model (see, for example, \((^5)\) and the exposition of this work in \((^6)\)), the question of asymptotics appears more complicated. In particular, there arise “switchings” of structures similar to those obtained in problems of optimal linear control with continuous time \((^7)\).

Leningrad State University
named after A. A. Zhdanov

Received
6 February 1964

CITED LITERATURE

\({}^{1}\) I. V. Romanovskii, Reports at the VII All-Union Conference on Probability Theory and Mathematical Statistics, Tbilisi, 1963.
\({}^{2}\) J. von Neumann, Rev. Econ. Stud., 13, 1, 1 (1945–1946).
\({}^{3}\) D. Gale, Collection Linear Inequalities, IL, 1959.
\({}^{4}\) L. V. Kantorovich, DAN, 115, No. 3, 441 (1951).
\({}^{5}\) R. Bellman, S. Dreyfus, J. Operat. Res. Soc. Japan, 2, 1, 1 (1958).
\({}^{6}\) I. V. Romanovskii, Collection Mathematical Analysis of Expanded Reproduction, Publishing House of the Academy of Sciences of the USSR, 1961.
\({}^{7}\) R. V. Gamkrelidze, DAN, 126, No. 1, 9 (1957).
\({}^{8}\) T. C. Koopmans, Quart. J. Econ., 78, No. 3, 355 (1964).

* Note added in proof. A recent article by T. Koopmans \((^8)\) surveys a number of studies on the same range of questions.
