Mathematics
N. N. Vorob’ev
Extremal Algebra of Matrices
(Presented by Academician V. I. Smirnov on 16 III 1963)
Recently, theoretical and practical interest has arisen in extremal problems of a nontraditional character, in which the set of values taken by the variable being extremized has a rather complicated nature. The most widespread are problems in which this set is either a domain of a multidimensional (indeed, very high-dimensional) vector space, or is discrete and described by relations of a combinatorial character. Problems of the first type are considered in various areas of optimal programming (linear programming, dynamic programming); problems of the second type arise, for example, in graph theory. Connections between problems of the two types are known; the transportation problem in its network formulation is one example.
In view of what has been said, it seems natural to make a systematic study of extremization as a certain operation and of its connection with the basic algebraic operations, addition and multiplication. Such a study has already been undertaken, directly or indirectly, in a number of works (see, for example, \(^{(1-5)}\)).
In the present note an approach to this question is set forth in the spirit of linear algebra. The basic operations on matrices considered here were first studied from an algebraic point of view, apparently, by Pandit \(^{(6)}\). He, however, did not discover the close connection, of the type of duality, between the two extremization operations: maximization and minimization. Therefore he did not manage to go beyond elementary extremal properties of matrices.
1. Extremal operations on matrices. In what follows we shall consider matrices with positive elements. For any matrix \(A\), we shall denote by \(A_i\) the \(i\)-th row of \(A\), by \(A^j\) its \(j\)-th column, and by \(A_{ij}\) the element standing at their intersection.
Put, for \(m \times n\)-matrices \(A\) and \(B\),
\[ (A \,\overline{m}\, B)_{ij}=\max\{A_{ij},B_{ij}\}\quad(\text{maximization}), \]
\[ (A \,\underline{m}\, B)_{ij}=\min\{A_{ij},B_{ij}\}\quad(\text{minimization}), \]
and, for an \(m \times n\)-matrix \(A\) and an \(n \times p\)-matrix \(B\),
\[ (A \,\overline{\circ}\, B)_{ij}=\max_{1\le k\le n} A_{ik}B_{kj}\quad(\text{maximal multiplication}), \]
\[ (A \,\underline{\circ}\, B)_{ij}=\min_{1\le k\le n} A_{ik}B_{kj}\quad(\text{minimal multiplication}). \]
For brevity, we shall carry out arguments that are valid equally for maximization and minimization of matrices using the term extremization and denoting it by the sign \(\widetilde{m}\). Similarly, we shall speak of extremal multiplication of matrices, denoting it by the sign \(\widetilde{\circ}\). Here we use multiplicative terminology as the more customary one. In view of the monotonicity of the logarithm, we may pass, if this proves convenient, to an additive system of notation.
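The four operations just defined can be sketched in a few lines of Python. This is only an illustration under our own conventions: matrices are lists of rows, and the function names `ext_m` and `ext_mul` and the `pick` parameter (either `max` or `min`) are hypothetical names introduced here, not notation from the text.

```python
def ext_m(A, B, pick):
    """Extremization of two m x n matrices: pick=max or pick=min, elementwise."""
    return [[pick(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


def ext_mul(A, B, pick):
    """Extremal product: (A o B)_ij = pick over k of A_ik * B_kj."""
    inner, cols = len(B), len(B[0])
    return [[pick(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


A = [[1, 2], [3, 4]]
B = [[2, 1], [1, 5]]

print(ext_m(A, B, max))    # [[2, 2], [3, 5]]
print(ext_m(A, B, min))    # [[1, 1], [1, 4]]
print(ext_mul(A, B, max))  # [[2, 10], [6, 20]]
print(ext_mul(A, B, min))  # [[2, 1], [4, 3]]
```

Passing `max` or `min` as an argument mirrors the use of the common signs \(\widetilde{m}\) and \(\widetilde{\circ}\) for statements that hold for both extremizations.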
2. Basic properties of extremal operations.
\(1^\circ.\) \(A \,\overline{m}\, A = A.\)
\(2^\circ.\) \(A \,\overline{m}\, B = B \,\overline{m}\, A.\)
\(3^\circ.\) \(A \,\overline{m}\, (B \,\overline{m}\, C) = (A \,\overline{m}\, B) \,\overline{m}\, C.\)
\(4^\circ.\) \(A \,\overline{\circ}\, (B \,\overline{\circ}\, C) = (A \,\overline{\circ}\, B) \,\overline{\circ}\, C.\)
What has been said gives us grounds to introduce the notations
\[ \mathop{\overline{m}}_{k=1}^{n} A^{(k)}, \qquad \mathop{\overline{\circ}}_{k=1}^{n} A^{(k)}, \qquad A^{\overline{\circ}\, n} \]
respectively for the extremization and extremal product of several matrices and for the extremal power of a matrix.
\(5^\circ.\) \(A \,\overline{\circ}\, (B \,\overline{m}\, C) = (A \,\overline{\circ}\, B) \,\overline{m}\, (A \,\overline{\circ}\, C).\)
\(6^\circ.\) \((A \,\overline{m}\, B) \,\overline{\circ}\, C = (A \,\overline{\circ}\, C) \,\overline{m}\, (B \,\overline{\circ}\, C).\)
\(7^\circ.\) \(\lambda (A \,\overline{\circ}\, B) = \lambda A \,\overline{\circ}\, B = A \,\overline{\circ}\, \lambda B \quad (\lambda > 0).\)
\(8^\circ.\) \(\lambda (A \,\overline{m}\, B) = \lambda A \,\overline{m}\, \lambda B \quad (\lambda > 0).\)
\(9^\circ.\) \(A \,\overline{\circ}\, (B \,\underline{\circ}\, C) \leq (A \,\overline{\circ}\, B) \,\underline{\circ}\, C.\)
\(10^\circ.\) \(A \,\underline{\circ}\, (B \,\overline{\circ}\, C) \geq (A \,\underline{\circ}\, B) \,\overline{\circ}\, C.\)
In items \(9^\circ\) and \(10^\circ\), the inequalities are understood elementwise. Let us note that these inequalities reflect the minimax inequality, which is important in game theory.
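A quick numerical check of the distributive law \(5^\circ\) and of the minimax-type inequalities \(9^\circ\) and \(10^\circ\) may help fix ideas. The sketch below uses hypothetical helper names (`ext_m`, `ext_mul`, `leq`) and arbitrarily chosen \(2 \times 2\) integer matrices; it illustrates the properties on one example and proves nothing.

```python
def ext_m(A, B, pick):
    """Elementwise extremization (pick is max or min)."""
    return [[pick(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def ext_mul(A, B, pick):
    """Extremal product: pick over k of A_ik * B_kj."""
    inner, cols = len(B), len(B[0])
    return [[pick(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def leq(A, B):
    """Elementwise comparison A <= B."""
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[1, 3], [2, 1]]
B = [[2, 1], [1, 4]]
C = [[1, 2], [3, 1]]

# Property 5: maximal multiplication distributes over maximization.
prop5 = ext_mul(A, ext_m(B, C, max), max) == ext_m(
    ext_mul(A, B, max), ext_mul(A, C, max), max)

# Property 9: max-product of a min-product is dominated by the regrouped product.
prop9 = leq(ext_mul(A, ext_mul(B, C, min), max),
            ext_mul(ext_mul(A, B, max), C, min))

# Property 10: the dual inequality.
prop10 = leq(ext_mul(ext_mul(A, B, min), C, max),
             ext_mul(A, ext_mul(B, C, max), min))

print(prop5, prop9, prop10)
```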
Since all elements of the matrix \(A\) are positive, one may speak of the matrix \(A^{-}\), extremally inverse to \(A\):
\[ (A^{-})_{ij}=\frac{1}{A_{ji}}. \]
The operation of extremal inversion has the following properties:
\(11^\circ.\) \(A^{--}=A.\)
\(12^\circ.\) \((A \,\overline{\circ}\, B)^{-}=B^{-}\,\underline{\circ}\, A^{-}.\)
\(13^\circ.\) \((A \,\underline{\circ}\, B)^{-}=B^{-}\,\overline{\circ}\, A^{-}.\)
\(14^\circ.\) \((A \,\overline{m}\, B)^{-}=A^{-}\,\underline{m}\, B^{-}.\)
\(15^\circ.\) \((A \,\underline{m}\, B)^{-}=A^{-}\,\overline{m}\, B^{-}.\)
\(16^\circ.\) If \(A \,\overline{\circ}\, B=C\), then \(B \leq A^{-}\,\underline{\circ}\, C\) and \(A \leq C \,\underline{\circ}\, B^{-}.\)
\(17^\circ.\) If \(A \,\underline{\circ}\, B=C\), then \(B \geq A^{-}\,\overline{\circ}\, C\) and \(A \geq C \,\overline{\circ}\, B^{-}.\)
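The extremal inverse and properties \(11^\circ\), \(12^\circ\) and \(16^\circ\) can be checked on a small example. The following is a minimal sketch with exact rational arithmetic; the names `ext_inv`, `ext_mul` and `leq` are our own, and the matrices are chosen arbitrarily.

```python
from fractions import Fraction


def ext_mul(A, B, pick):
    """Extremal product: pick (max or min) over k of A_ik * B_kj."""
    inner, cols = len(B), len(B[0])
    return [[pick(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def ext_inv(A):
    """Extremal inverse: transpose and take reciprocals, (A^-)_ij = 1 / A_ji."""
    rows, cols = len(A), len(A[0])
    return [[Fraction(1) / A[j][i] for j in range(rows)] for i in range(cols)]


def leq(A, B):
    """Elementwise comparison A <= B."""
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[1, 3], [2, 1]]
B = [[2, 1], [1, 4]]
C = ext_mul(A, B, max)

prop11 = ext_inv(ext_inv(A)) == A
prop12 = ext_inv(C) == ext_mul(ext_inv(B), ext_inv(A), min)
prop16 = (leq(B, ext_mul(ext_inv(A), C, min))
          and leq(A, ext_mul(C, ext_inv(B), min)))

print(prop11, prop12, prop16)
```

Property \(16^\circ\) is what makes the inverse useful for equations: it bounds any factor of a known maximal product from above.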
3. Several practical problems formulated in terms of extremal operations.
I. Suppose that \(n\) types of raw material \(C_1,\ldots,C_n\) can be used in any of \(m\) production processes \(\Pi_1,\ldots,\Pi_m\), yielding respectively the finished products \(P_1,\ldots,P_m\). We shall assume that, using raw material \(C_j\) in process \(\Pi_i\), from each unit of \(C_j\), given a sufficient quantity of all the other types of raw material, we obtain \(A_{ij}\) units of product \(P_i\). We shall also assume that, for each process to proceed, the availability of all types of raw material is necessary and sufficient. Determine the minimum quantities \(X_j\) of raw material \(C_j\) such that, by starting any of the processes \(\Pi_i\), one obtains as a result exactly \(B_i\) units of product \(P_i\).
Obviously, the vector \(X\) of the required quantities of raw material must satisfy the equation \(A \,\overline{\circ}\, X=B\).
II. A certain technical system consists of \(n\) units, each of which is characterized by one parameter \(X_j\) \((j=1,\ldots,n)\). Suppose that the system is to operate in any of \(m\) modes, and that the load borne by unit \(j\) under the conditions of mode \(i\) is proportional to the value of the unit’s parameter, with proportionality coefficient \(A_{ij}\).
In order that, in mode \(i\), each unit of the system be capable of withstanding the load \(B_i\), it is necessary that \(A \,\underline{\circ}\, X \geq B\).
III. A certain system may be in one of \(m\) states \(C_1,\ldots,C_m\). Transitions of the system from one state to another may occur at discrete time instants \(1,2,\ldots,n\), and the transition at time \(t\) from \(C_i\) to \(C_j\) requires a cost \(a_{ij}^{(t)}\).
Suppose that at time 0 the system is in state \(C_i\). How should the states of the system be changed at times \(1, 2, \ldots, n\) so that at time \(n\) it is in state \(C_j\) and the total cost is minimal?
Obviously, the minimal total cost is equal to
\[ \log \left( \mathop{\underline{\circ}}_{t=1}^{n} A^{(t)} \right)_{ij}, \quad \text{where } A_{ij}^{(t)} = e^{a_{ij}^{(t)}} . \]
It is clear that, for problems of this kind, the additive notation is preferable.
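In additive notation, problem III becomes a repeated min-plus matrix product of the cost matrices \(a^{(t)}\). The sketch below assumes each cost matrix is given as a list of rows; the function names and the three sample matrices are hypothetical and chosen only for illustration.

```python
def min_plus(P, Q):
    """Additive form of minimal multiplication: (P * Q)_ij = min_k (P_ik + Q_kj)."""
    inner, cols = len(Q), len(Q[0])
    return [[min(row[k] + Q[k][j] for k in range(inner)) for j in range(cols)]
            for row in P]


def minimal_total_costs(cost_matrices):
    """Entry (i, j): least total cost of going from state i at time 0 to state j at time n."""
    total = cost_matrices[0]
    for step in cost_matrices[1:]:
        total = min_plus(total, step)
    return total


# Hypothetical transition costs for a two-state system over three time steps.
a1 = [[0, 4], [2, 0]]
a2 = [[0, 1], [5, 0]]
a3 = [[0, 7], [1, 0]]

print(minimal_total_costs([a1, a2, a3]))  # [[0, 1], [1, 0]]
```

Working with sums of costs directly avoids the exponentials and the final logarithm of the multiplicative form.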
4. Solvability of extremal equations.
Consider the equation \(A \,\overline{\circ}\, X = B\). Denote by \(S_j\) \((j = 1, \ldots, n)\) the set of all rows \(s\) of the \(m \times n\) matrix \(A\) such that
\[ \frac{A_{sj}}{B_s} = \max_{i=1}^{m} \frac{A_{ij}}{B_i}. \]
Theorem 1. In order that the equation \(A \,\overline{\circ}\, X = B\) be solvable, it is necessary and sufficient that the sets \(S_1, \ldots, S_n\) form a cover of the set of all rows of the matrix \(A\).
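The covering criterion of Theorem 1 is easy to test mechanically for the maximal equation with a column vector \(X\). In the sketch below the sets \(S_j\) are computed exactly as defined above. The candidate \(X_j = \min_i B_i / A_{ij}\) is the upper bound given by property \(16^\circ\); that this bound is itself a solution whenever one exists is a standard fact assumed here, not a statement taken from the text. All function names are hypothetical, and rows are indexed from 0.

```python
from fractions import Fraction


def cover_sets(A, B):
    """S_j = set of rows s at which A_sj / B_s attains its maximum over all rows."""
    m, n = len(A), len(A[0])
    sets = []
    for j in range(n):
        ratios = [Fraction(A[i][j], B[i]) for i in range(m)]
        top = max(ratios)
        sets.append({i for i in range(m) if ratios[i] == top})
    return sets


def is_solvable(A, B):
    """Theorem 1: solvable iff S_1, ..., S_n together cover all rows of A."""
    return set().union(*cover_sets(A, B)) == set(range(len(A)))


def candidate_solution(A, B):
    """Upper bound from property 16: X_j = min_i B_i / A_ij."""
    m, n = len(A), len(A[0])
    return [min(Fraction(B[i], A[i][j]) for i in range(m)) for j in range(n)]


def max_times_vec(A, X):
    """Maximal product of a matrix with a column vector."""
    return [max(a * x for a, x in zip(row, X)) for row in A]


A, B = [[1, 2], [3, 1]], [2, 3]
print(is_solvable(A, B), max_times_vec(A, candidate_solution(A, B)) == B)

A2, B2 = [[1, 1], [2, 2]], [1, 1]
print(is_solvable(A2, B2), max_times_vec(A2, candidate_solution(A2, B2)) == B2)
```

In the second example both columns attain their maximal ratio only in the second row, so the first row is not covered and the equation has no solution.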
A set of vectors \(\mathfrak{X}\) is called extremally convex if from \(X^{(1)}, X^{(2)} \in \mathfrak{X}\) it follows that \(\lambda_1 X^{(1)} \,\overline{m}\, \lambda_2 X^{(2)} \in \mathfrak{X}\) for any \(\lambda_1, \lambda_2 \ge 0\) for which \(\lambda_1 \,\overline{m}\, \lambda_2 = 1\).
Theorem 2. The set of all solutions of the equation \(A \,\overline{\circ}\, X = B\) is extremally convex. It can be described effectively.
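Extremal convexity of the solution set can be illustrated on a small maximal equation with more than one solution. The matrix, right-hand side and the two solutions below are hypothetical examples, as is the helper name `ext_combine`; the coefficients are taken with \(\max\{\lambda_1, \lambda_2\} = 1\).

```python
from fractions import Fraction


def max_times_vec(A, X):
    """Maximal product of a matrix with a column vector."""
    return [max(a * x for a, x in zip(row, X)) for row in A]


def ext_combine(l1, X1, l2, X2):
    """Extremal combination: componentwise max of l1 * X1 and l2 * X2."""
    return [max(l1 * a, l2 * b) for a, b in zip(X1, X2)]


A = [[1, 1], [1, 1]]
B = [1, 1]
X1 = [Fraction(1), Fraction(1, 2)]   # a solution: max(1, 1/2) = 1 in each row
X2 = [Fraction(1, 3), Fraction(1)]   # another solution

weights = [(1, Fraction(1, 2)), (Fraction(1, 4), 1), (1, 1)]
still_solutions = [
    max_times_vec(A, ext_combine(l1, X1, l2, X2)) == B for l1, l2 in weights
]
print(still_solutions)  # [True, True, True]
```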
5. Extremal eigenvalues and eigenvectors.
If, for a square matrix \(A\) and a vector \(X\), the equality \(A \,\overline{\circ}\, X = \lambda X\) holds, then \(\lambda\) is called an extremal eigenvalue of the matrix \(A\), and \(X\) its extremal eigenvector.
Theorem 3. Every square \(m \times m\)-matrix \(A\) has a unique extremal eigenvalue \(\underline{\lambda}\), and
\[ \underline{\lambda} = \max \sqrt[k]{A_{i_1 i_2} A_{i_2 i_3} \cdots A_{i_k i_1}}, \tag{*} \]
where the extremization is performed over all \(1 \le k \le m\) and all cyclic sequences of numbers \(i_1, i_2, i_3, \ldots, i_k, i_1\) from \(1, 2, \ldots, m\).
Cycles for which the extremum in \((*)\) is attained are called extremal eigen-cycles of the matrix \(A\).
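Formula \((*)\) can be evaluated directly for small matrices by enumerating all cyclic index sequences of length at most \(m\). The sketch below does this by brute force in floating point; it is exponential in \(m\) and intended only as an illustration. The function name and the sample matrix are our own, and indices run from 0.

```python
from itertools import product


def extremal_eigenvalue(A):
    """Largest geometric mean of the entries along a cycle i1 -> i2 -> ... -> ik -> i1."""
    m = len(A)
    best = 0.0
    for k in range(1, m + 1):
        for cycle in product(range(m), repeat=k):
            weight = 1.0
            # Pair each index with its successor, wrapping around to close the cycle.
            for a, b in zip(cycle, cycle[1:] + cycle[:1]):
                weight *= A[a][b]
            best = max(best, weight ** (1.0 / k))
    return best


A = [[1, 4], [1, 1]]
lam = extremal_eigenvalue(A)
print(lam)  # the 2-cycle 0 -> 1 -> 0 has geometric mean sqrt(4 * 1) = 2

# X = (2, 1) satisfies the maximal eigen-equation with this value.
X = [2, 1]
print([max(a * x for a, x in zip(row, X)) for row in A])  # [4, 2]
```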
A set \(\mathfrak{X}\) is called an extremally convex cone if from \(X^{(1)}, X^{(2)} \in \mathfrak{X}\) it follows that \(\lambda_1 X^{(1)} \,\overline{m}\, \lambda_2 X^{(2)} \in \mathfrak{X}\) for any \(\lambda_1, \lambda_2 \ge 0\).
Theorem 4. The set of extremal eigenvectors of any matrix is a nonempty extremally convex cone.
For computing the extremal eigenvalues of a matrix one may use the relation
\[ \underline{\lambda} = \max_{1 \le i,\, k \le m} \sqrt[k]{\left(A^{\overline{\circ}\, k}\right)_{ii}} . \]
Analysis of the process of obtaining the extrema in the last equality leads to the determination of all extremal eigen-cycles of \(A\). In addition, this can give us all ratios of components of extremal eigenvectors of the form \(X_s : X_r\), if \(r\) and \(s\) belong to one and the same extremal eigen-cycle.
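The relation through diagonal entries of extremal powers gives a polynomial-time way to compute the same quantity. A minimal floating-point sketch, assuming the maximal product throughout (function names and the sample matrix are hypothetical):

```python
def max_mul(A, B):
    """Maximal product of two square matrices."""
    n = len(B)
    return [[max(row[k] * B[k][j] for k in range(n)) for j in range(n)]
            for row in A]


def eigenvalue_from_powers(A):
    """Maximum over i and k <= m of the k-th root of the (i, i) entry of the k-th maximal power."""
    m = len(A)
    power = A  # holds the k-th maximal power of A
    best = 0.0
    for k in range(1, m + 1):
        best = max(best, max(power[i][i] for i in range(m)) ** (1.0 / k))
        power = max_mul(power, A)
    return best


# The 3-cycle 0 -> 1 -> 2 -> 0 has product 2 * 4 * 1 = 8, geometric mean 2.
A = [[1, 2, 1],
     [1, 1, 4],
     [1, 1, 1]]
print(eigenvalue_from_powers(A))
```

Recording which index \(k\) attains each diagonal maximum would recover the extremal eigen-cycles; that bookkeeping is omitted here.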
Let the ratios \(X_s : X_r = \alpha_s\), for fixed \(r\), be defined for some \(s\). We shall understand the equality \(A \,\overline{\circ}\, X = \lambda X\) as the system of relations
\[ A_i \,\overline{\circ}\, X = \lambda X_i \quad (i = 1, \ldots, m). \]
Since \(X_s=\alpha_s X_r\) for extremal eigenvectors, we can eliminate all variables \(X_s\) (except, of course, \(X_r\) itself) from these relations. Replacing all extremized terms \(A_{is}X_s\) by \(\mathop{\overline{m}}_{s} A_{is}\alpha_s X_r\) (which in our case is an exact analogue of the reduction of like terms) and replacing all relations of the form \(A_s \,\overline{\circ}\, X=\lambda \alpha_s X_r\) by the single relation
\[ \mathop{\overline{m}}_{l=1}^{m} \left( A_{rl}X_r \,\overline{m}\, \left( \mathop{\overline{m}}_{s}\frac{A_{sl}}{\alpha_s} \right)X_l \right) =\lambda X_r, \]
we can reduce the original problem to the solution of an equation of the form
\(A^* \,\overline{\circ}\, X=\lambda X\), where all extremal eigen-cycles of \(A^*\) are monomial. (A matrix \(A^*\) possessing this property will be called a loop matrix.)
Theorem 5. Whatever the loop matrix \(A\) may be, there exists a natural number \(r\) such that
\[ A^{\overline{\circ}\,(r+1)}=\lambda A^{\overline{\circ}\, r}. \]
This number \(r\) (the extremal order of \(A\)) is effectively estimated from above and can in every case be found.
It is clear that the columns of the matrix \(A^{\overline{\circ}\, r}\) are extremal eigenvectors of \(A\). It is not difficult to show that the extremally convex cone spanned by them is the set of all extremal eigenvectors of \(A\).
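The stabilization asserted in Theorem 5 can be observed by direct iteration. The sketch below takes a small integer matrix whose only extremal eigen-cycle is a loop, with \(\lambda = 2\) supplied by hand. The function name `extremal_order` and the search bound `limit` are assumptions of this illustration; the text states that \(r\) can be estimated from above but does not give the bound.

```python
def max_mul(A, B):
    """Maximal product of two square matrices."""
    n = len(B)
    return [[max(row[k] * B[k][j] for k in range(n)) for j in range(n)]
            for row in A]


def extremal_order(A, lam, limit=50):
    """Smallest r <= limit whose (r+1)-th maximal power equals lam times the r-th.

    Returns (r, r-th power), or None if no such r is found within the limit.
    """
    power = A  # r-th maximal power, starting with r = 1
    for r in range(1, limit + 1):
        nxt = max_mul(power, A)
        if nxt == [[lam * x for x in row] for row in power]:
            return r, power
        power = nxt
    return None


A = [[2, 1], [1, 1]]  # loops weigh 2 and 1, the 2-cycle has geometric mean 1
r, P = extremal_order(A, 2)
print(r, P)  # 2 [[4, 2], [2, 1]]

# Each column of the stabilized power satisfies the maximal eigen-equation.
for j in range(len(A)):
    X = [P[i][j] for i in range(len(A))]
    AX = [max(a * x for a, x in zip(row, X)) for row in A]
    print(AX == [2 * x for x in X])  # True
```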
Leningrad Branch
of the V. A. Steklov Mathematical Institute
Academy of Sciences of the USSR
Received 5 III 1963
REFERENCES
1. W. Fenchel, Lecture Notes, Princeton, 1953.
2. R. Bellman, Quart. Appl. Math., 16, 87 (1958).
3. R. Bellman, W. Karush, Bull. Am. Math. Soc., 67, 501 (1961).
4. R. Bellman, W. Karush, J. Soc. Industr. Appl. Math., 10, 550 (1962).
5. I. V. Romanovskii, Vestn. LGU, No. 13, 148 (1962).
6. S. N. Pandit, J. Soc. Industr. Appl. Math., 9, 632 (1961).