UDC 512.25
MATHEMATICS
B. A. SHCHENNIKOV
THE AGGREGATION METHOD FOR SOLVING A SYSTEM OF LINEAR EQUATIONS
(Presented by Academician B. N. Petrov on 3 VI 1966)
Consider the system of linear equations
\[ x = Ax + B,\qquad A=(a_{ij})_1^n,\quad B=(b_i)_1^n. \tag{1} \]
In the present work special aggregation methods are investigated, making it possible to represent the process of solving system (1) in the form of a sequential process of solving systems of smaller dimension. Let \(x=(x_1\ldots x_n)\) be the solution of system (1) and \(X_k=\sum_{i=1}^k x_i\ne 0\) \((k\le n)\). Denote
\[ p=(p_1\ldots p_k),\qquad p_i=x_i/X_k\quad (i\le k);\qquad X_{k+1}=x_{k+1},\ldots,X_n=x_n;\qquad X=(X_k\ldots X_n). \]
Introduce the matrix \(\bar A_p\) and the vector \(\bar B\)
\[
\bar A_p=
\begin{pmatrix}
\displaystyle\sum_{i,j=1}^{k} a_{ij}p_j & \displaystyle\sum_{i=1}^{k} a_{i,k+1} & \cdots & \displaystyle\sum_{i=1}^{k} a_{i,n}\\[1.2em]
\displaystyle\sum_{j=1}^{k} a_{k+1,j}p_j & a_{k+1,k+1} & \cdots & a_{k+1,n}\\
\vdots & \vdots & \ddots & \vdots\\
\displaystyle\sum_{j=1}^{k} a_{n,j}p_j & a_{n,k+1} & \cdots & a_{n,n}
\end{pmatrix},
\qquad
\bar B=
\begin{pmatrix}
\displaystyle\sum_{i=1}^{k} b_i\\ b_{k+1}\\ \vdots\\ b_n
\end{pmatrix}
\]
and put in correspondence with system (1) the system
\[ X=\bar A_pX+\bar B, \tag{2} \]
whose solution is the vector \(X\) defined above.
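The construction of \(\bar A_p\) and \(\bar B\) can be illustrated with a minimal sketch in plain Python. The helper names `solve_fixed_point` and `aggregate`, as well as the 3×3 test data, are assumptions made for illustration and do not come from the paper. The sketch solves (1) directly, forms \(p\) and \(X\) from that solution, and checks numerically that \(X\) satisfies (2).

```python
def solve_fixed_point(A, B):
    """Solve x = A x + B, i.e. (I - A) x = B, by Gaussian elimination."""
    n = len(B)
    # Augmented matrix [I - A | B].
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] + [B[i]]
         for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def aggregate(A, B, p):
    """Build (A_bar_p, B_bar) for weights p of length k (first k variables merged)."""
    n, k = len(B), len(p)
    m = n - k + 1  # dimension of the aggregated system
    A_bar = [[0.0] * m for _ in range(m)]
    A_bar[0][0] = sum(A[i][j] * p[j] for i in range(k) for j in range(k))
    for c in range(1, m):
        # Aggregated column c corresponds to original (0-based) column k - 1 + c.
        A_bar[0][c] = sum(A[i][k - 1 + c] for i in range(k))
    for r in range(1, m):
        A_bar[r][0] = sum(A[k - 1 + r][j] * p[j] for j in range(k))
        for c in range(1, m):
            A_bar[r][c] = A[k - 1 + r][k - 1 + c]
    B_bar = [sum(B[:k])] + list(B[k:])
    return A_bar, B_bar


# Illustrative data (assumed): nonnegative A with column sums below 1.
A = [[0.2, 0.1, 0.3],
     [0.1, 0.3, 0.1],
     [0.2, 0.2, 0.2]]
B = [1.0, 2.0, 3.0]
k = 2

x = solve_fixed_point(A, B)          # solution of (1)
X_k = sum(x[:k])
p = [x[i] / X_k for i in range(k)]   # exact weights
X = [X_k] + x[k:]                    # aggregated vector (X_k, ..., X_n)
A_bar, B_bar = aggregate(A, B, p)

# Residual of system (2) at X; it should vanish up to rounding.
residual = max(
    abs(X[r] - (sum(A_bar[r][c] * X[c] for c in range(len(X))) + B_bar[r]))
    for r in range(len(X))
)
print(residual)
```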
The transition from system (1) to system (2) arose in the analysis of planning problems in economics, and system (2) itself came to be called the aggregated system of equations of the intersectoral balance. In practice this transition is forced by a number of considerations, one of which is the large dimension of the original system (1) (in real problems the number of variables \(x_i\) may run into the millions). Given the solution of system (2) and the vector \(p\), the solution of system (1) follows at once: \(x_i=p_iX_k\) for \(i\le k\) and \(x_i=X_i\) for \(i>k\). However, to construct (2) one must know the vector \(p\).
The vector \(p\), in turn, is determined from the solution of the original problem. In the practice of analyzing models of intersectoral balance, this difficulty is usually bypassed by setting \(p=p^{(0)}\), where
\[ p^{(0)}=\left\{(p_i^{(0)})_1^k\mid p_i^{(0)}=x_i^{(0)}\Big/\sum_i x_i^{(0)}\right\}, \]
and \(x_i^{(0)}\) is the value that the output of product \(i\) actually assumed in some reporting period in the past.
Instead of (2), one solves the system
\[ X^{(1)}=\bar A_{p^{(0)}}X^{(1)}+\bar B . \tag{3} \]
Here it remains unclear how the solution of this system, the vector \(X^{(1)}\), is related to the vectors \(X\) and \(x\). In the present paper the legitimacy of these practical methods is established in a certain sense.
Thus, suppose that the vector \(X\) is known. Define, for all \(i \leqslant k\),
\[ x_i=\left(\sum_{j=1}^{k} a_{ij}p_j\right)X_k+ \left(\sum_{j=k+1}^{n} a_{ij}X_j+b_i\right). \tag{4} \]
From (4) it follows that
\[ p_i=\sum_{j=1}^{k} a_{ij}p_j+ \frac{\sum_{j=k+1}^{n} a_{ij}X_j+b_i}{X_k}. \tag{5} \]
The ratio \(\left(\sum_{j=k+1}^{n} a_{ij}X_j+b_i\right)/X_k\) depends linearly on \((p_1\ldots p_k)\); consequently,
\[ p=Cp,\qquad C=\{(c_{st})_1^k\mid c_{st}=a_{st}+r_{st}\}. \tag{6} \]
Let us determine the elements \(r_{st}\). Let \(p=p(t)=(0,\ldots,0,1,0,\ldots,0)\) be the unit vector with 1 in position \(t\), and let \(X(p(t))\) be the solution of system (2) with \(\bar A_p=\bar A_{p(t)}\). Then
\[ r_{st}= \left[ \sum_{j=k+1}^{n} a_{sj}X_j(p(t))+b_s \right]\big/ X_k(p(t)). \tag{7} \]
Consequently, to determine \(r_{st}\) one must solve system (2) \(k\) times with matrices \(\bar A_{p(t)}\) that differ in their first column. Knowing the solution of system (6), it is easy to obtain the solution of system (2), and consequently also of system (1).
The steps above thus constitute a finite aggregation method for solving system (1).
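The finite method can be sketched as follows, under stated assumptions. The function names (`build_C`, `fixed_vector`, and the two helpers) and the test data are hypothetical. Solving \(p=Cp\) by replacing its last equation with the normalization \(\sum_i p_i=1\) is one convenient choice, not a step prescribed by the text. The sketch builds \(C\) from \(k\) solves of (2) via (7), recovers \(p\), and then recovers \(x\) from (2).

```python
def solve_fixed_point(A, B):
    """Solve x = A x + B, i.e. (I - A) x = B, by Gaussian elimination."""
    n = len(B)
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] + [B[i]]
         for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def aggregate(A, B, p):
    """Build (A_bar_p, B_bar) for weights p of length k (first k variables merged)."""
    n, k = len(B), len(p)
    m = n - k + 1
    A_bar = [[0.0] * m for _ in range(m)]
    A_bar[0][0] = sum(A[i][j] * p[j] for i in range(k) for j in range(k))
    for c in range(1, m):
        A_bar[0][c] = sum(A[i][k - 1 + c] for i in range(k))
    for r in range(1, m):
        A_bar[r][0] = sum(A[k - 1 + r][j] * p[j] for j in range(k))
        for c in range(1, m):
            A_bar[r][c] = A[k - 1 + r][k - 1 + c]
    return A_bar, [sum(B[:k])] + list(B[k:])


def build_C(A, B, k):
    """Matrix C of (6): c_st = a_st + r_st, with r_st from (7) (k solves of (2))."""
    n = len(B)
    C = [[0.0] * k for _ in range(k)]
    for t in range(k):
        p_t = [1.0 if j == t else 0.0 for j in range(k)]  # unit vector p(t)
        A_bar, B_bar = aggregate(A, B, p_t)
        X = solve_fixed_point(A_bar, B_bar)  # X[0] = X_k, X[1:] = X_{k+1..n}
        for s in range(k):
            tail = sum(A[s][k + j] * X[1 + j] for j in range(n - k))
            C[s][t] = A[s][t] + (tail + B[s]) / X[0]
    return C


def fixed_vector(C):
    """Solve p = C p with sum(p) = 1 (last equation replaced by the normalization)."""
    k = len(C)
    Cn = [row[:] for row in C[:-1]] + [[-1.0] * (k - 1) + [0.0]]
    return solve_fixed_point(Cn, [0.0] * (k - 1) + [1.0])


# Illustrative data (assumed): nonnegative A with column sums below 1.
A = [[0.2, 0.1, 0.3],
     [0.1, 0.3, 0.1],
     [0.2, 0.2, 0.2]]
B = [1.0, 2.0, 3.0]
k = 2

C = build_C(A, B, k)
p = fixed_vector(C)
A_bar, B_bar = aggregate(A, B, p)
X = solve_fixed_point(A_bar, B_bar)
x = [p[i] * X[0] for i in range(k)] + X[1:]   # back to the variables of (1)
x_direct = solve_fixed_point(A, B)
err = max(abs(u - v) for u, v in zip(x, x_direct))
print(err)
```

On this example the columns of `C` sum to one, in line with the later remark that \(C\) is stochastic when its entries are nonnegative.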
Let us now consider the iterative aggregation process. Let
\[ X^{(N+1)}=\bar A_{p^{(N)}}X^{(N+1)}+\bar B, \]
\[ x_i^{(N+1)}= \left(\sum_{j=1}^{k} a_{ij}p_j^{(N)}\right)X_k^{(N+1)} +\sum_{j=k+1}^{n} a_{ij}X_j^{(N+1)}+b_i,\qquad i\leqslant k, \tag{8} \]
\[ p_j^{(N)}=x_j^{(N)}/X_k^{(N)},\qquad j\leqslant k. \]
The matrix \(C\), defined above, does not enter explicitly into this process; nevertheless, it is obvious that the convergence of the iterative aggregation process is determined by its properties.
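One step of process (8) can be written out in a few lines. The sketch below is an illustration under assumptions: the names `iterative_aggregation`, `aggregate`, and `solve_fixed_point`, the starting vector, the step count, and the test matrix are all chosen for the example and are not taken from the paper. Each step forms \(p^{(N)}\) from the current iterate, solves the aggregated system, and disaggregates the first \(k\) components.

```python
def solve_fixed_point(A, B):
    """Solve x = A x + B, i.e. (I - A) x = B, by Gaussian elimination."""
    n = len(B)
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] + [B[i]]
         for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def aggregate(A, B, p):
    """Build (A_bar_p, B_bar) for weights p of length k (first k variables merged)."""
    n, k = len(B), len(p)
    m = n - k + 1
    A_bar = [[0.0] * m for _ in range(m)]
    A_bar[0][0] = sum(A[i][j] * p[j] for i in range(k) for j in range(k))
    for c in range(1, m):
        A_bar[0][c] = sum(A[i][k - 1 + c] for i in range(k))
    for r in range(1, m):
        A_bar[r][0] = sum(A[k - 1 + r][j] * p[j] for j in range(k))
        for c in range(1, m):
            A_bar[r][c] = A[k - 1 + r][k - 1 + c]
    return A_bar, [sum(B[:k])] + list(B[k:])


def iterative_aggregation(A, B, k, x0, steps):
    """Run `steps` iterations of process (8), starting from x0."""
    n = len(B)
    x = list(x0)
    for _ in range(steps):
        X_k = sum(x[:k])
        p = [x[j] / X_k for j in range(k)]           # p^(N)
        A_bar, B_bar = aggregate(A, B, p)
        X = solve_fixed_point(A_bar, B_bar)          # X^(N+1)
        head = [
            sum(A[i][j] * p[j] for j in range(k)) * X[0]
            + sum(A[i][k + j] * X[1 + j] for j in range(n - k))
            + B[i]
            for i in range(k)
        ]                                            # x_i^(N+1), i <= k
        x = head + X[1:]
    return x


# Illustrative data (assumed): nonnegative A with column sums below 1.
A = [[0.2, 0.1, 0.3],
     [0.1, 0.3, 0.1],
     [0.2, 0.2, 0.2]]
B = [1.0, 2.0, 3.0]

x_iter = iterative_aggregation(A, B, k=2, x0=[1.0, 1.0, 1.0], steps=200)
x_direct = solve_fixed_point(A, B)
err = max(abs(u - v) for u, v in zip(x_iter, x_direct))
print(err)
```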
The following theorem gives sufficient conditions for the convergence of the iterative aggregation process.
Theorem 1. Suppose the matrix \(C=(c_{st})_1^k\geqslant0\) is irreducible and aperiodic; then the iterative process converges at the rate \(\lambda^N\), where
\[ \lambda=\{\max|\lambda_i|<1\mid \det|C-\lambda_iE|=0\}. \]
The matrix \(C\), defined in (6), turns out to be stochastic for \(c_{st}\geqslant0\) \((\sum_s c_{st}=1)\), and the fixed vector \(p\) is the limit of the convergent sequence \(\{p^{(N)}\}\)
\[ p^{(N)}=Cp^{(N-1)}=C^Np^{(0)}. \tag{9} \]
The iterative process takes an especially simple form when \(k=n\), i.e., when at each iteration step a single scalar equation is put in correspondence with system (1):
\[ X^{(N+1)}=\left(\sum_{i,j=1}^{n} a_{ij}p_j^{(N)}\right)X^{(N+1)}+b, \]
\[ b=\sum_{i=1}^{n} b_i,\qquad p_j^{(N)}=x_j^{(N)}/X^{(N)}=x_j^{(N)}/\sum_{j=1}^{n}x_j^{(N)}, \tag{10} \]
\[ x_i^{(N+1)}=\left(\sum_{j=1}^{n}a_{ij}p_j^{(N)}\right)X^{(N+1)}+b_i. \]
For this process,
\[ C=\left\{(c_{ij})_1^n\mid c_{ij}=a_{ij}+\frac{b_i}{b}\left(1-\sum_{s=1}^{n} a_{sj}\right)\right\}. \]
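As a check, this form of \(C\) can be recovered directly from (10), assuming \(b\ne0\) and \(\sum_j p_j^{(N)}=1\). The scalar equation gives \(X^{(N+1)}=b\big/\bigl(1-\sum_{s,j}a_{sj}p_j^{(N)}\bigr)\), and summing the last relation of (10) over \(i\) shows that \(\sum_i x_i^{(N+1)}=X^{(N+1)}\). Dividing that relation by \(X^{(N+1)}\) then yields
\[ p_i^{(N+1)}=\sum_{j=1}^{n}a_{ij}p_j^{(N)}+\frac{b_i}{b}\left(1-\sum_{s,j=1}^{n}a_{sj}p_j^{(N)}\right) =\sum_{j=1}^{n}\left[a_{ij}+\frac{b_i}{b}\left(1-\sum_{s=1}^{n}a_{sj}\right)\right]p_j^{(N)}, \]
so that \(p^{(N+1)}=Cp^{(N)}\). The column sums equal one:
\[ \sum_{i=1}^{n}c_{ij}=\sum_{i=1}^{n}a_{ij}+\frac{\sum_i b_i}{b}\left(1-\sum_{s=1}^{n}a_{sj}\right)=1. \]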
Sufficient conditions for the semipositivity of the matrix \(C\) are, for example,
\[ \|A\|=\max_j \sum_i a_{ij}\leqslant 1;\qquad a_{ij}\geqslant 0,\quad b_i\geqslant 0, \]
and the iterative process itself is expedient to use if the norm \(\|A\|\) is close to unity.
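For \(k=n\), process (10) needs no linear solve at all, which the following minimal sketch illustrates. The function name `scalar_aggregation`, the starting vector, the step count, and the test data are assumptions made for the example. The data satisfy the sufficient conditions above (nonnegative entries, column sums of \(A\) below one).

```python
def scalar_aggregation(A, B, x0, steps):
    """Run `steps` iterations of process (10), where all n variables are aggregated."""
    n = len(B)
    b = sum(B)
    x = list(x0)
    for _ in range(steps):
        total = sum(x)
        p = [xj / total for xj in x]                      # p^(N)
        a_bar = sum(A[i][j] * p[j] for i in range(n) for j in range(n))
        X = b / (1.0 - a_bar)                             # scalar equation of (10)
        x = [sum(A[i][j] * p[j] for j in range(n)) * X + B[i]
             for i in range(n)]                           # x^(N+1)
    return x


# Illustrative data (assumed): a_ij >= 0, b_i >= 0, column sums of A below 1.
A = [[0.2, 0.1, 0.3],
     [0.1, 0.3, 0.1],
     [0.2, 0.2, 0.2]]
B = [1.0, 2.0, 3.0]

x = scalar_aggregation(A, B, x0=[1.0, 1.0, 1.0], steps=100)

# Residual of the original system x = A x + B at the computed iterate.
residual = max(
    abs(x[i] - (sum(A[i][j] * x[j] for j in range(3)) + B[i]))
    for i in range(3)
)
print(residual)
```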
If \(C=(c_{st})_1^k\) contains a positive row \(i\), then one can make a rough estimate of the rate of convergence
\[ |\lambda|\leqslant 1-\gamma_k,\qquad \gamma_k=\min_{j\leqslant k}c_{ij}. \]
If \(l\) variables of system (1) are aggregated, then \(\gamma_l\leqslant\gamma_k\), if \(l\leqslant k\). Generally speaking, although the upper bound \(\gamma_k\) for the rate of convergence of the iterative aggregation process decreases as the number of aggregated variables decreases, this decrease may be insignificant.
Convenient from the computational point of view is the iterative aggregation process when the matrix \(\bar A_p\) acquires a triangular form. In this case the matrix \(A\) can be reduced to block-triangular form.
Let \(J_j=\{k_{j-1}+1\ldots k_j\}\) be subsets of the set
\[ N=\{1,2,\ldots,n\},\qquad j=1,2,\ldots,l;\qquad k_0=0,\quad k_l=n;\qquad \bigcup J_j=N. \]
At each step of the iterative process we aggregate the variables \(x_t,\ t\in J_j\), for any \(j=1,2,\ldots,l\), into a single \(\bar X_j\). As a result, to each strip of the matrix \(A\) (strip \(i\) consists of the elements \(a_{st},\ s\in J_i,\ t\leqslant k_i;\ i=1,2,\ldots,l\)) there will correspond one equation. In this case the aggregated system
\[ \bar X^{(N+1)}=\bar A_{p^{(N)}}\bar X^{(N+1)}+\bar B \tag{11} \]
is triangular. Here
\[ \bar B=\left\{(\bar b_i)_1^l\mid \bar b_i=\sum_{s\in J_i}b_s\right\}, \]
\[ \bar A_{p^{(N)}}=\left\{(\bar a_{ij}^{(N)})_1^l\mid \bar a_{ij}^{(N)}=\sum_{s,t}a_{st}p_t^{(N)},\quad p_t^{(N)}=x_t^{(N)}/\bar X_j^{(N)};\quad s\in J_i,\ t\in J_j\right\}. \]
Define, for all \(s\in J_i,\ i=1,2,\ldots,l\),
\[ x_s^{(N+1)}=\sum_{j\leqslant i}\ \sum_{t\in J_j}a_{st}p_t^{(N)}\bar X_j^{(N+1)}+b_s. \tag{12} \]
Let \(x=(x_1\ldots x_n)\) be a solution of (1); \(p_t=x_t/X_j\). Denote
\[ C_j=\{(c_{st})_{s,t\in J_j}\mid c_{st}=a_{st}+r_s\beta_t\}, \]
\[ r_s=\sum_{j<i}\left[a_{st}p_tX_j+b_s\right]\Bigg/ \sum_{s\in J_i,\ j<i}\left[a_{st}p_tX_j+\bar b_i\right], \qquad \beta_t=1-\sum_{s\in J_i}a_{st}. \]
In the variables \(p_t\), the process just described is, in contrast to (9), nonstationary (see [5], Ch. III).
Theorem 2 establishes the convergence of this iterative process.
Theorem 2. Let the matrices \(C_j \geq 0\) be irreducible and aperiodic for every \(j = 1, 2, \ldots, l\). Then the iterative process (11), (12) converges at the rate \(\lambda^N\), where \(\underline{\lambda} \leq \lambda \leq \overline{\lambda}\) and
\[ \overline{\lambda} = \left\{ \max_j \max_i |\lambda_{ij}| < 1 \ \middle|\ \det |C_j - \lambda_{ij} E| = 0 \right\}, \]
\[ \underline{\lambda} = \left\{ \min_j \max_i |\lambda_{ij}| < 1 \ \middle|\ \det |C_j - \lambda_{ij} E| = 0 \right\}. \]
Let us note that the conditions ensuring the convergence of the iterative aggregation processes are satisfied when system (1) is a balanced economic model. The assumption of convergence of iterative processes analogous to those presented above was first put forward in a work on economics [1]. Works [2–4] are devoted to the mathematical investigation of the question.
M. V. Lomonosov Moscow State University
Received 23 V 1966
REFERENCES
1. L. M. Dudkin, E. B. Ershov, Planovoe khozyaistvo, No. 5 (1965).
2. V. A. Volkonskii, Ekonomika i matematicheskie metody, 2, issue 2 (1966).
3. B. A. Shchenikov, Ekonomika i matematicheskie metody, 1, issue 6 (1966).
4. B. A. Shchenikov, Ekonomika i matematicheskie metody, 2, issue 5 (1966).
5. D. K. Faddeev, V. N. Faddeeva, Computational Methods of Linear Algebra, Moscow-Leningrad, 1960.