S. I. ZUKHOVITSKII
Submitted 1962-01-01 | RussiaRxiv: ru-196201.03280 | Translated from Russian


ON APPROXIMATING AN INCONSISTENT SYSTEM OF LINEAR EQUATIONS BY THE PRINCIPLE OF MINIMIZING THE SUM OF THE MODULI OF ALL DEVIATIONS

(Presented by Academician N. N. Bogolyubov, 1 XII 1961)

  1. Suppose that the inconsistent system of linear equations

\[ \eta_i(x)\equiv \eta_i=\sum_{j=1}^{n} a_{ij}\xi_j+a_i=0 \qquad (i=1,\ldots,m) \tag{1} \]

is to be approximated in the best possible way, in the sense that the sum of the absolute values of all deviations \(\eta_i(x)\) attains its least value; i.e., it is required to find a point \(x^*(\xi_1^*,\ldots,\xi_n^*)\) such that

\[ \sum_{i=1}^{m} |\eta_i(x^*)|=\min_x \sum_{i=1}^{m} |\eta_i(x)|. \tag{2} \]

The function

\[ z(x)=\sum_{i=1}^{m} |\eta_i(x)| \]

represents, in the \((n+1)\)-dimensional space of the variables \(\xi_1,\ldots,\xi_n,z\), a convex polyhedral surface, and on this surface one must find the lowest-lying point.
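As a small illustration of the objective in (2), the following sketch evaluates \(z(x)\) exactly with rational arithmetic. The function names `deviations` and `z` and the one-unknown example system are assumptions made for illustration; they do not come from the note.

```python
from fractions import Fraction

def deviations(A, a, x):
    """Return eta_i(x) = sum_j A[i][j]*x[j] + a[i] for every row i."""
    return [sum(Fraction(aij) * xj for aij, xj in zip(row, x)) + ai
            for row, ai in zip(A, a)]

def z(A, a, x):
    """Sum of the absolute values of all deviations at the point x."""
    return sum(abs(e) for e in deviations(A, a, x))

# Hypothetical inconsistent system in one unknown: xi = 1, xi = 2, xi = 4.
A = [[1], [1], [1]]
a = [-1, -2, -4]

print(z(A, a, [2]))               # 3     (xi at the median of 1, 2, 4)
print(z(A, a, [Fraction(7, 3)]))  # 10/3  (xi at the mean, a larger sum)
```

In this example the lowest point of the polyhedral surface lies at a vertex where one deviation vanishes, which is the kind of point the tables below are designed to reach.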

In connection with the \(L\)-problem \((^1)\), M. G. Krein indicated, on the basis of mechanical analogies, an elegant algorithm for solving problem (1), (2) in the case \(n\leqslant 3\). Algorithms for solving our problem have been given in, or can be obtained from the methods presented in, \((^{2-4},\,^{6-8})\).

In the present note we give a computational scheme for a finite and monotone algorithm based on Jordan eliminations, which form the basis of Dantzig’s simplex method (see the article \((^5)\) on Jordan eliminations).

  2. Write system (1) in the form of a table

\[ \begin{array}{c|ccccc} & \xi_1 & \xi_2 & \cdots & \xi_n & 1\\ \hline \eta_1= & a_{11} & a_{12} & \cdots & a_{1n} & a_1\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ \eta_m= & a_{m1} & a_{m2} & \cdots & a_{mn} & a_m \end{array} \tag{3} \]

and, by means of a suitable number of steps of Jordan elimination, move as many deviations \(\eta_i\) as possible to the top of the table. Suppose, for example, that the rank of the matrix \(\|a_{ij}\|\) is \(r\), so that a table of the form

\[ \begin{array}{c|cccccccc} & \eta_1 & \eta_2 & \cdots & \eta_r & \xi_{r+1} & \cdots & \xi_n & 1\\ \hline \eta_{r+1}= & a^{(r)}_{r+1,1} & a^{(r)}_{r+1,2} & \cdots & a^{(r)}_{r+1,r} & 0 & \cdots & 0 & a^{(r)}_{r+1}\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ \eta_m= & a^{(r)}_{m1} & a^{(r)}_{m2} & \cdots & a^{(r)}_{mr} & 0 & \cdots & 0 & a^{(r)}_m \end{array}, \tag{4} \]

has been obtained, where the expressions for \(\xi_1,\ldots,\xi_r\) have been written out separately. Since \(\xi_{r+1},\ldots,\xi_n\) play no part in what follows, one may assume without loss of generality that \(n\) deviations have been moved to the top of the table, so that the new table has, for example, the form

\[ \begin{array}{c|ccccc} & \eta_1 & \eta_2 & \ldots & \eta_n & 1\\ \hline \eta_{n+1} = & a_{n+1,1}^{(n)} & a_{n+1,2}^{(n)} & \ldots & a_{n+1,n}^{(n)} & a_{n+1}^{(n)}\\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots\\ \eta_m = & a_{m1}^{(n)} & a_{m2}^{(n)} & \ldots & a_{mn}^{(n)} & a_m^{(n)} \end{array} \tag{5} \]
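The note refers to \((^5)\) for Jordan eliminations and does not write out the step itself. The sketch below uses the standard exchange formulas for swapping one row variable with one column variable; the function name `jordan_step` and the list-of-rows layout (with the free terms stored as a last column that is never chosen as the resolving column) are assumptions of this illustration.

```python
from fractions import Fraction

def jordan_step(T, r, s):
    """Swap the variable of row r with the variable of column s.

    Each row of T holds the coefficients expressing one row variable through
    the column variables; the last column holds the free terms (column "1").
    The resolving element T[r][s] must be nonzero.
    """
    p = Fraction(T[r][s])
    if p == 0:
        raise ValueError("resolving element must be nonzero")
    new = []
    for i, row in enumerate(T):
        out = []
        for j, v in enumerate(row):
            v = Fraction(v)
            if i == r and j == s:
                out.append(1 / p)                 # resolving element
            elif i == r:
                out.append(-v / p)                # rest of the resolving row
            elif j == s:
                out.append(v / p)                 # rest of the resolving column
            else:
                out.append(v - Fraction(T[i][s]) * Fraction(T[r][j]) / p)
        new.append(out)
    return new

# Hypothetical table: eta_1 = 2*xi - 4, eta_2 = xi + 1 (columns: xi, 1).
T = [[2, -4], [1, 1]]
T1 = jordan_step(T, 0, 0)
# Row 0 now reads xi = (1/2)*eta_1 + 2; row 1 reads eta_2 = (1/2)*eta_1 + 3.
```

Applying the same step again with the same `r` and `s` restores the original table, which is a convenient sanity check on the formulas.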

At the vertex \(\eta_1=\eta_2=\ldots=\eta_n=0\) (considered in the \(n\)-dimensional space of the variables \(\xi_1,\ldots,\xi_n\)) we shall have
\[ z(x)=\sum_{i=n+1}^{m} |a_i^{(n)}|. \]
If all the free terms in table (5) are different from zero, then exactly \(n\) edges emanate from this vertex (we shall call them upper edges), each formed by the intersection of \(n-1\) of the planes \(\eta_1=0,\ldots,\eta_n=0\) under consideration; for example, the intersection of the planes \(\eta_2=0,\ldots,\eta_n=0\) forms an edge, every point of which is determined by the value of the parameter \(\varepsilon\) in the system of equations \(\eta_1=\varepsilon,\ \eta_2=\ldots=\eta_n=0\). If, however, some of the free terms in table (5) are equal to zero, then the corresponding planes obviously also pass through our vertex, so that, in addition to the \(n\) upper edges, other edges pass through it as well, each of which necessarily involves at least one of the planes \(\eta_i=0\) with \(i>n\) and \(a_i^{(n)}=0\). We shall call these latter edges lateral.

  3. We now characterize the upper edges along which the quantity \(z(x)\) decreases as we move away from our vertex. Consider, for example, the upper edge

\[ \eta_1=\varepsilon,\quad \eta_2=\ldots=\eta_n=0, \tag{6} \]

where the parameter \(\varepsilon\) will be regarded as sufficiently small. On this edge we obtain

\[ \begin{aligned} z(x) &= \sum_{i=1}^{m} |\eta_i(x)| = |\varepsilon|+\sum_{i=n+1}^{m} |a_{i1}^{(n)}\varepsilon+a_i^{(n)}| \\ &= |\varepsilon|+|\varepsilon|\sum' |a_{i1}^{(n)}|+\sum'' |a_i^{(n)}|+\varepsilon\sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)} \\ &= \sum'' |a_i^{(n)}|+|\varepsilon|\left\{1+\sum' |a_{i1}^{(n)}|+\operatorname{sign}\varepsilon\cdot \sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)}\right\}, \end{aligned} \]

where \(\sum'\) is extended over all \(i \ge n+1\) for which \(a_i^{(n)}=0\), and the sum \(\sum''\) over all \(i \ge n+1\) for which \(a_i^{(n)}\ne 0\). Hence it is seen that, when the condition

\[ \left|\sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)}\right|>1+\sum' |a_{i1}^{(n)}| \tag{7} \]

is fulfilled, and only in this case, the sign of \(\varepsilon\) can be chosen so that
\[ z(x)<\sum_{i=n+1}^{m} |a_i^{(n)}|; \]
namely, one must take
\[ \operatorname{sign}\varepsilon=-\operatorname{sign}\sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)}. \]

If condition (7) is fulfilled for edge (6), we shall say that this edge satisfies the condition of decrease (of the quantity \(z(x)\)).
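Condition (7) can be checked column by column from table (5). In the sketch below, `col` holds the coefficients \(a_{i1}^{(n)}\) of one column and `free` the free terms \(a_i^{(n)}\) for the rows \(i=n+1,\ldots,m\); the function name `decrease_sign` and the sample numbers are assumptions made for illustration.

```python
def sgn(v):
    """Sign of v as -1, 0 or +1."""
    return (v > 0) - (v < 0)

def decrease_sign(col, free):
    """Return the sign of epsilon (+1 or -1) if the edge satisfies (7), else 0."""
    # Sum'' : rows with a nonzero free term, weighted by the sign of that term.
    s2 = sum(sgn(b) * c for c, b in zip(col, free) if b != 0)
    # Sum'  : rows whose free term is zero.
    s1 = sum(abs(c) for c, b in zip(col, free) if b == 0)
    if abs(s2) > 1 + s1:
        return -sgn(s2)
    return 0

# Rows eta_2 = eta_1 + 3, eta_3 = eta_1 + 1, eta_4 = 2*eta_1 + 5.
print(decrease_sign([1, 1, 2], [3, 1, 5]))  # -1: z decreases for negative epsilon
print(decrease_sign([1, -1], [3, 1]))       # 0: the two terms cancel
print(decrease_sign([3, 1], [0, 2]))        # 0: the zero free term blocks the edge
```

In the first example \(z=9\) at \(\varepsilon=0\) and \(z=7.5\) at \(\varepsilon=-0.5\), in agreement with the sign returned.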

Let edge (6) satisfy the condition of decrease, so that, moving from our vertex along this edge in the required direction, i.e. increasing \(|\varepsilon|\) and setting
\[ \operatorname{sign}\varepsilon=-\operatorname{sign}\sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)}, \]
we shall decrease the quantity \(z(x)\). It is clear that \(|\varepsilon|\) can be increased until one of \(\eta_{n+1}(x),\ldots,\eta_m(x)\) with a free term different from zero becomes zero, i.e. until the edge (6) first meets one of the planes not passing through our vertex, forming with it a new vertex.

To find this new vertex, we proceed as follows: we inspect the ratios \(\dfrac{a_i^{(n)}}{a_{i1}^{(n)}}\) \((a_{i1}^{(n)} \ne 0;\ i=n+1,\ldots,m)\) and select from them those for which

\[ \operatorname{sign}\frac{a_i^{(n)}}{a_{i1}^{(n)}}=\operatorname{sign}\sum'' \operatorname{sign} a_i^{(n)}\cdot a_{i1}^{(n)} \]

(such ones will necessarily be found); from these latter we choose the one smallest in absolute value, take the corresponding coefficient \(a_{i1}^{(n)}\) as the resolving one, and perform one step of Jordan elimination.
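The choice of the resolving row described above can be sketched as follows. As before, `col` and `free` hold one column of coefficients and the free terms of the rows \(i=n+1,\ldots,m\); the name `resolving_row` and the returned 0-based position within these lists are assumptions of this illustration.

```python
from fractions import Fraction

def sgn(v):
    """Sign of v as -1, 0 or +1."""
    return (v > 0) - (v < 0)

def resolving_row(col, free):
    """Position of the row whose deviation vanishes first along the edge.

    Among the ratios free[i]/col[i] whose sign equals the sign of
    Sum'' sign(free[i]) * col[i], pick the one smallest in absolute value.
    Returns None when no ratio qualifies.
    """
    target = sgn(sum(sgn(b) * c for c, b in zip(col, free)))
    best = None
    for i, (c, b) in enumerate(zip(col, free)):
        if c == 0 or b == 0:
            continue  # ratio undefined, or plane already through the vertex
        ratio = Fraction(b) / Fraction(c)
        if sgn(ratio) == target and (best is None or abs(ratio) < best[0]):
            best = (abs(ratio), i)
    return None if best is None else best[1]

# Rows eta_2 = eta_1 + 3, eta_3 = eta_1 + 1, eta_4 = 2*eta_1 + 5.
print(resolving_row([1, 1, 2], [3, 1, 5]))  # 1: eta_3 vanishes first, at epsilon = -1
```

At the new vertex of this example \(z=|{-1}|+2+0+3=6\), compared with \(z=9\) at the starting vertex.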

We proceed in the same way from the new vertex, i.e. we find an upper edge satisfying the decrease condition, move along it, and reduce the value \(z(x)\). We continue this process of monotone decrease of \(z(x)\) until we obtain a table in which all upper edges fail to satisfy the decrease condition.

If all free terms in the table obtained are nonzero, i.e. if only \(n\) edges leave our vertex, then the process ends at this point, since there is no edge along which \(z(x)\) decreases, while between edges the function \(z(x)\) is linear; hence the value of \(z(x)\) obtained is the absolute minimum of this function.

If, however, among the free terms of the table obtained there are some equal to zero, so that more than \(n\) edges leave our vertex, then we shall say that degeneracy occurs. In this case it is necessary to check whether \(z(x)\) decreases along any of the lateral edges. To do this, by one step of Jordan elimination we transfer to the top of the table one of the \(\eta_i\) whose free term is zero; this will turn some lateral edges into upper ones, and we check for the decrease condition those upper edges of this new table which have not yet been tested. If any one of them satisfies the decrease condition, then we move along it, decreasing \(z(x)\), as indicated above. If all of them fail to satisfy the decrease condition, then, by one step of Jordan elimination, analogously to the preceding one, we turn another portion of lateral edges into upper ones, check the untested ones among them for the decrease condition, and so on. In the general degenerate case, when \(p\) free terms of table (5) are equal to zero, the maximum number of tests for the decrease condition does not exceed \(C_{n+p}^{p+1}\).
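The bound \(C_{n+p}^{p+1}\) can be read as the number of ways of choosing \(n-1\) planes, which together define a candidate edge, out of the \(n+p\) planes passing through the degenerate vertex, since \(C_{n+p}^{p+1}=C_{n+p}^{n-1}\). A short check of this count, with hypothetical values of \(n\) and \(p\):

```python
from itertools import combinations
from math import comb

n, p = 3, 2  # hypothetical: 3 unknowns, 2 zero free terms in table (5)

# Each candidate edge is the intersection of n - 1 of the n + p planes.
candidate_edges = list(combinations(range(n + p), n - 1))

print(len(candidate_edges))  # 10
print(comb(n + p, p + 1))    # 10
```

Not every such choice of planes need give a distinct edge, which is consistent with the count being stated as an upper bound.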

After a finite number of steps we obtain a vertex such that all edges issuing from it fail to satisfy the decrease condition. This vertex will be the desired one, and the sum of the moduli of all free terms of the table obtained will be the desired absolute minimum value of the function \(z(x)\).
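The descent of item 3 can be put together in a compact sketch. It is an illustration under stated assumptions, not the author's computational scheme: the matrix \(\|a_{ij}\|\) is assumed to have rank \(n\), the treatment of lateral edges in the degenerate case is omitted (the function only reports whether a zero free term remains), and all names (`l1_descent`, `jordan_step`, the row and column labels) are hypothetical.

```python
from fractions import Fraction

def sgn(v):
    return (v > 0) - (v < 0)

def jordan_step(T, r, s):
    """One Jordan elimination step: swap row variable r with column variable s."""
    p = T[r][s]
    new = []
    for i, row in enumerate(T):
        if i == r:
            new.append([1 / p if j == s else -v / p for j, v in enumerate(row)])
        else:
            new.append([v / p if j == s else v - row[s] * T[r][j] / p
                        for j, v in enumerate(row)])
    return new

def l1_descent(A, a):
    """Return (z, x, degenerate) for the system eta_i = sum_j A[i][j]*xi_j + a[i]."""
    m, n = len(A), len(A[0])
    T = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(A, a)]
    rows = [("eta", i) for i in range(m)]   # variable expressed by each row
    cols = [("xi", j) for j in range(n)]    # variable heading each column

    def pivot(r, s):
        nonlocal T
        T = jordan_step(T, r, s)
        rows[r], cols[s] = cols[s], rows[r]

    # Table (3) -> table (5): move n deviations to the top (rank n assumed).
    for s in range(n):
        r = next(i for i in range(m) if rows[i][0] == "eta" and T[i][s] != 0)
        pivot(r, s)

    while True:
        eta_rows = [i for i in range(m) if rows[i][0] == "eta"]
        for s in range(n):                   # look for an upper edge satisfying (7)
            s2 = sum(sgn(T[i][n]) * T[i][s] for i in eta_rows)
            s1 = sum(abs(T[i][s]) for i in eta_rows if T[i][n] == 0)
            if abs(s2) > 1 + s1:
                break
        else:
            break                            # no upper edge decreases z
        cand = [i for i in eta_rows
                if T[i][s] != 0 and sgn(T[i][n] / T[i][s]) == sgn(s2)]
        pivot(min(cand, key=lambda i: abs(T[i][n] / T[i][s])), s)

    x = [None] * n
    for i, (kind, j) in enumerate(rows):
        if kind == "xi":
            x[j] = T[i][n]                   # top variables are zero at the vertex
    z = sum(abs(T[i][n]) for i in eta_rows)
    return z, x, any(T[i][n] == 0 for i in eta_rows)

# Hypothetical system: xi_1 = 0, xi_2 = 0, xi_1 + xi_2 = 3, xi_1 - xi_2 = -1.
z_min, x_min, degenerate = l1_descent([[1, 0], [0, 1], [1, 1], [1, -1]],
                                      [0, 0, -3, 1])
print(z_min, x_min, degenerate)
```

For this system the sketch starts at the vertex \((0,0)\) with \(z=4\) and stops after one step at \((0,1)\) with \(z=3\); since \(|\xi_1|+|\xi_2|+|\xi_1+\xi_2-3|\geq 3\) by the triangle inequality, 3 is indeed the minimum. When the returned flag is true, the lateral edges would still have to be tested as described above.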

  4. The algorithm described can be supplemented as follows. Instead of beginning by eliminating the coordinates \(\xi_1,\ldots,\xi_n\) to obtain table (5), whose vertex \(\eta_1=\cdots=\eta_n=0\) is in general arbitrary and may give a value of \(z(x)\) differing considerably from its absolute minimum, one can start from a point \(x'(\xi_1',\ldots,\xi_n')\) obtained in some way, or from an arbitrary approximation to the point \(x^*(\xi_1^*,\ldots,\xi_n^*)\), and then decrease the quantity \(z(x)\) at each step (until table (5) is obtained).

We have

\[ z(x')=\sum_{i=1}^{m}|\eta_i(x')|=\sum_{i=1}^{m}\operatorname{sign}\eta_i(x')\cdot\eta_i(x'). \]

Consider the function

\[ \eta(x)=\sum_{i=1}^{m}\operatorname{sign}\eta_i(x')\cdot\eta_i(x) =\sum_{j=1}^{n}c_j\xi_j+c, \]

which is the equation of one of the plane faces of the surface

\[ z(x)=\sum_{i=1}^{m}|\eta_i(x)|. \]
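The coefficients \(c_j\) and \(c\) of \(\eta(x)\) follow directly from the signs of the deviations at \(x'\). The sketch below computes them and confirms that \(\eta(x')=z(x')\); the function name `face_through` and the sample system are assumptions made for illustration, and the sign of a vanishing deviation is taken to be zero.

```python
from fractions import Fraction

def sgn(v):
    return (v > 0) - (v < 0)

def face_through(A, a, x0):
    """Return (c, const) with eta(x) = sum_j c[j]*xi_j + const, built at x0."""
    eta0 = [sum(Fraction(v) * x for v, x in zip(row, x0)) + ai
            for row, ai in zip(A, a)]
    signs = [sgn(e) for e in eta0]
    n = len(A[0])
    c = [sum(s * row[j] for s, row in zip(signs, A)) for j in range(n)]
    const = sum(s * ai for s, ai in zip(signs, a))
    return c, const

# Hypothetical system: xi_1 = 0, xi_2 = 0, xi_1 + xi_2 = 3, xi_1 - xi_2 = -1.
A = [[1, 0], [0, 1], [1, 1], [1, -1]]
a = [0, 0, -3, 1]
x0 = [2, 0]                        # deviations at x0: 2, 0, -1, 3

c, const = face_through(A, a, x0)
print(c, const)                    # [1, -2] 4
print(c[0] * x0[0] + c[1] * x0[1] + const)  # 6, equal to z(x0) = 2 + 0 + 1 + 3
```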

In order to reduce \(z(x)\), we shall move from the point \(x'\) along some line toward the plane \(\eta(x)=0\), while not changing the signs of the deviations \(\eta_i(x)\), so that, if some of them become zero, we continue the motion in such a way as not to leave the corresponding planes. For this purpose we construct the table

\[ \begin{array}{cccccc} & \xi'_1 & \ldots & \xi'_n & 1 & \\ & \xi_1 & \ldots & \xi_n & 1 & \\ \hline \eta_1 = & a_{11} & \ldots & a_{1n} & a_1 & \vline\ \eta_1(x')\\ \ldots & \ldots & \ldots & \ldots & \ldots & \vline\ \ldots\\ \ldots & \ldots & \ldots & \ldots & \ldots & \vline\ \ldots\\ \eta_m = & a_{m1} & \ldots & a_{mn} & a_m & \vline\ \eta_m(x')\\ \eta = & c_1 & \ldots & c_n & c & \vline\ \eta(x') \end{array} \tag{8} \]

Suppose that \(s \geq 0\) deviations vanish at the point \(x'\), for example \(\eta_1(x')=\ldots=\eta_s(x')=0\). We shall call the point \(x'\) the \(s\)-th approximation and denote it by \(x_s\). By means of the corresponding number of steps of Jordan elimination, we move both \(\eta\) and all the vanished deviations to the top of the table (if this can be done). Suppose, for example, that the table obtained is

\[ \begin{array}{cccccccccc} & \eta(x_s) & \eta_1(x_s) & \ldots & \eta_s(x_s) & \xi'_{s+2} & \ldots & \xi'_n & 1 & \\ & \eta & \eta_1 & \ldots & \eta_s & \xi_{s+2} & \ldots & \xi_n & 1 & \\ \hline \eta_{s+1}= & a^{(s)}_{s+1,0} & a^{(s)}_{s+1,1} & \ldots & a^{(s)}_{s+1,s} & a^{(s)}_{s+1,s+2} & \ldots & a^{(s)}_{s+1,n} & a^{(s)}_{s+1} & \vline\ \eta_{s+1}(x_s)\\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \vline\ \ldots\\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \vline\ \ldots\\ \eta_m= & a^{(s)}_{m0} & a^{(s)}_{m1} & \ldots & a^{(s)}_{ms} & a^{(s)}_{m,s+2} & \ldots & a^{(s)}_{mn} & a^{(s)}_{m} & \vline\ \eta_m(x_s) \end{array} \tag{9} \]

We now move along the line \(\eta=\eta(x_s)-\varepsilon,\ \eta_1=\ldots=\eta_s=0,\ \xi_{s+2}=\xi'_{s+2},\ldots,\xi_n=\xi'_n\), until a new \(\eta_i\) (one or several) first vanishes. To find it, we compute the ratios \(\eta_i(x_s)/a^{(s)}_{i0}\) (\(a^{(s)}_{i0}\ne 0;\ i=s+1,\ldots,m\)) and choose among them the least positive one. The deviations for which it is attained we also try to move to the top of the table, and, if this can be done, we continue the process until we arrive at a situation where some vanished deviations cannot be moved to the top because either not a single \(\xi_j\) is left at the top, or all coefficients under the remaining \(\xi_j\) in the corresponding rows are equal to zero.

The second case reduces to the first: if the column of coefficients, for example under \(\xi_{s+2}\), consists entirely of zeros, then we delete it; if, however, this column contains a coefficient different from zero, then we set \(\eta=\eta(x_s),\ \eta_1=\ldots=\eta_s=0,\ \xi_{s+2}=\xi'_{s+2}-\varepsilon,\ \xi_{s+3}=\xi'_{s+3},\ldots,\xi_n=\xi'_n\) (i.e., we move along the level line of the function \(\eta(x)\)), compute the ratios \(\eta_i(x_s)/a^{(s)}_{i,s+2}\) (\(a^{(s)}_{i,s+2}\ne 0,\ i=s+1,\ldots,m\)), take the smallest of them in absolute value, and move the newly vanished deviation to the top (in place of \(\xi_{s+2}\)). We obtain a table in which not a single \(\xi_j\) remains at the top. The column under \(\eta\) necessarily contains a coefficient different from 0, and with the aid of one step of Jordan elimination the table is reduced to the form (5), after which we continue the process as in item 3.
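The step length along the line \(\eta=\eta(x_s)-\varepsilon\) is the least positive ratio \(\eta_i(x_s)/a^{(s)}_{i0}\), and several deviations may vanish at once. A minimal sketch of this selection follows; the name `step_to_next_zero` and the sample values are assumptions made for illustration.

```python
from fractions import Fraction

def step_to_next_zero(values, coeffs):
    """Least positive ratio values[i]/coeffs[i] and the rows attaining it.

    values[i] stands for eta_i(x_s) and coeffs[i] for the coefficient of eta
    in row i of table (9). Returns (None, []) if no ratio is positive.
    """
    ratios = [(Fraction(v) / Fraction(c), i)
              for i, (v, c) in enumerate(zip(values, coeffs)) if c != 0]
    positive = [(r, i) for r, i in ratios if r > 0]
    if not positive:
        return None, []
    eps = min(r for r, _ in positive)
    return eps, [i for r, i in positive if r == eps]

print(step_to_next_zero([3, -2, 4], [1, -1, 4]))  # ratios 3, 2, 1: row 2 vanishes first
print(step_to_next_zero([2, -2, 5], [1, -1, 0]))  # rows 0 and 1 vanish together
```

Returning all rows that attain the minimum matches the remark that one or several deviations may vanish simultaneously, each of which one then tries to move to the top of the table.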

  5. The preceding algorithm can be modified to solve a more general problem: among the solutions of the system of \(p\) linear inequalities

\[ \delta_k(x)=\delta_k=\sum_{j=1}^{n} b_{kj}\xi_j+b_k\geq 0 \quad (k=1,\ldots,p) \]

find one minimizing

\[ \sum_{i=1}^{m}|\eta_i(x)| \]

(the sum of the moduli of all deviations of the system (1)).
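The note does not spell out this modification. For orientation only, and not as the author's method, the constrained problem can be stated as a linear program in the standard way by introducing auxiliary variables \(t_1,\ldots,t_m\) (an assumption of this sketch) that bound the moduli of the deviations:

\[ \min_{x,\,t}\ \sum_{i=1}^{m} t_i \quad \text{subject to} \quad -t_i\leq \eta_i(x)\leq t_i \ \ (i=1,\ldots,m), \qquad \delta_k(x)\geq 0 \ \ (k=1,\ldots,p). \]

At an optimum \(t_i=|\eta_i(x)|\), so the two problems have the same minimum value.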

Kiev Technological Institute of the Food Industry

Received 10 XI 1961

CITED LITERATURE

  1. N. Akhiezer, M. Krein, On Certain Questions of the Theory of Moments, Kharkov, 1938.
  2. D. B. Yudin, E. G. Golshtein, Problems and Methods of Linear Programming, Moscow, 1961.
  3. A. Charnes, C. E. Lemke, Naval Res. Logist., 1, 4, 301 (1954).
  4. A. I. Boltyanskii, Reports of the 16th Scientific Conference of the Professional-Teaching Staff of the Leningrad Civil-Engineering Institute, Leningrad, 1958, p. 528.
  5. E. Stiefel, Numer. Math., 2, 1 (1960).
  6. E. G. Golshtein, DAN, 133, No. 3 (1960).
  7. R. P. Chakina, Proceedings of the Ural Electromechanical Institute of Transport Engineers, 2 (1959).
  8. R. P. Chakina, Collection: Investigations on Contemporary Problems of Constructive Function Theory, Moscow, 1961.
