MATHEMATICS
S. I. ZUKHOVITSKII
ON A NEW NUMERICAL SCHEME OF AN ALGORITHM FOR CHEBYSHEV APPROXIMATION OF AN INCONSISTENT SYSTEM OF LINEAR EQUATIONS AND A SYSTEM OF LINEAR INEQUALITIES
(Presented by Academician N. N. Bogolyubov on 13 III 1961)
- In papers \((^{1-3})\) a finite and monotone algorithm was constructed for the Chebyshev approximation of an inconsistent system of linear equations
\[ \eta_i \equiv \eta_i(x) \equiv a_{i1}\xi_1 + a_{i2}\xi_2 + \cdots + a_{in}\xi_n + a_i = 0 \quad (i=1,\ldots,m), \tag{1} \]
i.e., for finding a point \(x^*(\xi_1^*,\ldots,\xi_n^*)\) such that
\[ \max_{1\le i\le m} |\eta_i(x^*)| = \inf_x \max_{1\le i\le m} |\eta_i(x)| = L, \]
and in the case of consistency of system (1) this algorithm leads to one of its solutions.
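To make the quantity being minimized concrete, here is a minimal Python sketch that evaluates \(\max_i|\eta_i(x)|\) at a given point. The function name `max_deviation` and the data layout (a list of coefficient rows `A` plus a list of free terms `a`) are assumptions made for this illustration only.

```python
def max_deviation(A, a, x):
    """Return max_i |eta_i(x)|, where eta_i(x) = sum_k A[i][k]*x[k] + a[i]."""
    return max(abs(sum(c * xk for c, xk in zip(row, x)) + free)
               for row, free in zip(A, a))


# Inconsistent system in one unknown: xi = 0 and xi - 2 = 0.
A = [[1], [1]]
a = [0, -2]
print(max_deviation(A, a, [0]))  # 2
print(max_deviation(A, a, [1]))  # 1
```

For this small system the deviations are \(|\xi|\) and \(|\xi-2|\), so the Chebyshev point is \(\xi^*=1\) and \(L=1\).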
In the present paper, while fully preserving the previous geometric scheme of the algorithm, we shall show how to use Jordan eliminations \((^4)\), which form the basis of the simplex method, for a substantial simplification of the numerical scheme of this algorithm; the stimulus for writing this work was the important paper of E. Stiefel \((^4)\).
- First we take an arbitrary point \(x'(\xi_1',\ldots,\xi_n')\) and write system (1), the coordinates of the point \(x'\), and the deviations \(\eta_1(x'),\ldots,\eta_m(x')\) in the form of the following table:
\[
\begin{array}{c|ccccc|c}
 & \xi_1' & \xi_2' & \cdots & \xi_n' & 1 & \\
 & \xi_1 & \xi_2 & \cdots & \xi_n & 1 & \\ \hline
\eta_1 = & a_{11} & a_{12} & \cdots & a_{1n} & a_1 & \eta_1(x')\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
\eta_m = & a_{m1} & a_{m2} & \cdots & a_{mn} & a_m & \eta_m(x')
\end{array} \tag{2}
\]
Let \(|\eta_r(x')| > |\eta_i(x')|\) \((i \ne r)\), and suppose that in the \(r\)-th row \(a_{rs}\) is the coefficient greatest in absolute value. Then we interchange the roles of the independent variable \(\xi_s\) and the dependent variable \(\eta_r\), solving the maximal equation, i.e., the equation
\[ \eta_r = a_{r1}\xi_1 + a_{r2}\xi_2 + \cdots + a_{rs}\xi_s + \cdots + a_{rn}\xi_n + a_r, \]
which has the maximal* deviation at the point \(x'\), with respect to \(\xi_s\), and substituting the result into all the remaining equations. For this we perform on table (2) one step of Jordan elimination with the element \(a_{rs}\) as the pivot, i.e., following \((^4)\), we form a new table
\[
\begin{array}{c|ccccc|c}
 & \xi_1 & \cdots & \eta_r & \cdots & \xi_n & 1\\ \hline
\eta_1 = & c_{11} & \cdots & a_{1s} & \cdots & c_{1n} & c_1\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
\xi_s = & -a_{r1} & \cdots & 1 & \cdots & -a_{rn} & -a_r\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
\eta_m = & c_{m1} & \cdots & a_{ms} & \cdots & c_{mn} & c_m
\end{array} \; : a_{rs} \tag{3}
\]
in which \(\xi_s\) is interchanged with \(\eta_r\) and the elements are computed according to the following rules: 1) the pivot element is replaced by unity; 2) the remaining elements of the pivot row (the \(r\)-th) only change sign; 3) the remaining elements of the pivot column (the \(s\)-th) are left unchanged; 4) every element \(c_{ik}\) not belonging to the pivot row or column is computed by the formula \(c_{ik}=a_{ik}a_{rs}-a_{rk}a_{is}\); 5) all elements are divided by the pivot element \(a_{rs}\) (this is the meaning of the notation \(: a_{rs}\) beside the table).
* Here, as in what follows, by the maximal deviation we shall mean the deviation maximal in absolute value.
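The five rules above can be written as a short Python sketch. The function name `jordan_step` and the list-of-rows layout (one row per dependent variable, free term last) are assumptions made for this illustration, not notation from the paper; exact fractions are used so that the result can be checked by hand.

```python
from fractions import Fraction


def jordan_step(table, r, s):
    """Perform one Jordan elimination step with pivot table[r][s].

    Row i of `table` holds the coefficients of one dependent variable,
    with the free term as the last entry. The result is a new table in
    which the variable heading column s and the variable of row r have
    exchanged roles. (Illustrative helper, not taken from the paper.)
    """
    pivot = Fraction(table[r][s])
    if pivot == 0:
        raise ValueError("the pivot element must be nonzero")
    new_table = []
    for i, row in enumerate(table):
        new_row = []
        for k, a_ik in enumerate(row):
            if i == r and k == s:
                value = Fraction(1)          # rule 1: pivot becomes 1
            elif i == r:
                value = -Fraction(a_ik)      # rule 2: pivot row changes sign
            elif k == s:
                value = Fraction(a_ik)       # rule 3: pivot column unchanged
            else:
                # rule 4: c_ik = a_ik * a_rs - a_rk * a_is
                value = a_ik * pivot - Fraction(table[r][k]) * table[i][s]
            new_row.append(value / pivot)    # rule 5: divide by the pivot
        new_table.append(new_row)
    return new_table


# eta_1 = 2*xi_1 + xi_2 + 3, eta_2 = xi_1 - xi_2, eta_3 = xi_1 + 2*xi_2 - 4
table = [[2, 1, 3], [1, -1, 0], [1, 2, -4]]
swapped = jordan_step(table, 0, 0)   # exchange xi_1 and eta_1
```

In this example the first row of `swapped` encodes \(\xi_1=\tfrac12\eta_1-\tfrac12\xi_2-\tfrac32\), which is the first equation solved for \(\xi_1\). Applying the same step again at the same position restores the original table.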
- In the general case, suppose
\[ |\eta_{r_1}(x')|=\ldots=|\eta_{r_p}(x')|>|\eta_i(x')|\quad (i\ne r_1,\ldots,r_p), \]
and
\[ \delta_{r_1}\eta_{r_1}(x')=\ldots=\delta_{r_p}\eta_{r_p}(x'),\qquad \delta_{r_1}=\pm 1,\ldots,\delta_{r_p}=\pm 1. \]
We denote the point \(x'\) by \(x_p\) and call it the \(p\)-th approximation. We successively perform Jordan elimination steps, taking as pivot rows those among the rows \(r_1,\ldots,r_p\) that still have nonzero coefficients of the \(\xi_i\) remaining at the top of the table after the preceding steps; each time the pivot is the coefficient of greatest absolute value among these coefficients of the given row. Suppose that this has been done for all \(p\) rows. For convenience of notation, suppose also that \(r_1=1,\ldots,r_p=p\) and that in each step the interchanged variables have the same index (\(\eta_k\) is interchanged with \(\xi_k\)), so that the new tableau has the form
\[
\begin{array}{c|ccccc|c}
 & \eta_1(x_p)\ \ldots\ \eta_p(x_p) & \xi'_{p+1} & \ldots & \xi'_n & 1 & \\
 & \eta_1\ \ldots\ \eta_p & \xi_{p+1} & \ldots & \xi_n & 1 & \\ \hline
\eta_{p+1}= & a^{(p)}_{p+1,1}\ \ldots\ a^{(p)}_{p+1,p} & a^{(p)}_{p+1,p+1} & \ldots & a^{(p)}_{p+1,n} & a^{(p)}_{p+1} & \eta_{p+1}(x_p) \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
\eta_m= & a^{(p)}_{m1}\ \ldots\ a^{(p)}_{mp} & a^{(p)}_{m,p+1} & \ldots & a^{(p)}_{mn} & a^{(p)}_m & \eta_m(x_p)
\end{array} \tag{4}
\]
The expressions for the substituted \(\xi_i\), both here and in what follows, are written out separately. They will be needed only at the end, in order to express the obtained solution in the old coordinates.
From the point \(x_p\) we move along the straight line formed by the intersection of the \((n-p+1)\)-dimensional bisector plane
\[ \delta_1\eta_1=\delta_2\eta_2=\ldots=\delta_p\eta_p \]
and the \(p\)-dimensional coordinate plane
\[ \xi_{p+1}=\xi'_{p+1},\ldots,\xi_n=\xi'_n, \]
in the direction of decreasing maximal deviations
\[ |\eta_1(x)|=\ldots=|\eta_p(x)| \]
until, while preserving equality among themselves and decreasing, they first become equal to the deviation from one more (or several more) of the remaining planes. For this purpose we put
\[ \eta_1=\delta_1\eta,\ldots,\eta_p=\delta_p\eta \]
and solve \(2(m-p)\) equations with one unknown \(\eta\),
\[ \pm \eta = a^{(p)}_{i1}\delta_1\eta+\ldots+a^{(p)}_{ip}\delta_p\eta + \left[ \eta_i(x_p)-a^{(p)}_{i1}\eta_1(x_p)-\ldots-a^{(p)}_{ip}\eta_p(x_p) \right], \]
\[ (i=p+1,p+2,\ldots,m). \tag{5} \]
From the solutions we choose the largest positive
\[ \eta=\eta^{(p+1)} \]
that is smaller than
\[ \eta^{(p)}=|\eta_1(x_p)|. \]
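The selection of the new level can be sketched as follows. For each remaining row, equations (5) reduce to \(\pm\eta = S_i\eta + C_i\) with \(S_i=\sum_k a^{(p)}_{ik}\delta_k\) and \(C_i\) the bracketed constant, so that \(\eta=C_i/(\pm 1-S_i)\). The function name `next_level`, its argument layout, and the 0-based row indices are assumptions made for this illustration.

```python
from fractions import Fraction


def next_level(coeffs, deltas, eta_top, eta_rest):
    """Solve equations (5) and pick the largest positive eta below eta^(p).

    coeffs[i][k] : coefficient of eta_k (k < p) in remaining row i
    deltas[k]    : +1 or -1, the sign delta_k
    eta_top[k]   : eta_k(x_p) for the p deviations at the top of the table
    eta_rest[i]  : eta_i(x_p) for the remaining rows
    Returns (eta, [(row, sign), ...]) or (None, []) if no solution qualifies.
    """
    current = abs(eta_top[0])              # eta^(p) = |eta_1(x_p)|
    best, attained = None, []
    for i, row in enumerate(coeffs):
        slope = sum(a * d for a, d in zip(row, deltas))
        const = eta_rest[i] - sum(a * e for a, e in zip(row, eta_top))
        for sign in (1, -1):
            denom = sign - slope           # sign*eta = slope*eta + const
            if denom == 0:
                continue                   # this equation has no unique solution
            eta = const / denom
            if not 0 < eta < current:
                continue
            if best is None or eta > best:
                best, attained = eta, [(i, sign)]
            elif eta == best:
                attained.append((i, sign))
    return best, attained


# One deviation at the top (p = 1) with eta_1(x_p) = 4 and delta_1 = +1.
coeffs = [[Fraction(1, 2)], [Fraction(-1)]]
level, rows = next_level(coeffs, [1], [4], [1, -3])
```

Here the first remaining row gives \(\eta=2/3\) (with the minus sign) and the second gives \(\eta=1/2\), so the sketch returns the level \(2/3\), attained by the first row.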
In the general case the value \(\eta^{(p+1)}\) is attained by several of the equations (5), each with a definite sign. Again for convenience of notation, suppose that these are the first \(t\) of them, \(i=p+1,\ldots,p+t\). For the new, \((p+t)\)-th, approximation \(x_{p+t}\) we then have
\[ |\eta_1(x_{p+t})|=\ldots=|\eta_{p+t}(x_{p+t})|>|\eta_i(x_{p+t})|\quad (i>p+t), \]
\[ \delta_1\eta_1(x_{p+t})=\ldots=\delta_{p+t}\eta_{p+t}(x_{p+t}). \]
We continue the process, moving the maximal deviations
\[ \eta_{p+1}(x_{p+t}),\ldots,\eta_{p+t}(x_{p+t}) \]
to the top of the tableau, until we arrive at rows with maximal deviation that can no longer serve as pivot rows for the corresponding Jordan elimination steps, either because all the coefficients of the remaining \(\xi_i\) in these rows are equal to zero or because no \(\xi_i\) remains at the top of the tableau. Suppose, for example, that for the approximation \(x_q\) the first \(q\) deviations have turned out to be maximal and that only the first \(r\) of them, where \(r<q\), could be moved to the top of the tableau, so that a tableau of the following form has been obtained:
\[
\begin{array}{c|ccc|c}
 & \eta_1(x_q)\ \cdots\ \eta_r(x_q) & \xi'_{r+1}\ \cdots\ \xi'_n & 1 & \\
 & \eta_1\ \cdots\ \eta_r & \xi_{r+1}\ \cdots\ \xi_n & 1 & \\ \hline
\eta_{r+1}= & a^{(q)}_{r+1,1}\ \cdots\ a^{(q)}_{r+1,r} & 0\ \cdots\ 0 & a^{(q)}_{r+1} & \eta_{r+1}(x_q) \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
\eta_q= & a^{(q)}_{q1}\ \cdots\ a^{(q)}_{qr} & 0\ \cdots\ 0 & a^{(q)}_{q} & \eta_q(x_q) \\
\eta_{q+1}= & a^{(q)}_{q+1,1}\ \cdots\ a^{(q)}_{q+1,r} & a^{(q)}_{q+1,r+1}\ \cdots\ a^{(q)}_{q+1,n} & a^{(q)}_{q+1} & \eta_{q+1}(x_q) \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
\eta_m= & a^{(q)}_{m1}\ \cdots\ a^{(q)}_{mr} & a^{(q)}_{m,r+1}\ \cdots\ a^{(q)}_{mn} & a^{(q)}_{m} & \eta_m(x_q)
\end{array} \tag{6}
\]
Then, if all the free terms \(a^{(q)}_{r+1},\ldots,a^{(q)}_{q}\) are equal to zero, we continue the process as we did after obtaining the point \(x_p\), except that in the equations corresponding to (5) one must put \(i=q+1,\ldots,m\). If, however, at least one of these free terms is different from zero, then this is a sign that the point \(x_q\) is stationary \(({}^3)\), i.e., that it is impossible, moving from it, to decrease all the maximal deviations while preserving their equality among themselves.
- Let the point \(x_q\) be stationary. By an edge of dimension \(n-r\) we shall mean a linear manifold satisfying \(r\) equations, which we obtain by setting equal to zero any \(r\) linearly independent ones among the maximal \(\eta_1,\ldots,\eta_q\). For \(r=n\) the edge turns into a point. The characteristic of an edge \(({}^3)\) will be the sum of the number of maximal planes passing through the edge and the number of maximal planes separating the edge from the stationary point \(x_q\).
For further application of the algorithm we find the characteristic of the “upper” edge \(\eta_1=0,\ldots,\eta_r=0\), formed by the maximal planes that it was possible to move to the top of the table. It is obviously equal to the number \(r\) plus the number of those among the free terms \(a^{(q)}_{r+1},\ldots,a^{(q)}_{q}\) that are equal to zero (this sum is equal to the number of maximal planes passing through our edge), plus the number of those among the free terms \(a^{(q)}_{r+1},\ldots,a^{(q)}_{q}\) whose signs are opposite to the signs of the corresponding deviations \(\eta_{r+1}(x_q),\ldots,\eta_q(x_q)\) (this is the number of maximal planes separating our edge from the point \(x_q\)).
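The counting rule just described can be sketched in a few lines of Python. The function name `upper_edge_characteristic` and its arguments are assumptions made for this illustration: `free_terms` stands for \(a^{(q)}_{r+1},\ldots,a^{(q)}_{q}\) and `deviations` for \(\eta_{r+1}(x_q),\ldots,\eta_q(x_q)\).

```python
def upper_edge_characteristic(r, free_terms, deviations):
    """Characteristic of the upper edge eta_1 = ... = eta_r = 0.

    Counts the r planes at the top of the table, the maximal planes with
    zero free term (they also pass through the edge), and the maximal
    planes whose free term has the sign opposite to the deviation at x_q
    (they separate the edge from x_q).
    """
    through = r + sum(1 for a in free_terms if a == 0)
    separating = sum(1 for a, d in zip(free_terms, deviations) if a * d < 0)
    return through + separating


# r = 2 planes at the top, q = 5 maximal deviations in all.
print(upper_edge_characteristic(2, [0, -3, 5], [7, 7, -7]))  # 5
print(upper_edge_characteristic(2, [0, -3, 5], [7, 7, 7]))   # 4
```

In the first call no free term has the sign of its deviation, so the characteristic equals \(q=5\); in the second call the last free term agrees in sign with its deviation and the characteristic drops to 4.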
If among the free terms \(a^{(q)}_{r+1},\ldots,a^{(q)}_{q}\) there is not one with sign equal to the sign of the corresponding deviation, so that the characteristic of the upper edge is equal to \(q\) (i.e. is maximal), then we move toward this edge in the following way. We set \(\eta_1=\delta_1\eta,\ldots,\eta_r=\delta_r\eta\) and solve the equations
\[ \pm\eta = a^{(q)}_{i1}\delta_1\eta+\cdots+a^{(q)}_{ir}\delta_r\eta+ \left[ \eta_i(x_q)-a^{(q)}_{i1}\eta_1(x_q)-\cdots-a^{(q)}_{ir}\eta_r(x_q) \right] \qquad (i=q+1,\ldots,m) \]
and also similar equations formed from those maximal rows whose free terms are different from zero. From the solutions we choose the largest positive \(\eta=\eta^{(q+1)}\), smaller than \(\eta^{(q)}=|\eta_1(x_q)|\). It is attained by one or several equations, and the corresponding deviations, together with the \(r\) deviations located at the top of the table and those among \(\eta_{r+1}(x_q),\ldots,\eta_q(x_q)\) which correspond to free terms equal to zero, form the set of maximal deviations at the new stationary point; moreover, they are smaller than the maximal deviations at the point \(x_q\).
If, however, the characteristic of the upper edge is less than \(q\), then we compute the characteristic of the edge formed by \(r-1\) of the maximal planes that have appeared at the top of the table and by one of the maximal planes \(\eta_{r+1}=0,\ldots,\eta_q=0\) with a free term different from zero, for example, by the plane \(\eta_{r+1}=0\). For this we perform one step of Jordan elimination with a pivot element from the row \(a^{(q)}_{r+1,1},\ldots,a^{(q)}_{r+1,r}\), after which the edge under consideration becomes an upper one, and its characteristic is easily computed as indicated above. We note that this step of Jordan elimination need not be carried out on the whole tableau, but only on the first \(q-r\) free terms; moreover, only their signs need to be determined. In this way we successively compute the characteristics of the edges formed by the maximal planes \(\eta_1=0,\ldots,\eta_q=0\), until we find an edge with maximal characteristic \(q\) or until we are convinced that every edge has characteristic less than \(q\). In the first case we complete the Jordan elimination step for the remaining part of the tableau and move to this edge with characteristic equal to \(q\), i.e., we continue the process as described above. In the second case it can be shown \({}^{(3)}\) that the process is finished and the point \(x_q\) is the one sought. It is clear that after a finite number of steps we necessarily arrive at the required point.
- The algorithm set forth applies without change to the problem of finding a Chebyshev point of a system of linear inequalities
\[ \eta_i(x)\equiv a_{i1}\xi_1+\ldots+a_{in}\xi_n+a_i\leqslant 0 \quad (i=1,\ldots,m) \]
(cf. \({}^{(5)}\)), i.e., a point \(x^*(\xi_1^*,\ldots,\xi_n^*)\) at which
\[ L=\inf_x \max_{1\leqslant i\leqslant m}\eta_i(x) \]
is attained. It is only necessary to consider the deviations themselves instead of their absolute values. If \(L\leqslant 0\), then the system is consistent (solvable), \(|L|\) is the stability (see \({}^{(7)}\)) of its solvability, and the point \(x^*\) at which the value \(L\) is attained (for \(L\ne-\infty\)) is a Chebyshev solution of the system. If \(L>0\), the system is inconsistent, \(L\) is its minimal deviation, and the point \(x^*\) is its Chebyshev approximation. If some solution of the system is required, not necessarily \(x^*\), then we stop applying the algorithm as soon as one has been obtained. Another algorithm for finding \(x^*\) is indicated in \({}^{(7)}\).
Remark 1. The first steps of the algorithm described (up to the obtaining of a stationary point) differ somewhat from its original variant (see, for example, \({}^{(3)}\)) and are close to the first steps of the new algorithm of Goldstein and Cheney \({}^{(5)}\).
Remark 2. Suppose, as is usually the case, that for the stationary point \(x_q\) in tableau (6) only one maximal deviation,
\[ \eta_q=a_{q1}^{(q)}\eta_1+\ldots+a_{q,q-1}^{(q)}\eta_{q-1}+a_q^{(q)}, \]
has not been shifted upward. Then the rule for finding an edge with maximal characteristic is considerably simplified: the edge
\[ \eta_1=\ldots=\eta_{k-1}=\eta_{k+1}=\ldots=\eta_q=0 \quad (k=1,\ldots,q-1) \]
has maximal characteristic if and only if \(a_{qk}^{(q)}\ne 0\) and
\[ \operatorname{sign}\eta_k(x_q)=\operatorname{sign}\bigl(a_q^{(q)}/a_{qk}^{(q)}\bigr). \]
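The sign test of Remark 2 can be sketched as below. The names `sign` and `edges_with_maximal_characteristic` are assumptions made for this illustration, and the returned indices are 0-based (index `k` corresponds to the paper's \(k+1\)). The quotient is avoided by using \(\operatorname{sign}(a/b)=\operatorname{sign}a\cdot\operatorname{sign}b\) for \(b\ne 0\).

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def edges_with_maximal_characteristic(row, free_term, eta_top):
    """Indices k for which replacing eta_k by eta_q gives a maximal edge.

    row       : coefficients a_q1, ..., a_q,q-1 of the row not shifted upward
    free_term : its free term a_q
    eta_top   : deviations eta_1(x_q), ..., eta_{q-1}(x_q) at the top
    """
    return [k for k, (a_qk, eta_k) in enumerate(zip(row, eta_top))
            if a_qk != 0 and sign(eta_k) == sign(free_term) * sign(a_qk)]


print(edges_with_maximal_characteristic([2, -1, 0], 3, [5, 5, -5]))   # [0]
print(edges_with_maximal_characteristic([2, -1, 0], 3, [-5, -5, 5]))  # [1]
```

An empty list corresponds to the second case of the general procedure, in which no edge has maximal characteristic.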
Remark 3. Suppose that, in the case of system (1), \(x_q\) is a stationary point (tableau (6)). Then \(\overline L=|\eta_1(x_q)|\) gives an upper estimate for the minimal deviation \(L\) of system (1). A lower estimate for \(L\) is easily obtained by taking the minimal deviation \(\underline L\) of the subsystem composed of the maximal equations \(\eta_1=0,\ldots,\eta_r=0\) that were shifted upward and of some maximal equation not shifted upward, for example, \(\eta_{r+1}=0\). We have
\[ \eta_{r+1}=a_{r+1,1}^{(q)}\eta_1+\ldots+a_{r+1,r}^{(q)}\eta_r+a_{r+1}^{(q)}, \]
and by the Vallée Poussin formula (\({}^{(6)}\), see also \({}^{(4)}\))
\[ \underline L= \frac{\left|a_{r+1}^{(q)}\right|} {1+\left|a_{r+1,1}^{(q)}\right|+\ldots+\left|a_{r+1,r}^{(q)}\right|}. \]
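As a small numerical illustration, the lower estimate can be computed directly from one row of tableau (6). The function name `vallee_poussin_lower_bound` and the sample numbers are assumptions made for this illustration.

```python
def vallee_poussin_lower_bound(row, free_term):
    """Lower estimate |a| / (1 + |a_1| + ... + |a_r|) for the minimal deviation.

    row       : coefficients of eta_1, ..., eta_r in the chosen maximal row
    free_term : free term of that row
    """
    return abs(free_term) / (1 + sum(abs(a) for a in row))


# Row eta_3 = 2*eta_1 - eta_2 + 8 gives the estimate 8 / (1 + 2 + 1).
print(vallee_poussin_lower_bound([2, -1], 8))  # 2.0
```

Together with \(\overline L=|\eta_1(x_q)|\) this brackets \(L\), which gives a simple check on how far a stationary point can be from the Chebyshev point in terms of deviation.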
Kyiv Technological Institute of the Food Industry
Received 12 III 1961
CITED LITERATURE
\({}^{1}\) S. I. Zukhovitskii, Some Questions in the Theory of Chebyshev Approximations, Dissertation, Kiev, 1950.
\({}^{2}\) S. I. Zukhovitskii, DAN, 79, No. 4 (1951).
\({}^{3}\) S. I. Zukhovitskii, Matem. sborn., 33 (75), issue 2 (1953).
\({}^{4}\) E. Stiefel, Numer. Math., 2 (1960).
\({}^{5}\) A. A. Goldstein, F. Cheney, Pacific J. Math., 3, No. 8 (1958).
\({}^{6}\) Ch. J. de la Vallée Poussin, Ann. Soc. Sci. Bruxelles, 2-e partie, mémoires 35 (1911).
\({}^{7}\) G. Sh. Rubinshtein, DAN, 100, No. 4 (1955).