MATHEMATICS
S. I. ZUKHOVITSKII, R. A. POLYAK, M. E. PRIMAK
AN ALGORITHM FOR SOLVING THE PROBLEM OF CONVEX CHEBYSHEV APPROXIMATION
(Presented by Academician N. N. Bogolyubov on 18 I 1963)
1. Suppose that an inconsistent system of linear complex equations is given
\[ \Delta_j(z) \equiv \alpha_{j1}z_1+\alpha_{j2}z_2+\cdots+\alpha_{jn}z_n+\alpha_j=0 \qquad (j=1,\ldots,m), \tag{1} \]
where \(z \equiv (z_1,\ldots,z_n)\); \(z_k=x_k+iy_k\); \(\alpha_{jk}=a_{jk}+ib_{jk}\); \(\alpha_j=a_j+ib_j\) \((k=1,\ldots,n;\ j=1,\ldots,m)\).
The problem of Chebyshev approximation of the system (1) consists in finding a Chebyshev point of this system, i.e., a point \(z^* \equiv (z_1^*,\ldots,z_n^*)\) for which
\[ \max_{1\le j\le m} |\Delta_j(z^*)| = \min_z \max_{1\le j\le m} |\Delta_j(z)|. \tag{2} \]
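The objective in (2) can be evaluated directly. The following is a minimal sketch in plain Python; the function names and the three-equation test system are illustrative assumptions, not part of the original paper.

```python
def deviations(alpha, const, z):
    """Return the list of Delta_j(z) = sum_k alpha[j][k] * z[k] + const[j]."""
    return [sum(a * zk for a, zk in zip(row, z)) + c
            for row, c in zip(alpha, const)]


def max_deviation(alpha, const, z):
    """Return u(z) = max_j |Delta_j(z)|, the quantity minimized in (2)."""
    return max(abs(d) for d in deviations(alpha, const, z))


# Hypothetical inconsistent system in one unknown: z - 1 = 0, z + 1 = 0, z - i = 0.
alpha = [[1 + 0j], [1 + 0j], [1 + 0j]]
const = [-1 + 0j, 1 + 0j, -1j]

u_at_zero = max_deviation(alpha, const, [0j])      # all three deviations have modulus 1
u_at_one = max_deviation(alpha, const, [1 + 0j])   # the largest deviation is |1 + 1| = 2
```

For this particular system the points \(1\), \(-1\), \(i\) lie on the unit circle with \(1\) and \(-1\) diametrically opposite, so \(z=0\) should be the Chebyshev point with \(u=1\).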
The function \(u=\max_j|\Delta_j(z)|\) is convex and piecewise smooth, so that problem (1)—(2) is a particular case of the problem of finding the minimum of an arbitrary convex piecewise-smooth function \(f(x)\equiv f(x_1,\ldots,x_s)\):
\[ \min_x f(x) \equiv \min_{x_1,\ldots,x_s} f(x_1,\ldots,x_s), \tag{3} \]
for example, one formed with the aid of a system of convex smooth functions \(\varphi_1(x),\ldots,\varphi_m(x)\) in the following way:
\[ f(x)=\max_j \varphi_j(x), \]
i.e., the problem of convex Chebyshev approximation.
In \((^1)\) an algorithm was outlined for solving problem (1)—(2). Another algorithm for solving this problem, differing from ours also in the method of minimizing \(u=\max_j|\Delta_j(z)|\), is given in \((^2)\). The case of a smooth function \(f(x)\) was considered in \((^3)\). In the present paper a further development and refinement of the algorithm \((^1)\) is given, which proved applicable also to the solution of the general problem (3), and a proof of its convergence is given. For the sake of concreteness we shall present the exposition as applied to problem (1)—(2).
2. As the initial approximation to the point \(z^*\) take an arbitrary point \(z^{(0)}\) and an arbitrary sufficiently small \(\delta_0>0\). If
\[ |\Delta_{j_0}(z^{(0)})|^2>|\Delta_j(z^{(0)})|^2+\delta_0 \qquad (j\ne j_0), \]
i.e., all nonmaximal deviations \(|\Delta_j(z^{(0)})|^2\) \((j\ne j_0)\) differ from the maximal one \(|\Delta_{j_0}(z^{(0)})|^2\) by more than \(\delta_0\), then as the descent direction \(\xi^{(0)}\equiv(\xi_1^{(0)},\ldots,\xi_n^{(0)})\) we take the direction of the antigradient of the function \(|\Delta_{j_0}(z)|^2\) at the point \(z^{(0)}\), which, as usual, we determine from the condition that the derivative
\[ \frac{d}{dt}\left[|\Delta_{j_0}(z^{(0)}+t\xi)|^2\right]_{t=0} = 2\operatorname{Re}\left[ \overline{\Delta_{j_0}(z^{(0)})} \sum_{k=1}^{n}\alpha_{j_0k}\xi_k \right] \]
be the greatest in absolute value and negative for
\[ \|\xi\|=\left(\sum_{k=1}^{n}|\xi_k|^2\right)^{1/2}=C, \]
where \(C\) is an arbitrary positive constant.
We obtain
\[ \xi_k^{(0)}=\lambda \Delta_{j_0}(z^{(0)})\,\overline{\alpha}_{j_0 k}\qquad (k=1,\ldots,n), \tag{4} \]
where \(\lambda<0\) is an arbitrary multiplier.
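Formula (4) can be checked numerically: substituting it into the derivative gives \(2\lambda\,|\Delta_{j_0}(z^{(0)})|^2\sum_k|\alpha_{j_0k}|^2<0\). The sketch below, with hypothetical function names and made-up coefficients, computes the direction and the corresponding derivative.

```python
def antigradient_direction(alpha_row, delta_value, lam=-1.0):
    """Direction (4): xi_k = lam * Delta_{j0}(z0) * conj(alpha_{j0,k}), lam < 0."""
    return [lam * delta_value * a.conjugate() for a in alpha_row]


def directional_derivative(alpha_row, delta_value, xi):
    """d/dt |Delta_{j0}(z0 + t*xi)|^2 at t = 0, i.e. 2*Re[conj(Delta) * sum_k alpha_k*xi_k]."""
    s = sum(a * x for a, x in zip(alpha_row, xi))
    return 2.0 * (delta_value.conjugate() * s).real


# Illustrative values (assumed): one row of coefficients and the current deviation.
alpha_row = [1 + 2j, 3 - 1j]
delta_value = 2 - 1j

xi = antigradient_direction(alpha_row, delta_value)
slope = directional_derivative(alpha_row, delta_value, xi)
# Closed form for comparison: 2 * lam * |Delta|^2 * sum_k |alpha_k|^2
closed_form = 2.0 * (-1.0) * abs(delta_value) ** 2 * sum(abs(a) ** 2 for a in alpha_row)
```

Here `lam` is fixed at \(-1\) only for convenience; any negative multiplier gives the same direction up to scaling.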
In the case when
\[ \max_j |\Delta_j(z^{(0)})|^2=|\Delta_{j_0}(z^{(0)})|^2,\quad |\Delta_{j_0}(z^{(0)})|^2-|\Delta_{j_\nu}(z^{(0)})|^2\leq \delta_0 \]
\[ (\nu=1,\ldots,p), \tag{5} \]
\[ |\Delta_{j_0}(z^{(0)})|^2>|\Delta_j(z^{(0)})|^2+\delta_0 \quad (j\ne j_0,j_1,\ldots,j_p), \]
we define the direction \(\xi^{(0)}\) of descent, first, from the condition that the derivatives in this direction of all the functions \(|\Delta_{j_0}(z)|^2,\ldots,|\Delta_{j_p}(z)|^2\) be negative at the point \(z^{(0)}\), i.e., from the condition that \(\xi^{(0)}\) satisfy the system of linear inequalities
\[ \operatorname{Re}\left[\overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k\right]<0 \quad (\nu=0,1,\ldots,p), \tag{6} \]
and, second, from the condition that the steepest descent be realized, i.e., that the least of the absolute values of these derivatives be maximal over the directions bounded, for example, by the conditions
\[ |\operatorname{Re}\xi_k|\leq C,\qquad |\operatorname{Im}\xi_k|\leq C \quad (k=1,\ldots,n). \tag{7} \]
Since the derivatives are negative, it follows that among the directions determined by the system (6)-(7) the desired \(\xi^{(0)}\) must be the Chebyshev one, i.e., for it
\[ \max_{0\leq \nu\leq p} \operatorname{Re}\left[ \overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k^{(0)} \right] = \min_{\xi}\max_{0\leq \nu\leq p} \operatorname{Re}\left[ \overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k \right]. \tag{8} \]
By introducing an additional real variable \(\xi\) (a scalar, distinct from the components \(\xi_k\) of the direction), this problem reduces to the following linear programming problem: find the direction \(\xi^{(0)}\) minimizing the function
\[ v=\xi \tag{9} \]
under the constraints
\[ \operatorname{Re}\left[ \overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k \right]-\xi\leq 0 \quad (\nu=0,1,\ldots,p), \]
\[ |\operatorname{Re}\xi_k|\leq C,\qquad |\operatorname{Im}\xi_k|\leq C \quad (k=1,\ldots,n), \tag{10} \]
where \(C>0\) is an arbitrary constant. Denote \(\xi_0=\min \xi\).
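Written out in real variables, problem (9)-(10) might look as follows. The symbols \(p_k\), \(q_k\), \(g_{\nu k}\), \(h_{\nu k}\) are introduced here only for illustration and do not appear in the original. Put \(\xi_k=p_k+iq_k\) and \(\overline{\Delta}_{j_\nu}(z^{(0)})\,\alpha_{j_\nu k}=g_{\nu k}+ih_{\nu k}\). Then
\[ \operatorname{Re}\left[ \overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k \right]=\sum_{k=1}^{n}\left(g_{\nu k}p_k-h_{\nu k}q_k\right), \]
so that one minimizes \(\xi\) over the \(2n+1\) real unknowns \(p_1,\ldots,p_n,q_1,\ldots,q_n,\xi\) subject to
\[ \sum_{k=1}^{n}\left(g_{\nu k}p_k-h_{\nu k}q_k\right)-\xi\leq 0 \quad (\nu=0,1,\ldots,p),\qquad -C\leq p_k\leq C,\quad -C\leq q_k\leq C \quad (k=1,\ldots,n). \]
Since the zero direction \(p_k=q_k=0\) together with \(\xi=0\) satisfies these constraints, the minimum always satisfies \(\xi_0\leq 0\).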
3. Let the direction \(\xi^{(0)}\) of descent already be determined, and in moving in this direction, i.e., as \(t\) increases, let, for example, \(|\Delta_{j_0}(z^{(0)}+t\xi^{(0)})|^2\) decrease more slowly than all the others, i.e., let
\[ \operatorname{Re}\left[ \overline{\Delta}_{j_0}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_0 k}\xi_k^{(0)} \right] \ge \operatorname{Re}\left[ \overline{\Delta}_{j_\nu}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_\nu k}\xi_k^{(0)} \right] \quad (\nu=1,\ldots,p), \]
and, in the case of equality of any two of these quantities, for example, for \(\nu=0\) and \(\nu=1\), let
\[ \left|\sum_{k=1}^{n} \alpha_{j_0 k}\xi_k^{(0)}\right|^2 > \left|\sum_{k=1}^{n} \alpha_{j_1 k}\xi_k^{(0)}\right|^2. \]
Then one may move in this direction \(\xi^{(0)}\) only until either the value
\[ t'=-\frac{ \operatorname{Re}\left[ \overline{\Delta}_{j_0}(z^{(0)})\sum_{k=1}^{n} \alpha_{j_0 k}\xi_k^{(0)} \right] }{ \left|\sum_{k=1}^{n} \alpha_{j_0 k}\xi_k^{(0)}\right|^2 }, \]
minimizing the function \(\left|\Delta_{j_0}(z^{(0)}+t\xi^{(0)})\right|^2\), or the value \(t''\) at which \(\left|\Delta_{j_0}(z^{(0)}+t\xi^{(0)})\right|^2\) first becomes equal to the value \(\left|\Delta_j(z^{(0)}+t\xi^{(0)})\right|^2\) for some \(j\ne j_0\), i.e., the value \(t''\) equal to the least of the positive roots, with respect to \(t\), of the quadratic equations
\[ \left|\Delta_{j_0}(z^{(0)}+t\xi^{(0)})\right|^2 = \left|\Delta_j(z^{(0)}+t\xi^{(0)})\right|^2 \qquad (j\ne j_0). \]
As the approximation step we take \(t_0=\min\{t',t''\}\), and as the new approximation we take the point \(z^{(1)}=z^{(0)}+t_0\xi^{(0)}\).
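The step rule rests on the expansion \(|\Delta_j(z+t\xi)|^2=A_j+2B_jt+C_jt^2\) with \(A_j=|\Delta_j(z)|^2\), \(B_j=\operatorname{Re}[\overline{\Delta}_j(z)\,s_j]\), \(C_j=|s_j|^2\), \(s_j=\sum_k\alpha_{jk}\xi_k\). A minimal sketch in plain Python follows; the function names, the tolerance `eps`, and the test data are assumptions, and the index `j0` of the slowest-decreasing deviation is taken as given rather than selected.

```python
import math


def quad_coeffs(alpha_row, delta_value, xi):
    """Coefficients (A, B, C) of |Delta_j(z + t*xi)|^2 = A + 2*B*t + C*t^2."""
    s = sum(a * x for a, x in zip(alpha_row, xi))
    return abs(delta_value) ** 2, (delta_value.conjugate() * s).real, abs(s) ** 2


def least_positive_root(a, b, c, eps=1e-12):
    """Least root t > eps of a*t^2 + b*t + c = 0, or math.inf if there is none."""
    if abs(a) < eps:                      # degenerate (linear) equation
        roots = [] if abs(b) < eps else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0:
            return math.inf
        r = math.sqrt(disc)
        roots = [(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)]
    positive = [t for t in roots if t > eps]
    return min(positive) if positive else math.inf


def step_length(alpha, deltas, xi, j0):
    """Return t0 = min(t', t'') for the deviation with index j0."""
    a0, b0, c0 = quad_coeffs(alpha[j0], deltas[j0], xi)
    t_prime = -b0 / c0 if c0 > 0 else math.inf        # minimizer of the j0-th parabola
    t_second = math.inf                               # first crossing with another deviation
    for j in range(len(alpha)):
        if j != j0:
            a, b, c = quad_coeffs(alpha[j], deltas[j], xi)
            t_second = min(t_second, least_positive_root(c0 - c, 2.0 * (b0 - b), a0 - a))
    return min(t_prime, t_second)


# Hypothetical system z - 1 = 0, z + 1 = 0, z - i = 0 at z0 = 1, moving along xi = -2.
alpha = [[1 + 0j], [1 + 0j], [1 + 0j]]
deltas = [0j, 2 + 0j, 1 - 1j]          # values Delta_j(z0)
xi = [-2 + 0j]
t0 = step_length(alpha, deltas, xi, j0=1)
```

In this example \(t'=1\) and \(t''=1/2\), so the step is \(t_0=1/2\) and the new point is \(z^{(1)}=1+\tfrac12(-2)=0\). Roots at or below `eps` are discarded so that deviations already tied at \(t=0\) do not produce a zero step; that cutoff is a design choice of this sketch, not something the paper specifies.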
We take the point \(z^{(1)}\) thus obtained as the new starting point, put \(\delta_1=\delta_0\), and determine the direction of descent again, continuing the process until, for some \(z^{(k)}\) and the corresponding \(\delta_k\) \((k=0,1,2,\ldots)\), the linear programming problem of type (9)-(10) yields a minimum \(\xi_k\) of the function \(v=\xi\) such that \(\xi_k\ge -\delta_k\). Then, instead of \(\delta_k\), we take a new \(\delta_{k+1}>0\) subject to the following three conditions: 1) \(\delta_{k+1}<\delta_k/2\); 2) \(\delta_{k+1}<|\xi_k|\) (if \(\xi_k<0\) but \(|\xi_k|<\delta_k\)); 3) \(\delta_{k+1}\) is such that all nonmaximal deviations \(|\Delta_j(z^{(k)})|^2\) differ from the maximal ones by more than \(\delta_{k+1}\). We continue the process, taking \(\delta_{k+1}\) instead of \(\delta_k\).
Condition 3) ensures that only the maximal deviations take part in determining the direction of descent at this step. If, moreover, it turns out that \(\xi_{k+1}\ge 0\), then the process ends and the point \(z^{(k)}\) is a Chebyshev point (when \(\xi_{k+1}=0\), it is possible that \(z^{(k)}\) is one of several Chebyshev points). In the case \(\xi_{k+1}<0\), we continue the process. For convenience of notation, when obtaining the new approximation \(z^{(k+1)}\) from \(z^{(k)}\), we also increase by one the index of the corresponding \(\delta_k\), even if \(\delta_k\) has remained unchanged, so that repetitions may occur in the sequence \(\{\delta_n\}\). By construction, the sequence \(\{\delta_n\}\) is nonincreasing.
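The three conditions on \(\delta_{k+1}\) only bound it from above, so any positive value below all the bounds will do. One possible choice, sketched below, takes half of the smallest bound; the factor \(1/2\), the tie tolerance, and the function name are assumptions of this sketch.

```python
def next_delta(delta_k, xi_k, sq_devs, tie_tol=1e-12):
    """Pick a delta_{k+1} > 0 satisfying conditions 1)-3).

    delta_k  -- current threshold
    xi_k     -- minimum of the linear program at z^(k) (nonpositive)
    sq_devs  -- squared deviations |Delta_j(z^(k))|^2
    tie_tol  -- deviations within tie_tol of the maximum count as maximal (assumed)
    """
    bounds = [delta_k / 2.0]                      # condition 1: below delta_k / 2
    if xi_k < 0:
        bounds.append(abs(xi_k))                  # condition 2: below |xi_k|
    top = max(sq_devs)
    gaps = [top - d for d in sq_devs if top - d > tie_tol]
    if gaps:
        bounds.append(min(gaps))                  # condition 3: below the smallest gap
    return 0.5 * min(bounds)                      # strictly below every bound


# Illustrative numbers (assumed): the smallest gap 4.0 - 3.99 is the binding bound.
sq_devs = [4.0, 3.99, 1.0]
new_delta = next_delta(0.1, -0.03, sq_devs)
```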
4. It is not difficult to justify the convergence of the process described. Suppose that a sequence of approximations \(\{z^{(n)}\}\) and the corresponding sequence \(\{\delta_n\}\) have been obtained. The monotonically decreasing sequence of maximal deviations \(\{\max_j |\Delta_j(z^{(n)})|^2\}\) converges, and the sequence \(\{z^{(n)}\}\) is bounded. Let, further, \(\tilde z\) be a limit point of this sequence and let the subsequence \(\{z^{(n_k)}\}\) converge to \(\tilde z\). We first note that \(\lim_{n\to\infty}\delta_n=\delta=0\), since otherwise, for \(\delta>0\), by condition 2) for the construction of \(\delta_{k+1}\), we would have \(\xi_n<-\delta\) for all \(n\), which is impossible, since then \(\max_j|\Delta_j(z^{(n)})|^2\to-\infty\).
Suppose that at the point \(\tilde z\) we have
\[ |\Delta_{j_1}(\tilde z)|^2=\cdots=|\Delta_{j_q}(\tilde z)|^2>|\Delta_j(\tilde z)|^2 \]
\[ (j\ne j_1,\ldots,j_q). \]
For \(z^{(n_k)}\) sufficiently close to \(\tilde z\), owing to the convergence of \(\delta_{n_k}\) to zero, the direction of descent will be described only with the aid of the deviations \(|\Delta_{j_\nu}(z^{(n_k)})|^2\) \((\nu=1,\ldots,q)\) (possibly not all of them). Suppose that at the point \(\tilde z\) the system of type (9)—(10) is consistent, i.e.,
\[ \tilde \xi = \min_{\xi}\max_{1\le \nu\le q} \operatorname{Re} \left[ \overline{\Delta}_{j_\nu}(\tilde z) \sum_{k=1}^{n}\alpha_{j_\nu k}\xi_k \right] <0, \]
so that \(\tilde z\) is not a Chebyshev point. But then, for sufficiently large \(n_k\), we would have \(|\xi_{n_k}|\ge |\tilde \xi|/2\), which is impossible, since from \(\delta_{n_k}\to 0\) it follows that \(\xi_{n_k}\to 0\).
It can be shown that even in the case of nonuniqueness of the Chebyshev point the whole sequence \(\{z^{(n)}\}\) converges to \(\tilde z\).
The sequence \(\{z^{(n)}\}\) is bounded even in the case when the set of Chebyshev points of system (1) is unbounded. The latter occurs if and only if the rank of the matrix \(\|\alpha_{jk}\|\) is less than \(n\). In this case the set of Chebyshev points is the direct product of a bounded set by the linear subspace \(C_0\) of null solutions of the homogeneous system obtained from (1). The sequence \(\{z^{(n)}\}\) lies in one and the same hyperplane orthogonal to \(C_0\), and is bounded there.
5. The preceding algorithm also solves the general problem of convex Chebyshev approximation under the assumption that the set of Chebyshev points is nonempty and bounded. In this case the role of \(|\Delta_j(z)|^2\) is played by \(\varphi_j(x)\), and system (6) is transformed into the following system of inequalities, linear with respect to \(\eta_k\):
\[ \sum_{k=1}^{s} \frac{\partial \varphi_{j_\nu}}{\partial x_k}\eta_k < 0 \qquad (\nu = 0,1,\ldots,p). \]
The system (9)—(10) is written analogously. The determination of \(t'\) is reduced to finding the minimum of a convex smooth function of one variable \(t\). In all other respects the arguments coincide.
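For illustration, the analogue of (9)-(10) for the general problem would presumably read as follows, with the partial derivatives evaluated at the current point \(x^{(0)}\) and the scalar \(\xi\) playing the same role as before. The paper does not write this system out, so the form below is an assumption modeled on (9)-(10): minimize
\[ v=\xi \]
under the constraints
\[ \sum_{k=1}^{s} \frac{\partial \varphi_{j_\nu}(x^{(0)})}{\partial x_k}\eta_k-\xi\leq 0 \quad (\nu=0,1,\ldots,p),\qquad |\eta_k|\leq C \quad (k=1,\ldots,s). \]
Correspondingly, \(t'\) would be the point at which the function \(\varphi_{j_0}(x^{(0)}+t\eta^{(0)})\) of the single variable \(t\geq 0\) attains its minimum.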
6. We note that the algorithm of item 5 solves the problem of finding a solution (even a Chebyshev point) of the system of nonlinear inequalities
\[ \varphi_j(x) \leqslant 0 \qquad (j = 1,\ldots,m), \]
where \(\varphi_j\) are convex smooth functions, i.e. points \(x^*\) for which
\[ \max_j \varphi_j(x^*) = \min_x \max_j \varphi_j(x). \]
7. The algorithm may be used in a somewhat weakened variant, moving from the point \(z^{(k)}\) not in the Chebyshev direction \(\xi^{(k)}\), but in some direction \(\hat{\xi}^{(k)}\) satisfying system (6) and such that
\[ \left| \max_{0 \leq \nu \leq p} \operatorname{Re} \left[ \overline{\Delta_{j_\nu}(z^{(k)})} \sum_{l=1}^{n} \alpha_{j_\nu,l}\hat{\xi}^{(k)}_l \right]\right| > \varepsilon |\xi_k|, \]
where \(\varepsilon\) \((0<\varepsilon<1)\) is fixed. In particular, one could require that the derivatives of \(|\Delta_{j_\nu}(z^{(k)}+t\xi)|^2\) \((\nu=0,1,\ldots,p)\) at \(t=0\) be equal, negative, and maximal in absolute value (for \(p \geq n\) this may prove infeasible).
Kyiv State Pedagogical Institute named after A. M. Gorky
Received 2 I 1963
REFERENCES
¹ S. I. Zukhovitskii, Some Questions in the Theory of Chebyshev Approximations, Dissertation, Kiev, 1950.
² R. P. Chakina, in: Studies on Contemporary Problems of the Constructive Theory of Functions, Moscow, 1961.
³ V. V. Ivanov, DAN, 143, No. 4 (1962).