UDC 519.95
MATHEMATICS
V. A. YAKUBOVICH
RECURRENT FINITELY CONVERGENT ALGORITHMS FOR SOLVING SYSTEMS OF INEQUALITIES
(Presented by Academician V. I. Smirnov on 11 VI 1965)
1°. Consider the infinite system of inequalities
\[ \varphi(x,a_j)>0 \qquad (j=1,2,\ldots), \tag{1} \]
where \(x\) is the unknown vector of a Euclidean or real Hilbert space \(R_1\); the \(a_j\) are arbitrary vectors of some set \(M\) in a Euclidean or real Hilbert space \(R_2\); and \(\varphi(x,a)\) is a real function. Suppose that \(\varphi(x_*,a)\ge \varepsilon_*>0\) for some \(\varepsilon_*>0\), some \(x_*\in R_1\), and all \(a\in M\), where \(\varepsilon_*\) and \(x_*\) are, however, unknown. Let \(f_j(x,a)\) \((j=1,2,\ldots)\) be certain mappings of \(R_1\times M\) into \(R_1\). We shall say that the relations \(x_{j+1}=f_j(x_j,a_j)\) \((j=1,2,\ldots)\) define a finitely convergent algorithm for solving the inequalities (1) if: a) a set \(G\subseteq R_1\) is specified such that, for any \(x_1\in G\) and any sequence \(a_j\in M\) \((j=1,2,\ldots)\), the vectors \(x_1, x_2=f_1(x_1,a_1),\ldots\) satisfy \(x_n=x_{n+1}=\cdots=x^0\) starting with some \(n\); b) the vector \(x^0\) satisfies the inequalities (1) for \(j\ge n\). We note that, generally speaking, the vector \(a_j\) in (1) may depend on \(x_1,\ldots,x_j\).
The problem of constructing finitely convergent algorithms for solving linear inequalities arises in the theory of trainable pattern-recognition systems \((^{1-4})\) as the problem of finding learning algorithms.
Finitely convergent algorithms may be used to solve a finite system of inequalities \(\varphi(x,a_j)>0\), \(j=1,\ldots,m\). In this case the inequalities (1) are obtained by periodic repetition of the given system \((a_{j+m}=a_j)\), and the computation stops as soon as the relations \(x_n=x_{n+1}=\cdots=x_{n+m-1}\), \(\varphi(x_j,a_j)>0\), \(j=n,\ldots,n+m-1\), hold. In the general case, by the very nature of the problem, the number \(n\) in condition a), and consequently also the vector \(x^0\), cannot be determined effectively by a finite number of operations. In the general case the algorithm must therefore be supplemented by some stopping condition, which can be obtained by introducing additional assumptions on the probability distribution of the vectors \(a\in M\) and by requiring that the probability of error in the “not shown” (or all) inequalities (1) be sufficiently small. These questions are not considered here, and therefore stopping conditions are not formulated.
A system (1) where \(a_{j+m}=a_j\) \((j=1,2,\ldots)\) will be called cyclic. The number \(r\) of distinct vectors in the sequence \(x_1,x_2,\ldots\) will be called the number of corrections of the algorithm.
The algorithms given below have a number of features in common with relaxation algorithms \((^{5-8})\). In addition to their simplicity, they possess great reliability, since a failure in a single inequality will lead (in the case of a cyclic system of inequalities and \(G=R_1\)) only to an increase in the computation time.
2°. Lemma. Suppose that there exist functions \(g_j(x,a)\), \(j=1,2,\ldots\), defined on \(R_1\times M\) with values in \(R_1\), a real nonnegative function \(V(x)\) (where \(x\in R_1\)), a set \(G\subseteq R_1\), and a series \(\delta_1+\delta_2+\cdots=+\infty\) such that, for any sequence \(x_1\in G\), \(x_2=g_1(x_1,a_1),\ldots,x_{j+1}=g_j(x_j,a_j)\), where \(a_j\in M\) and \(\varphi(x_j,a_j)\le 0\), one has
\[ V(x_j)-V(x_{j+1})\ge \delta_j. \]
Then the algorithm
\[ x_{j+1}=x_j,\quad \text{if } \varphi(x_j,a_j)>0;\qquad x_{j+1}=g_j(x_j,a_j),\quad \text{if } \varphi(x_j,a_j)\le 0, \tag{2} \]
is a finitely convergent algorithm for solving the inequalities (1) for \(x_1\in G\). The number of corrections \(r\) satisfies the relation
\[
\delta_1+\delta_2+\cdots+\delta_r \le V(x_1).
\]
Obviously, the condition of the lemma is satisfied if \(V(x)-V[g(x,a)]\ge \delta>0\) whenever \(\varphi(x,a)\le 0\), for any \(x\in R_1\), \(a\in M\), and some \(\delta>0\).
Proof. Deleting from the sequence \(x_1,x_2,\ldots\), obtained by formula (2), the vectors \(x_j\) satisfying the condition \(\varphi(x_j,a_j)>0\), and renumbering, we obtain a sequence satisfying the condition of the lemma. For any \(n\) successive steps of this sequence,
\[ \delta_1+\cdots+\delta_n \le \sum_{j=1}^n [V(x_j)-V(x_{j+1})]=V(x_1)-V(x_{n+1})\le V(x_1), \]
whence the assertion of the lemma follows.
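The scheme (2) and the bookkeeping used in the proof can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the names `run_scheme`, `phi`, `g`, and `V` are hypothetical, vectors are plain lists, and the example data (the correction \(g(x,a)=x+a\), the point \(x_*=(1,1)\), and \(\tau=1\)) are chosen for illustration and do not come from the paper.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = List[float]


def run_scheme(
    x1: Sequence[float],
    stream: Iterable[Sequence[float]],
    phi: Callable[[Sequence[float], Sequence[float]], float],
    g: Callable[[int, Sequence[float], Sequence[float]], Vector],
    V: Callable[[Sequence[float]], float],
) -> Tuple[Vector, List[float]]:
    """Apply rule (2) to a finite stream a_1, a_2, ... and record the
    decrease V(x_j) - V(x_{j+1}) at every correction step."""
    x = list(x1)
    drops: List[float] = []
    for j, a in enumerate(stream, start=1):
        if phi(x, a) <= 0:              # inequality violated: correct x
            x_next = g(j, x, a)
            drops.append(V(x) - V(x_next))
            x = x_next
        # otherwise x_{j+1} = x_j, nothing to do
    return x, drops


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


# Illustrative data (assumed): homogeneous inequalities (x, a_j) > 0.
vectors = [[1.0, 0.0], [0.0, 1.0]]
stream = vectors * 3                    # cyclic repetition of the system
x_star, tau = [1.0, 1.0], 1.0           # a solution, used only inside V

x_final, drops = run_scheme(
    x1=[0.0, 0.0],
    stream=stream,
    phi=dot,
    g=lambda j, x, a: [xi + ai for xi, ai in zip(x, a)],
    V=lambda x: sum((xi - tau * si) ** 2 for xi, si in zip(x, x_star)),
)
print(x_final, drops)
```

In this toy run every correction lowers \(V\) by the same amount, so the sum of the recorded decreases cannot exceed \(V(x_1)\), which is exactly the mechanism that bounds the number of corrections in the lemma.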
\(3^\circ\). Homogeneous linear inequalities, \(R_1=R_2\),
\[
\varphi(x,a_j)=(x,a_j),\ |a_j|\le \alpha,\ j=1,2,\ldots \; *
\]
Theorem 1. Let \(\rho',\rho'',\rho_j,\beta_j\) \((j=1,2,\ldots)\) be arbitrary numbers satisfying the inequalities \(0<\rho'\le \rho_j\le \rho''\), \(0\le \beta_j\le 2\). For any \(x_1\in R_1\) the algorithm
\[ x_{j+1}=x_j,\quad \text{if } (x_j,a_j)>0; \]
\[ x_{j+1}=x_j+\zeta_j a_j,\quad \zeta_j=\rho_j-\beta_j(x_j,a_j)/(a_j,a_j),\quad \text{if } (x_j,a_j)\le 0, \tag{3} \]
is a finitely convergent algorithm for solving the inequalities \((x,a_j)>0\) \((j=1,2,\ldots)\). For \(x_1=0\), the number of corrections \(r\) satisfies the estimate
\[
r\le |x_*|^2\alpha^2\varepsilon_*^{-2}\rho''/\rho'.
\]
Proof. Choose a number \(\tau\) satisfying the inequality
\[
\tau>\tau_0=\alpha^2\rho''/(2\varepsilon_*).
\]
For \(V(x)=|x-\tau x_*|^2\) we have, under the conditions of the lemma,
\[
V(x_j)-V(x_{j+1})
=-\zeta_j(a_j,2x_j+\zeta_j a_j-2\tau x_*)\ge
\delta(\tau)=2\varepsilon_*\rho'(\tau-\tau_0).
\]
By the lemma, the algorithm is finitely convergent. The estimate for \(r\) follows by the lemma from the relation
\[
r\le \min \tau^2|x_*|^2/\delta(\tau)\quad (\tau>\tau_0).
\]
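The minimization over \(\tau\) is elementary and may be written out as follows (a routine step, added here for the reader's convenience; for \(x_1=0\) the numerator is \(V(x_1)=\tau^2|x_*|^2\)). Since
\[ \frac{d}{d\tau}\,\frac{\tau^2}{\tau-\tau_0}=\frac{\tau(\tau-2\tau_0)}{(\tau-\tau_0)^2}, \]
the minimum over \(\tau>\tau_0\) is attained at \(\tau=2\tau_0\) and equals \(4\tau_0\). With \(\delta(\tau)=2\varepsilon_*\rho'(\tau-\tau_0)\) and \(\tau_0=\alpha^2\rho''/(2\varepsilon_*)\) this gives
\[ \min_{\tau>\tau_0}\frac{\tau^2|x_*|^2}{\delta(\tau)}=\frac{4\tau_0|x_*|^2}{2\varepsilon_*\rho'}=\frac{|x_*|^2\alpha^2\rho''}{\varepsilon_*^2\rho'}. \]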
Algorithm (3) for \(\beta_j=0\), \(\zeta_j=\rho_j=1\) coincides with the algorithm of \((^1,^3,^4)\) **. For cyclic systems of inequalities it is natural to choose \(\beta_j,\rho_j\) so that \((x_{j+1},a_j)>0\) is satisfied, for example \(\rho_j=\rho>0\), \(1\le \beta_j\le 2\).
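For a cyclic system, algorithm (3) together with the stopping rule of item 1° (one full pass without corrections) can be sketched as follows. This is a minimal Python illustration, not part of the original text: the function name `solve_homogeneous`, the safeguard `max_sweeps`, the constant choices \(\rho_j=\rho\), \(\beta_j=\beta\), and the three example vectors are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_homogeneous(
    vectors: Sequence[Sequence[float]],
    rho: float = 1.0,
    beta: float = 1.0,
    max_sweeps: int = 10_000,
) -> Tuple[List[float], int]:
    """Algorithm (3) on the cyclic system (x, a_j) > 0, started at x_1 = 0.

    Stops after the first full sweep in which no correction was made."""
    x = [0.0] * len(vectors[0])
    corrections = 0
    for _ in range(max_sweeps):
        changed = False
        for a in vectors:
            s = dot(x, a)
            if s <= 0:                                  # inequality violated
                zeta = rho - beta * s / dot(a, a)       # step length from (3)
                x = [xi + zeta * ai for xi, ai in zip(x, a)]
                corrections += 1
                changed = True
        if not changed:
            return x, corrections
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative data (assumed): x_* = (1, 1) solves the system with margin 1.
vectors = [[1.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
x_star = [1.0, 1.0]

x, corrections = solve_homogeneous(vectors)

# Bound on the number of corrections for rho' = rho'' (constant rho):
alpha_sq = max(dot(a, a) for a in vectors)
eps_star = min(dot(x_star, a) for a in vectors)
bound = dot(x_star, x_star) * alpha_sq / eps_star ** 2
print(x, corrections, bound)
```

With \(\beta=1\) each correction places the new point strictly on the positive side of the violated inequality, which is the choice recommended above for cyclic systems.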
\(4^\circ\). Nonhomogeneous linear inequalities,
\[
\varphi(x,c_j,\gamma_j)=(x,c_j)+\gamma_j.
\]
In this and the following items we shall denote
\[
\eta_j=(x_j,c_j)+\gamma_j.
\]
Theorem 2. Suppose that \(|c_j|\le \chi,\ j=1,2,\ldots\). Let \(\rho_j>0\), \(\beta_j\) \((j=1,2,\ldots)\) be arbitrary numbers satisfying the conditions \(\rho_j\to 0\) as \(j\to\infty\), \(\rho_1+\rho_2+\cdots=\infty\), \(0\le \beta_j\le 2\). For any \(x_1\in R_1\) and \(k(1)=0\) the algorithm
\[
x_{j+1}=x_j,\quad k(j+1)=k(j),\quad \text{if } \eta_j>0;
\]
\[
x_{j+1}=x_j+\zeta_j c_j,\quad
\zeta_j=\rho_{k(j)}-\beta_j\eta_j/(c_j,c_j),\quad
k(j+1)=k(j)+1,\quad \text{if } \eta_j\le 0,
\]
is a finitely convergent algorithm for solving the inequalities
\[
(x,c_j)+\gamma_j>0\quad (j=1,2,\ldots).
\]
Proof. For \(V(x)=|x-x_*|^2\) we have, under the conditions of the lemma,
\[
V(x_j)-V(x_{j+1})
=\zeta_j\{-(2-\beta_j)\eta_j+2[(x_*,c_j)+\gamma_j]-\rho_j(c_j,c_j)\}\ge \varepsilon_*\rho_j,
\]
starting with sufficiently large \(j\). The assertion of the theorem follows from the lemma.
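The algorithm of Theorem 2 can be sketched for a cyclic system as follows. The Python fragment is illustrative only: `solve_inhomogeneous`, `max_sweeps`, and the example rows are hypothetical, and the step sequence is taken as `rho(k) = 1/(k+1)` with the counter `k` starting at 0, which is one admissible choice (tending to zero with divergent sum), not the one prescribed by the paper.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[Sequence[float], float]        # (c_j, gamma_j)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_inhomogeneous(
    rows: Sequence[Row],
    x1: Sequence[float],
    beta: float = 1.0,
    rho: Callable[[int], float] = lambda k: 1.0 / (k + 1),
    max_sweeps: int = 10_000,
) -> Tuple[List[float], int]:
    """Theorem 2 scheme for (x, c_j) + gamma_j > 0 on a cyclic system.

    k counts corrections, so rho(k) shrinks only when x is changed."""
    x = list(x1)
    k = 0
    for _ in range(max_sweeps):
        changed = False
        for c, gamma in rows:
            eta = dot(x, c) + gamma
            if eta <= 0:                                   # violated
                zeta = rho(k) - beta * eta / dot(c, c)
                x = [xi + zeta * ci for xi, ci in zip(x, c)]
                k += 1
                changed = True
        if not changed:                                    # full clean sweep
            return x, k
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative system (assumed): x1 > 1, x2 > 1, x1 + x2 < 4.
rows = [([1.0, 0.0], -1.0), ([0.0, 1.0], -1.0), ([-1.0, -1.0], 4.0)]
x, corrections = solve_inhomogeneous(rows, x1=[0.0, 0.0])
print(x, corrections)
```

Indexing the step sizes by the correction counter rather than by \(j\) mirrors the role of \(k(j)\) in the theorem: steps on which the current inequality already holds do not consume the sequence \(\rho_k\).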
Remark. Let \(\beta,\delta\) be arbitrary numbers such that \(0<\beta\le 2,\ \delta>0\). Under the conditions of Theorem 2, if
\[
-\eta_j/(c_j,c_j)\ge \delta>0,
\]
one may take \(\rho_j=0,\ 0<\beta\le \beta_j\le 2\). Indeed, in this case
\[
V(x_j)-V(x_{j+1})\ge \varepsilon_*\beta\delta>0.
\]
* Recall that, moreover, by the assumption of item \(1^\circ\), there exist \(\varepsilon_*>0\) and \(x_*\in R_1\) such that \((x_*,a_j)\ge \varepsilon_*>0\).
** The proofs in \((^1,^3,^4)\) differ from the one given here, but give the same estimate for the number of corrections.
The algorithm given below in Theorem 3 apparently converges faster than the algorithm of Theorem 2*.
Theorem 3. Suppose that \((c_j,c_j)+\gamma_j^2 \leq \gamma^2\) \((j=1,2,\ldots)\). The following algorithm is a finitely convergent algorithm for solving the inequalities \((x,c_j)+\gamma_j>0\) \((j=1,2,\ldots)\) for arbitrary \(x_1 \in R_1\) and \(\xi_1>0\):
\[ \xi_{j+1}=\xi_j,\qquad x_{j+1}=x_j,\qquad \text{if }\eta_j>0, \tag{4} \]
\[ \xi_{j+1}=|\xi_j+\zeta_j\gamma_j|,\qquad x_{j+1}=\xi_{j+1}^{-1}(\xi_jx_j+\zeta_jc_j),\qquad \text{if }\eta_j\leq 0. \]
Here \(\zeta_j=\rho_j-\beta_j\xi_j\eta_j[(c_j,c_j)+\gamma_j^2]^{-1}\), and \(\rho_j,\beta_j\) are arbitrary numbers satisfying the conditions \(0<\rho'\leq \rho_j\leq \rho''\), \(0\leq \beta_j\leq 2\), \(\xi_{j+1}\ne 0\) \((j=1,2,\ldots)\). For the number of corrections \(r\), when \(x_1=0\), \(\xi_1=\gamma^2\rho''/(2\varepsilon_*)\), the estimate*** \(r\leq |x_*|(|x_*|+\sqrt{|x_*|^2+1})\gamma^2\rho''/(2\varepsilon_*^2\rho')\) holds.
Proof. Obviously, \(\xi_j>0\). Let \(\tau>\tau_0=\rho''\gamma^2/2\varepsilon_*\). For
\[ V(x,\xi)=|\xi x-\tau x_*|^2+|\xi-\tau|^2 \]
we have, under the conditions of the lemma,
\[ V(x_j,\xi_j)-V(x_{j+1},\xi_{j+1}) = -2\xi_j\eta_j\zeta_j-\zeta_j^2[(c_j,c_j)+\gamma_j^2] +2\tau\zeta_j[(x_*,c_j)+\gamma_j] +2\tau(\xi_{j+1}-\xi_j-\gamma_j\zeta_j) \geq \delta(\tau)=2\varepsilon_*\rho'(\tau-\tau_0)>0. \]
The estimate for \(r\) is obtained in the same way as in the proof of Theorem 1.
Note that for \(1\leq \beta_j\leq 2\) one has \((x_{j+1},c_j)+\gamma_j>0\).
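Algorithm (4) with the additional parameter \(\xi_j\) can be sketched in the same style. The fragment below is a hedged illustration: `solve_with_parameter`, `max_sweeps`, the constant choices \(\rho_j=\rho\), \(\beta_j=\beta\), the starting value `xi1 = 1.0`, and the example rows are assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]        # (c_j, gamma_j)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_with_parameter(
    rows: Sequence[Row],
    x1: Sequence[float],
    xi1: float = 1.0,
    rho: float = 1.0,
    beta: float = 1.0,
    max_sweeps: int = 10_000,
) -> Tuple[List[float], float, int]:
    """Algorithm (4) for (x, c_j) + gamma_j > 0 on a cyclic system."""
    x, xi = list(x1), xi1
    corrections = 0
    for _ in range(max_sweeps):
        changed = False
        for c, gamma in rows:
            eta = dot(x, c) + gamma
            if eta <= 0:                                   # violated
                norm_sq = dot(c, c) + gamma * gamma
                zeta = rho - beta * xi * eta / norm_sq
                xi_next = abs(xi + zeta * gamma)
                if xi_next == 0.0:                         # excluded by the theorem
                    raise ZeroDivisionError("xi_{j+1} = 0; change rho or beta")
                x = [(xi * xj + zeta * cj) / xi_next for xj, cj in zip(x, c)]
                xi = xi_next
                corrections += 1
                changed = True
        if not changed:
            return x, xi, corrections
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative system (assumed): x1 > 1, x2 > 1, x1 + x2 < 4.
rows = [([1.0, 0.0], -1.0), ([0.0, 1.0], -1.0), ([-1.0, -1.0], 4.0)]
x, xi, corrections = solve_with_parameter(rows, x1=[0.0, 0.0])
print(x, xi, corrections)
```

The pair \((\xi_jx_j,\xi_j)\) plays the role of a single vector in the enlarged space, which is why the update divides by the new value \(\xi_{j+1}\) after the step has been taken.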
The following device makes it possible, by enlarging the space \(R_1\), to obtain a recurrent finitely convergent algorithm for solving inhomogeneous linear inequalities \((x,c_j)+\gamma_j>0\) from an analogous algorithm for homogeneous linear inequalities. Consider the system of inequalities \((y,c_j)+\xi\gamma_j>0\), where \(\xi\) is a real “additional parameter,” and insert between each pair of adjacent inequalities the inequality \(\xi>0\). We shall call the resulting homogeneous system the system (D). By assumption, \((x_*,c_j)+\gamma_j\geq \varepsilon_*>0\). Therefore the system (D) has the solution \(y=x_*, \xi=1\) with the same value of \(\varepsilon_*\). Applying to the system (D) any finitely convergent recurrent algorithm, combining two adjacent steps, and making the substitution \(y_j=\xi_jx_j\), we obtain the desired algorithm of the form \(x_{j+1}=f_j(x_j,\xi_j,c_j,\gamma_j)\), \(\xi_{j+1}=\varphi_j(x_j,\xi_j,c_j,\gamma_j)\). By the indicated device it is easy to obtain from Theorem 1 algorithms close to the algorithm of Theorem 3****.
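The device just described can be illustrated by building the system (D) explicitly and applying to it the homogeneous algorithm (3). The Python sketch below is an assumption-laden illustration: the names `system_d` and `solve_homogeneous` are hypothetical, the two adjacent steps are not merged into one (they are simply executed in turn), and the example rows are invented for the demonstration.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]        # (c_j, gamma_j)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def system_d(rows: Sequence[Row]) -> List[List[float]]:
    """Homogeneous system (D): vectors (c_j, gamma_j) in the enlarged space,
    each followed by the vector encoding the inequality xi > 0."""
    n = len(rows[0][0])
    e_last = [0.0] * n + [1.0]
    out: List[List[float]] = []
    for c, gamma in rows:
        out.append(list(c) + [gamma])
        out.append(list(e_last))
    return out


def solve_homogeneous(vectors, rho=1.0, beta=1.0, max_sweeps=10_000):
    """Algorithm (3) on a cyclic homogeneous system, started at zero."""
    y = [0.0] * len(vectors[0])
    corrections = 0
    for _ in range(max_sweeps):
        changed = False
        for a in vectors:
            s = dot(y, a)
            if s <= 0:
                zeta = rho - beta * s / dot(a, a)
                y = [yi + zeta * ai for yi, ai in zip(y, a)]
                corrections += 1
                changed = True
        if not changed:
            return y, corrections
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative system (assumed): x1 > 1, x2 > 1, x1 + x2 < 4.
rows = [([1.0, 0.0], -1.0), ([0.0, 1.0], -1.0), ([-1.0, -1.0], 4.0)]
y, corrections = solve_homogeneous(system_d(rows))
xi = y[-1]                                 # additional parameter, positive at exit
x = [yi / xi for yi in y[:-1]]             # substitution y = xi * x
print(x, xi, corrections)
```

Because the last inequality of every pair forces \(\xi>0\) at termination, dividing by \(\xi\) turns a solution of (D) into a solution of the original inhomogeneous system.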
For a cyclic system of linear inequalities, a simple finitely convergent algorithm was obtained by V. N. Fomin\({}^{(9)}\).
\(5^\circ.\) Inequalities \(|(x,c_j)+\gamma_j|<\varepsilon\) \((j=1,2,\ldots)\), where \(|c_j|\leq \chi\).
Theorem 4. Suppose that there exist a number \(\varepsilon_{**}>0\) and a vector \(x_*\in R_1\) such that \(|(x_*,c_j)+\gamma_j|\leq \varepsilon_{**}<\varepsilon/2\), \(j=1,2,\ldots\)*****. For any \(x_1\in R_1\), the algorithm \(x_{j+1}=x_j\), if \(|\eta_j|<\varepsilon\), \(x_{j+1}=x_j-\eta_jc_j[(c_j,c_j)]^{-1}\), if \(|\eta_j|\geq \varepsilon\), is a finitely convergent algorithm for solving the inequalities \(|(x,c_j)+\gamma_j|<\varepsilon\) \((j=1,2,\ldots)\). The number of corrections does not exceed the number
\[ r_0=|x_1-x_*|^2\chi^2\varepsilon^{-1}(\varepsilon-2\varepsilon_{**})^{-1}. \]
* More precisely, for arbitrary \(x_1\) the estimates from above for the number of corrections \(r\) (which are not given because of their unwieldiness) in the case of Theorem 3 are considerably more acceptable.
** The algorithm (4) (as well as the algorithm of Theorem 2) satisfies the definition given in item 1° for the space \(R_1'=R_1\times\{\xi\}\), where \(\{\xi\}\) is the number axis, if the given inequalities are considered as inequalities with respect to the vector
\[ x'= \begin{pmatrix} x\\ \xi \end{pmatrix} \in R_1'. \]
*** It is easy to obtain a (more unwieldy) estimate for arbitrary \(x_1,\xi_1>0\).
**** By the same device one can obtain finitely convergent algorithms for the case when a domain \(Q\subset R_1\) is known in advance such that \(x_*\in Q\) and it is required that \(x_j\in Q\). Suppose, for example, that the domain \(Q\) is defined by \(k\) inequalities \(\xi^{(h)}>0\), where \(\xi^{(h)}\) are certain components of the Euclidean vector \(x\). In this case, between each pair of adjacent given inequalities one should insert \(k\) inequalities \(\xi^{(h)}>0\), apply the corresponding algorithm to the system thus obtained, and combine \(k\) adjacent steps.
***** This condition strengthens the general assumption of item \(1^\circ\), according to which \(\varepsilon-|(x_*,c_j)+\gamma_j|\geq \varepsilon_*>0\), and, unlike the conditions of the other theorems, assumes a known estimate for the number \(\varepsilon_*\). It is fulfilled (for \(\varepsilon_{**}=0\)) in the finite-dimensional case, important for applications, when the system \((x,c_j)+\gamma_j=0\) (with determinant possibly equal to zero) is obtained by the method of least squares.
Proof. For \(V(x)=|x-x_*|^2\) we have
\[ -\Delta V_j=V(x_j)-V(x_{j+1})=\eta_j(\eta_j-2\varepsilon_j)/(c_j,c_j), \]
where \(\varepsilon_j=(x_*,c_j)+\gamma_j\). Since \(|\varepsilon_j|\leq \varepsilon_{**}\), for \(|\eta_j|\geq\varepsilon\) we have
\[ -\Delta V_j\geq \chi^{-2}(\varepsilon^2-2\varepsilon\varepsilon_{**})>0. \]
Applying the lemma, we obtain the assertion of the theorem.
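The algorithm of Theorem 4 is a projection step with a dead zone of half-width \(\varepsilon\), and can be sketched as follows. The Python fragment is illustrative: `solve_strips`, `max_sweeps`, the example rows, and the variable `eps_bound` (standing for the bound on \(|(x_*,c_j)+\gamma_j|\) in the hypothesis of the theorem, here equal to zero because the example system is consistent) are assumptions introduced for the demonstration.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]        # (c_j, gamma_j)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_strips(
    rows: Sequence[Row],
    x1: Sequence[float],
    eps: float,
    max_sweeps: int = 10_000,
) -> Tuple[List[float], int]:
    """Theorem 4 scheme for |(x, c_j) + gamma_j| < eps on a cyclic system."""
    x = list(x1)
    corrections = 0
    for _ in range(max_sweeps):
        changed = False
        for c, gamma in rows:
            eta = dot(x, c) + gamma
            if abs(eta) >= eps:            # outside the strip: project onto its axis
                step = eta / dot(c, c)
                x = [xi - step * ci for xi, ci in zip(x, c)]
                corrections += 1
                changed = True
        if not changed:
            return x, corrections
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative consistent system (assumed): x1 = 1, x1 + x2 = 2, so x_* = (1, 1).
rows = [([1.0, 0.0], -1.0), ([1.0, 1.0], -2.0)]
x1, x_star, eps, eps_bound = [0.0, 0.0], [1.0, 1.0], 0.5, 0.0

x, corrections = solve_strips(rows, x1, eps)

# Bound r_0 of Theorem 4 for this example.
chi_sq = max(dot(c, c) for c, _ in rows)
dist_sq = sum((a - b) ** 2 for a, b in zip(x1, x_star))
r0 = dist_sq * chi_sq / (eps * (eps - 2 * eps_bound))
print(x, corrections, r0)
```

For a consistent linear system the correction step is the classical projection onto the hyperplane \((x,c_j)+\gamma_j=0\); the dead zone is what makes the number of corrections finite.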
Remark. If a number \(\varepsilon_0\) is known such that
\[ |(x_*,c_j)+\gamma_j|\leq \varepsilon_0<\varepsilon, \]
then it is easy to prove, by means of analogous arguments, that the following algorithm is a finitely convergent algorithm for solving the inequalities \(|(x,c_j)+\gamma_j|<\varepsilon\):
\[ x_{j+1}=x_j,\quad \text{if }|\eta_j|<\varepsilon; \]
\[ x_{j+1}=x_j-\eta_jc_j\bigl[(c_j,c_j)\bigr]^{-1},\quad \text{if }|\eta_j|\geq 2\varepsilon; \]
\[ x_{j+1}=x_j-(\eta_j-\varepsilon_0\operatorname{sign}\eta_j)c_j/(c_j,c_j), \quad \text{if }\varepsilon\leq|\eta_j|<2\varepsilon. \]
Theorem 5. Let \(\delta,\rho_j,\beta_j\) \((j=1,2,\ldots)\) be arbitrary numbers satisfying the conditions
\[ 0<\delta\leq\varepsilon,\qquad 0\leq\beta_j\leq 2,\qquad \rho_j>0,\qquad \rho_j\to 0\quad \text{as }j\to\infty,\qquad \rho_1+\rho_2+\cdots=\infty. \]
For any \(x_1\in R_1\) and \(k(1)=0\), the following algorithm is a finitely convergent algorithm for solving the inequalities
\[ |(x,c_j)+\gamma_j|<\varepsilon: \]
\[ x_{j+1}=x_j,\quad k(j+1)=k(j),\quad \text{if }|\eta_j|<\varepsilon; \]
\[ x_{j+1}=x_j-\eta_jc_j/(c_j,c_j),\quad k(j+1)=k(j),\quad \text{if }|\eta_j|\geq 2\varepsilon; \]
\[ x_{j+1}=x_j+2(\varepsilon\operatorname{sign}\eta_j-\eta_j)c_j/(c_j,c_j),\quad k(j+1)=k(j),\quad \text{if }\varepsilon+\delta\leq|\eta_j|<2\varepsilon; \]
\[ x_{j+1}=x_j+\xi_jc_j,\quad \xi_j=-\rho_{k(j)}+\beta_j(\varepsilon\operatorname{sign}\eta_j-\eta_j)/(c_j,c_j), \]
\[ k(j+1)=k(j)+1,\quad \text{if }\varepsilon\leq|\eta_j|<\varepsilon+\delta.\; * \]
The proof is carried out with the function \(V(x)=|x-x_*|^2\), analogously to the proofs of Theorems 4 and 2.
6°. Suppose that the sets
\[ E_j=E\{\varphi(x,a_j)>0\}\subseteq R_1 \]
are convex, and that there is some algorithm
\[ c_j=c_j(z,a_j),\qquad \gamma_j=\gamma_j(z,a_j), \]
which, from the set \(E_j\) and from an arbitrary point \(z\in R_1\setminus E_j\), produces a plane
\[ (x,c_j)+\gamma_j=0 \]
separating the point \(z\) and the set \(E_j\):
\[ (z,c_j)+\gamma_j\leq 0,\qquad (x,c_j)+\gamma_j>0\quad \text{for }x\in E_j. \]
Let
\[ x_{j+1}=f_j(x_j,c_j,\gamma_j) \]
be some finitely convergent algorithm for solving the inhomogeneous inequalities
\[ (x,c_j)+\gamma_j>0 \]
for \(x_1\in G\). Then, according to the definition in item 1°, the superposition of the indicated algorithms
\[ x_{j+1}=f_j\bigl[x_j,c_j(x_j,a_j),\gamma_j(x_j,a_j)\bigr] \]
will, for \(x_1\in G\), be a finitely convergent algorithm for solving the inequalities
\[ \varphi(x,a_j)>0.\; ** \]
In this way it is easy, for example, to obtain simple finitely convergent algorithms for solving the inequalities
\[ \varkappa_j+2(h_j,x)+(H_jx,x)>0, \]
where the \(\varkappa_j\) are real numbers and \(H_j=H_j^*>0\).
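The superposition described in this item can be illustrated on the simplest convex sets, open balls \(E_j=\{x:\ |x-p_j|<R_j\}\). The Python sketch below is an illustration under stated assumptions and is not taken from the paper: the tangent-plane oracle `separating_plane`, the driver `solve_balls`, the use of the Theorem 2 step with `rho(k) = 1/(k+1)` and \(\beta=1\), and the two example balls are all hypothetical choices.

```python
import math
from typing import Callable, List, Sequence, Tuple

Ball = Tuple[Sequence[float], float]       # (center p_j, radius R_j)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def separating_plane(z: Sequence[float], center: Sequence[float],
                     radius: float) -> Tuple[List[float], float]:
    """For z outside the open ball, return (c, gamma) with
    (z, c) + gamma <= 0 and (x, c) + gamma > 0 for every x in the ball."""
    d = [zi - pi for zi, pi in zip(z, center)]
    dist = math.sqrt(dot(d, d))
    u = [di / dist for di in d]            # unit vector from the center towards z
    c = [-ui for ui in u]
    gamma = dot(center, u) + radius        # plane tangent to the ball
    return c, gamma


def solve_balls(
    balls: Sequence[Ball],
    x1: Sequence[float],
    beta: float = 1.0,
    rho: Callable[[int], float] = lambda k: 1.0 / (k + 1),
    max_sweeps: int = 10_000,
) -> Tuple[List[float], int]:
    """Oracle for the plane, then one Theorem 2 step for that plane."""
    x, k = list(x1), 0
    for _ in range(max_sweeps):
        changed = False
        for center, radius in balls:
            d = [xi - pi for xi, pi in zip(x, center)]
            if dot(d, d) < radius * radius:            # x already in E_j
                continue
            c, gamma = separating_plane(x, center, radius)
            eta = dot(x, c) + gamma                    # <= 0 by construction
            zeta = rho(k) - beta * eta / dot(c, c)
            x = [xi + zeta * ci for xi, ci in zip(x, c)]
            k += 1
            changed = True
        if not changed:
            return x, k
    raise RuntimeError("no full sweep without corrections within max_sweeps")


# Illustrative data (assumed): two overlapping discs in the plane.
balls = [([0.0, 0.0], 2.0), ([3.0, 0.0], 2.0)]
x, corrections = solve_balls(balls, x1=[5.0, 5.0])
print(x, corrections)
```

Even though the two balls form a cyclic system, the planes produced by the oracle depend on the current point, so the derived linear system is not cyclic in general.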
The author expresses his sincere gratitude to Academician V. I. Smirnov for a number of valuable comments.
Leningrad State University
named after A. A. Zhdanov
Received
6 VI 1965
CITED LITERATURE
1. M. A. Aizerman, E. M. Braverman, L. I. Rozonoer, Avtomatika i telemekh., 25, No. 6 (1964).
2. V. A. Yakubovich, Collection Computational Techniques and Programming Problems, L., 1965.
3. A. B. J. Novikoff, Report at the Symposium on Math. Theory of Automata, Polytechnic Inst., Brooklyn, April, 1962, p. 24.
4. F. Rosenblatt, Proc. IRE, 42, No. 3 (1960).
5. S. Agmon, Canad. J. Math., 6, No. 3 (1954).
6. T. S. Motzkin, I. J. Schoenberg, Canad. J. Math., 6, No. 3 (1954).
7. Yu. I. Merzlyakov, Zhurn. vychislit. matem. i matem. fiz., 2, No. 3, 482 (1962).
8. I. I. Eremin, DAN, 160, No. 5 (1965).
9. V. N. Fomin, Collection Computational Techniques and Programming Problems, L., 1965.
* With a regular fall of the points \(x_j\) into the strips \(\varepsilon\leq|\eta_j|<\varepsilon+\delta\), convergence of the algorithm will be slow; however, the “probability” of such a fall for sufficiently small \(\delta>0\) is small.
** Everything said is also valid for algorithms with an “additional parameter” of the type formulated in Theorem 3. We note that in the case of a cyclic system of inequalities \(\varphi(x,a_j)>0\), the system \((x,c_j)+\gamma_j>0\) will not, generally speaking, be cyclic; whence also follows the necessity of considering the noncyclic case.