UDC 512.25/.26+519.3:330.115
CYBERNETICS AND CONTROL THEORY
Submitted 1966-01-01 | RussiaRxiv: ru-196601.23776 | Translated from Russian

I. I. EREMIN, Vl. D. MAZUROV

AN ITERATIVE METHOD FOR SOLVING A CONVEX PROGRAMMING PROBLEM

(Presented by Academician A. I. Mal’tsev, 8 XII 1965)

We consider an iterative method for solving a convex programming problem, i.e., the problem of minimizing a linear function \((c,x)\) on a closed convex set \(M \subset R^n\). The set \(M\) may be specified, in particular, as the set of solutions of a system of inequalities

\[ f_j(x) \leq 0 \qquad (j=1,2,\ldots,l), \tag{1} \]

where \(f_j(x)\) are convex functions defined on \(R^n\).

I. An operator \(\varphi\), defined on \(R^n\) with range in \(R^n\), will be called Fejér with respect to the closed convex set \(M\) if, for \(p \notin M\), the inequality \(|\varphi(p)-y|<|p-y|\) holds for all \(y \in M\), and \(\varphi(p)=p\) for \(p \in M\). Let us indicate some properties of the sequence \(\{\varphi^k(p)\}\) generated by the point \(p\) and the operator \(\varphi\):

\(1^\circ\). If the set \(M\) is \(n\)-dimensional, then \(\{\varphi^k(p)\} \to p' \in R^n\).

\(2^\circ\). If at least one limit point \(p'\) of the sequence \(\{\varphi^k(p)\}\) belongs to \(M\), then \(\{\varphi^k(p)\} \to p'\).

\(3^\circ\). If the operator \(\varphi\) is continuous, then \(\{\varphi^k(p)\} \to p' \in M\).

We note that property \(2^\circ\) characterizes an arbitrary Fejér sequence \(\{p_k\}\) (with respect to \(M\)), i.e., one such that \(|p_{k+1}-y|<|p_k-y|\) for all \(k\) and \(y \in M\).

A Fejér operator \(\varphi\) (with respect to \(M\)) will be called \(c\)-Fejér if, for any \(p \in R^n\), the relation \(\{\varphi^k(p)\}\to p' \in M\) is valid. By property \(3^\circ\), every continuous Fejér operator is therefore \(c\)-Fejér.

We give examples of \(c\)-Fejér mappings associated with the set \(M\) of solutions of the system of inequalities (1) \((^1)\). We shall assume the functions \(f_j(x)\) in system (1) to be smooth.

Let \(d(x)\) be a real continuous convex function defined on \(R^n\) and having the property \(\{x\mid d(x)\leq 0\}=M\); let \(e(x)\) be a vector function that does not take zero values outside \(M\) and is bounded on every bounded set.

We shall say that the functions \(d(x)\) and \(e(x)\) satisfy condition \((*)\) if, for \(p \notin M\), the half-space corresponding to the inequality \((e(p),x-p)+d(p)\leq 0\) contains \(M\). We give examples of defining functions \(d(x)\) and \(e(x)\) satisfying condition \((*)\):

1) \(d(x)=\max_j f_j(x)\), \(e(x)=\nabla f_{j_x}(x)\), where \(\nabla\) denotes the gradient, \(j_x \in I(x)=\{k\mid d(x)=f_k(x)\}\);

2)

\[ d(x)=\sum_{j\in s(x)} k_j f_j(x), \qquad e(x)=\sum_{j\in s(x)} k_j \nabla f_j(x), \]

where \(k_j\) \((j=1,2,\ldots,l)\) is a system of positive constants, \(s(x)=\{k\mid f_k(x)>0\}\);

3)

\[ d(x)=\sum_{j\in s(x)} f_j^2(x), \qquad e(x)=\sum_{j\in s(x)} f_j(x)\nabla f_j(x) \]

(here, as in 2), we set \(d(x)=0\) if \(s(x)\) is empty).
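
The three choices of \(d(x)\) and \(e(x)\) listed above can be sketched in code as follows. This is only an illustration under stated assumptions: the two-constraint system (a unit disc and a half-plane), the weights \(k_j=1\), and all function names are hypothetical and do not come from the paper. The helper `star_gap` evaluates the left-hand side of condition \((*)\) at a point \(y\), so that the condition can be checked numerically on sample points of \(M\).

```python
# Hypothetical system (1) with two smooth convex constraints (an assumption):
#   f_1(x) = x_1^2 + x_2^2 - 1,   f_2(x) = x_1 + x_2 - 1.
def f(x):
    return [x[0] ** 2 + x[1] ** 2 - 1.0, x[0] + x[1] - 1.0]


def grad_f(x):
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, 1.0]]


def choice_max(x):
    """Choice 1: d = max_j f_j, e = gradient of one active constraint."""
    vals = f(x)
    j = max(range(len(vals)), key=lambda i: vals[i])
    return vals[j], grad_f(x)[j]


def choice_weighted(x, k=(1.0, 1.0)):
    """Choice 2: positively weighted sum over the violated constraints s(x)."""
    vals, grads = f(x), grad_f(x)
    s = [j for j, v in enumerate(vals) if v > 0.0]
    d = sum(k[j] * vals[j] for j in s)
    e = [sum(k[j] * grads[j][i] for j in s) for i in range(len(x))]
    return d, e


def choice_squares(x):
    """Choice 3: d = sum of f_j^2, e = sum of f_j * grad f_j over s(x)."""
    vals, grads = f(x), grad_f(x)
    s = [j for j, v in enumerate(vals) if v > 0.0]
    d = sum(vals[j] ** 2 for j in s)
    e = [sum(vals[j] * grads[j][i] for j in s) for i in range(len(x))]
    return d, e


def star_gap(choice, p, y):
    """Value of (e(p), y - p) + d(p); condition (*) requires <= 0 for y in M."""
    d, e = choice(p)
    return sum(ei * (yi - pi) for ei, yi, pi in zip(e, y, p)) + d
```

For a point \(p\notin M\) such as `[2.0, 1.0]`, calling `star_gap(choice_max, p, y)` with any feasible `y` should return a nonpositive number, which is the convexity inequality behind condition \((*)\).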

If the functions \(d(x)\) and \(e(x)\) satisfy condition \((*)\), then the mapping
\[ \varphi:\varphi(p)=p-\lambda \frac{d(p)}{|e(p)|^2}e(p) \]
for \(d(p)>0\), and \(\varphi(p)=p\) if \(d(p)\leq 0\) \((0<\lambda<2)\), is \(c\)-Fejér with respect to the set \(M\). If, moreover, the role of \(M\) is played by the set of solutions of the system of inequalities (1), then the choice of the functions \(d(x)\) and \(e(x)\) may be made by any of the methods listed above.
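
A minimal sketch of the mapping \(\varphi\) just defined, assuming the unit disc as \(M\) with \(d(x)=|x|^2-1\) and \(e(x)=\nabla d(x)=2x\). The set, the sample points, and the names `fejer_step`, `d`, `e` are illustrative assumptions rather than anything fixed by the paper.

```python
import math


def fejer_step(p, d, e, lam=1.0):
    """phi(p) = p - lam * d(p) / |e(p)|^2 * e(p) if d(p) > 0, else p (0 < lam < 2)."""
    dp = d(p)
    if dp <= 0.0:
        return list(p)
    ep = e(p)
    norm2 = sum(v * v for v in ep)
    return [pi - lam * dp / norm2 * vi for pi, vi in zip(p, ep)]


# Illustrative set (an assumption): unit disc M = {x : |x|^2 - 1 <= 0}.
def d(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0


def e(x):
    return [2.0 * x[0], 2.0 * x[1]]


p = [3.0, -2.0]
q = fejer_step(p, d, e, lam=1.5)
# Fejer property: q is strictly closer than p to each sampled point of M.
closer = all(math.dist(q, y) < math.dist(p, y)
             for y in ([0.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.6, 0.8]))
```

The check on `closer` samples only a few points of \(M\), so it illustrates the Fejér inequality but does not prove it.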

The mapping
\[ \varphi_j:\ \varphi_j(p)=p-\lambda \frac{f_j(p)}{|\nabla f_j(p)|^2}\nabla f_j(p) \]
for \(f_j(p)>0\), and \(\varphi_j(p)=p\) if \(f_j(p)\leq 0\), \(0<\lambda<2\), is a continuous Fejér mapping with respect to
\[ M_j=\{x\mid f_j(x)\leq 0\} \]
\((^1)\). Hence it follows that the mapping
\[ \varphi=\varphi_l\varphi_{l-1}\cdots\varphi_1, \]
being a continuous Fejér mapping with respect to
\[ \bigcap_j M_j=M, \]
will be \(c\)-Fejér with respect to \(M\).
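
The composition \(\varphi=\varphi_l\cdots\varphi_1\) amounts to one sweep over the constraints, relaxing each violated inequality in turn. The sketch below assumes a hypothetical two-constraint system (unit disc and half-plane) and \(\lambda=1\); the names `relax` and `sweep` are illustrative.

```python
import math


def relax(p, fj, grad_fj, lam=1.0):
    """phi_j(p) = p - lam * f_j(p) / |grad f_j(p)|^2 * grad f_j(p) if f_j(p) > 0."""
    val = fj(p)
    if val <= 0.0:
        return list(p)
    g = grad_fj(p)
    norm2 = sum(v * v for v in g)
    return [pi - lam * val / norm2 * gi for pi, gi in zip(p, g)]


def sweep(p, constraints, lam=1.0):
    """phi = phi_l ... phi_1: phi_1 is applied first, phi_l last."""
    for fj, grad_fj in constraints:
        p = relax(p, fj, grad_fj, lam)
    return p


# Hypothetical system (an assumption): unit disc and the half-plane x_1 + x_2 <= 1.
constraints = [
    (lambda x: x[0] ** 2 + x[1] ** 2 - 1.0, lambda x: [2.0 * x[0], 2.0 * x[1]]),
    (lambda x: x[0] + x[1] - 1.0, lambda x: [1.0, 1.0]),
]

p = [3.0, 2.0]
q = sweep(p, constraints)
```

Repeating `sweep` generates the sequence \(\{\varphi^k(p)\}\); for this example a single sweep already lands in \(M\) up to rounding.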

II. Let us proceed to the consideration of problem C: find
\[ \tilde m=\operatorname{Inf}_{x\in M}(c,x), \]
where \(M\) is some convex closed set in \(R^n\). The optimal set
\[ \tilde M=\{x\in M\mid (c,x)=\tilde m\} \]
will be assumed nonempty and bounded (condition C3, \((^2)\)). Note that
\[ \tilde M=M\cap P=M\cap \bar P, \]
where \(P\) is the half-space corresponding to the inequality \((c,x)\leq \tilde m\), and \(\bar P\) is its boundary hyperplane. The vector \(c\) is assumed normalized.

Let \(\varphi\) be an arbitrary \(c\)-Fejér operator with respect to the set \(M\), and let \(d(x)\) be some continuous convex function for which
\[ \{x\mid d(x)\leq 0\}=M. \]

Define a sequence \(\{p_k\}\) (for arbitrary \(p_0\in R^n\)) by the relation
\[ p_{k+1}=\varphi^{n_k}(p_k)-\lambda_k c, \tag{2} \]
where \(n_k\) is a natural number, chosen in one way or another for each \(k\); \(\lambda_k\in(0,\lambda_0]\), \(\lambda_0>0\). Here the choice of \(n_k\) will be determined by the condition
\[ d\left[\varphi^{n_k}(p_k)\right]\leq \lambda_k \tag{3} \]
(since \(\varphi\) is a \(c\)-Fejér operator, such a choice of \(n_k\) is possible).
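
Relations (2) and (3) can be sketched as a short loop. The example below is an illustration under assumptions that are not in the paper: \(M\) is the unit disc with \(d(x)=|x|^2-1\), \(\varphi\) is the relaxation map with parameter 1, \(c=(1,0)\), and \(\lambda_k=1/(k+1)\) (so \(\lambda_k\to 0\) and \(\sum\lambda_k=+\infty\)). The names `phi`, `solve`, and the safeguard `max_inner` are hypothetical.

```python
import math


def d(x):
    """Defining function of the unit disc: {x : d(x) <= 0} = M (an assumption)."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def phi(p):
    """Relaxation step p - d(p) / |e(p)|^2 * e(p) with e(p) = 2p; identity on M."""
    dp = d(p)
    if dp <= 0.0:
        return list(p)
    e = [2.0 * p[0], 2.0 * p[1]]
    norm2 = e[0] ** 2 + e[1] ** 2
    return [p[0] - dp / norm2 * e[0], p[1] - dp / norm2 * e[1]]


def solve(c, p0, steps, lam=lambda k: 1.0 / (k + 1), max_inner=1000):
    """Iterate p_{k+1} = phi^{n_k}(p_k) - lam_k * c, with n_k chosen by (3)."""
    p = list(p0)
    for k in range(steps):
        lk = lam(k)
        q = phi(p)                      # n_k is a natural number, so n_k >= 1
        for _ in range(max_inner):      # safeguard only; (3) is reached quickly here
            if d(q) <= lk:              # condition (3): d[phi^{n_k}(p_k)] <= lam_k
                break
            q = phi(q)
        p = [qi - lk * ci for qi, ci in zip(q, c)]
    return p


# Minimize (c, x) = x_1 over the unit disc; the optimal set is the point (-1, 0).
p = solve([1.0, 0.0], [0.5, 0.5], steps=2000)
err = math.hypot(p[0] + 1.0, p[1])
```

With this step-size rule the distance `err` to the optimal point shrinks roughly in proportion to \(\lambda_k\), so the sketch is meant to show the structure of the iteration rather than an efficient solver.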

Let
\[ S(\varepsilon)=\{x\mid (c,x)-\tilde m\leq \varepsilon,\ d(x)\leq \varepsilon\},\qquad \varepsilon>0. \]
The set \(S(\varepsilon)\) is bounded (this follows from Theorem 1, Ch. 7 of \((^2)\)), and moreover \(S(\varepsilon)\to \tilde M\) as \(\varepsilon\to 0\). Introduce the notation
\[ \delta(\varepsilon)=\sup_{x\in S(\varepsilon)} |x-\tilde M| \]
and
\[ D(\varepsilon)=\delta(\varepsilon)+2\varepsilon \]
(\(|x-\tilde M|\) is the distance from the point \(x\) to \(\tilde M\)). Note the obvious relation
\[ D(\varepsilon)\to 0\quad \text{as }\varepsilon\to 0. \tag{4} \]

Theorem. Suppose that for problem C condition C3 is satisfied. If the sequence \(\{p_k\}\) is defined by relation (2), with \(n_k\) chosen in accordance with (3), then:

1) for \(\lambda_k=\varepsilon>0\) there exists a natural number \(N_0\) such that for \(k\geq N_0\)
\[ |p_k-\tilde M|\leq D(\varepsilon), \]
and therefore
\[ |(c,p_k)-\tilde m|\leq D(\varepsilon); \]

2) for \(\lambda_k\to 0\) and \(\sum \lambda_k=+\infty\),
\[ |p_k-\tilde M|\to 0 \]
and, consequently,
\[ (c,p_k)\to \tilde m \]
as \(k\to +\infty\).

Proof. Consider three cases.

Case 1. Suppose that for almost all \(k\) (i.e., starting from some \(k_0\)) \(p_k\in P\). Let
\[ \bar p_k=\varphi^{n_k}(p_k). \]
Since \(d(\bar p_k)\leq \lambda_k\) (see (3)) and
\[ (c,\bar p_k)-\tilde m=(c,p_{k+1}+\lambda_k c)-\tilde m=(c,p_{k+1})-\tilde m+\lambda_k\leq \lambda_k, \]
we have \(\bar p_k\in S(\lambda_k)\). Therefore
\[ |p_{k+1}-\tilde M|\leq |\bar p_k-\tilde M|+\lambda_k\leq \delta(\lambda_k)+\lambda_k<D(\lambda_k), \qquad k\geq k_0. \]
From this relation the validity of both assertions of the theorem for the case under consideration is easily seen.

Case 2. Suppose that for \(k\geq k_0\) (\(k_0\) some natural number) \(p_k\notin P\). We first prove the inequality
\[ \lambda_k\left(2|\bar p_k-P|-\lambda_k\right)\leq |p_k-\tilde M|^2-|p_{k+1}-\tilde M|^2,\qquad k\geq k_0. \tag{5} \]
Denote by \(q_k\) the projection of the point \(\bar p_k\) onto \(\tilde M\). We have
\[ |p_{k+1}-\tilde M|^2\leq |p_{k+1}-q_k|^2 =|\bar p_k-\lambda_k c-q_k|^2 =|\bar p_k-q_k|^2+\lambda_k^2-2\lambda_k(\bar p_k-q_k,c). \]

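Further, since \(\varphi\) is a Fejér operator and \(\widetilde M\subset M\), we have \(|\bar p_k-y|\le |p_k-y|\) for every \(y\in\widetilde M\); taking \(y\) to be the projection of \(p_k\) onto \(\widetilde M\) gives \(|\bar p_k-q_k|=|\bar p_k-\widetilde M|\le |p_k-\widetilde M|\). Let us also note that \((\bar p_k-q_k,c)=|\bar p_k-P|\) (this follows from the fact that \(q_k\in \bar P\), \(|c|=1\)). We can now rewrite the inequality obtained above in the form
\[ |p_{k+1}-\widetilde M|^2\le |p_k-\widetilde M|^2+\lambda_k^2-2\lambda_k|\bar p_k-P|. \]
But this differs only in form from inequality (5), which was to be proved.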

Since \(p_{k+1}\notin P\), \(k\ge k_0\), we have \(|p_{k+1}-y|<|\bar p_k-y|\) for any \(y\in P\supset \widetilde M\). Moreover, \(|\bar p_k-y|\le |p_k-y|\) for arbitrary \(y\in \widetilde M\). Therefore the sequence \(\{p_k,\bar p_k\}\), \(k\ge k_0\), is a Fejér sequence (if possible repetitions of its terms are disregarded) with respect to \(\widetilde M\), and consequently is bounded.

We shall show next that, when \(\lambda_k=\varepsilon>0\), the case 2 under consideration is impossible. Indeed, from (5) it follows that
\[ \sum_{k_0}^{N}\varepsilon(2|\bar p_k-P|-\varepsilon)\le |p_{k_0}-\widetilde M|^2-|p_{N+1}-\widetilde M|^2,\quad N>k_0. \]
Since \(p_{k+1}=\bar p_k-\varepsilon c\), and \(p_{k+1}\notin P\), \(k\ge k_0\), it is obvious that \(|\bar p_k-P|>|\bar p_k-p_{k+1}|=\varepsilon\). Hence \(\varepsilon<2|\bar p_k-P|-\varepsilon\), and consequently
\[ (N-k_0)\varepsilon^2\le \sum_{k_0}^{N}\varepsilon(2|\bar p_k-P|-\varepsilon)\le |p_{k_0}-\widetilde M|^2+|p_{N+1}-\widetilde M|^2. \]
By boundedness of \(|p_{N+1}-\widetilde M|^2\) (uniformly in \(N\)), the last inequality is contradictory for sufficiently large \(N\).

Let now the conditions \(\lambda_k\to 0\) and \(\sum\lambda_k=+\infty\) be satisfied. Since \(d(\bar p_k)\le \lambda_k\), it follows that
\[ |\bar p_k-M|\to 0\quad\text{as } k\to+\infty. \tag{6} \]
If, moreover, \(\operatorname{Inf}_{k}|\bar p_k-P|=0\), then one of the limit points \(p'\) of the Fejér sequence \(\{\bar p_k\}\) will lie in \(P\), and therefore, by virtue of (6), also in \(P\cap M=\widetilde M\). But then \(p'\) will be the unique limit point for the sequence \(\{p_k,\bar p_k\}\), and \(|p_k-\widetilde M|\to |p'-\widetilde M|=0\), as required. Let us show that the relation \(\operatorname{Inf}_{k}|\bar p_k-P|=\alpha>0\) is impossible. From (5) it follows that
\[ \sum_{k_0}^{N}\lambda_k(2|\bar p_k-P|-\lambda_k)\le |p_{k_0}-\widetilde M|^2-|p_{N+1}-\widetilde M|^2,\quad k_0<N. \tag{7} \]
Since \(\lambda_k\to 0\), starting from sufficiently large \(k\) the relation \(2|\bar p_k-P|-\lambda_k\ge \alpha/2\) will hold. The latter leads to the conclusion that the series \(\sum_k \lambda_k(2|\bar p_k-P|-\lambda_k)\) diverges (the series \(\sum_k\lambda_k\) diverges!). But this contradicts the fact that the right-hand side of inequality (7) is bounded (uniformly in \(N\)). The consideration of case 2 is complete.

Case 3. It remains for us to consider the case when infinitely many elements of the sequence \(\{p_k\}\) are contained both in \(P\) and outside \(P\). In the sequence \(\{p_k\}\), select the segments \(\Delta_s=\{p_{s'},\ldots,p_{s''}\}\) according to the following property: \(p_k\in P\) for \(p_k\in\Delta_s\), but \(p_{s''+1}\notin P\). Denote by \(\overline{\Delta}_s\) the segment of elements between \(\Delta_s\) and \(\Delta_{s+1}\). Let now \(p_{k+1}\) be an arbitrary element of \(\{p_t\}\), and let \(s\) be the number for which either \(p_{k+1}\in\Delta_s\), or \(p_{k+1}\in\overline{\Delta}_s\). If \(p_{k+1}\in\Delta_s\), then it is easy to see the inclusion \(\bar p_k\in S(\lambda_k)\), and therefore
\[ |p_{k+1}-\widetilde M|\le \delta(\lambda_k)+\lambda_k. \tag{8} \]
If, however, \(p_{k+1}\in\overline{\Delta}_s\), then
\[ |p_{k+1}-\widetilde M|\le |p_{s''+1}-\widetilde M|\le |\bar p_{s''}-\widetilde M|+\lambda_{s''}, \]
and consequently, since \(p_{s''}\in\Delta_s\),
\[ |p_{k+1}-\widetilde M|\le \delta(\lambda_{s''-1})+\lambda_{s''-1}+\lambda_{s''}. \]
The inequality obtained, together with (8), makes it possible to conclude that the theorem is valid also for the case 3 under consideration.

Let us note that the method described, as applied to the linear programming problem and for a certain particular choice of the operator \(\varphi\), is close in idea to the method of \((^3)\).

III. In practical problems of convex (in particular, linear) programming, it is most often necessary to find an optimal solution among the nonnegative solutions of system (1). Of course, one could include in system (1) the inequalities \(-x_i \leqslant 0\) \((i = 1, \ldots, n)\), where \((x_1, \ldots, x_n)=x\), and apply the method described above to the resulting system. However, it would be desirable to take the condition \(x \geqslant 0\) into account in the method more economically, and this is in fact possible. It suffices to replace relation (2) by

\[ p_{k+1}=\bigl[\varphi^{n_k}(p_k)-\lambda_k c\bigr]^+; \]

here the superscript plus denotes replacing the negative coordinates of the vector by zeros. The sequence \(\{p_k\}\) so obtained then solves the problem posed.
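
The modified relation can be sketched in the same way as relation (2), with a coordinatewise clipping added at the end of each step. The example is an assumption made for illustration: system (1) consists of the single inequality \(x_1+x_2-1\leq 0\), the normalized cost vector is \(c=(-1,1)/\sqrt{2}\), and \(\lambda_k=1/(k+1)\); the optimal nonnegative solution is then \((1,0)\). The names `phi`, `solve_nonneg`, and `max_inner` are hypothetical.

```python
import math


def d(x):
    """Single constraint of the hypothetical system (1): x_1 + x_2 - 1 <= 0."""
    return x[0] + x[1] - 1.0


def phi(p):
    """Relaxation step with parameter 1, i.e. projection onto {x : d(x) <= 0}."""
    dp = d(p)
    if dp <= 0.0:
        return list(p)
    return [p[0] - dp / 2.0, p[1] - dp / 2.0]   # grad d = (1, 1), |grad d|^2 = 2


def solve_nonneg(c, p0, steps, lam=lambda k: 1.0 / (k + 1), max_inner=1000):
    """Iterate p_{k+1} = [phi^{n_k}(p_k) - lam_k * c]^+ with n_k chosen by (3)."""
    p = list(p0)
    for k in range(steps):
        lk = lam(k)
        q = phi(p)
        for _ in range(max_inner):      # safeguard only
            if d(q) <= lk:
                break
            q = phi(q)
        # [.]^+ : negative coordinates are replaced by zeros
        p = [max(0.0, qi - lk * ci) for qi, ci in zip(q, c)]
    return p


s = 1.0 / math.sqrt(2.0)
p = solve_nonneg([-s, s], [0.0, 0.0], steps=2000)
```

In this sketch the inequalities \(-x_i\leq 0\) never enter the operator \(\varphi\); the condition \(x\geq 0\) is enforced only by the clipping, which is the economy the text refers to.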

Sverdlovsk Branch of the V. A. Steklov Mathematical Institute,
Academy of Sciences of the USSR

Received 6 XII 1965

REFERENCES CITED

\(^{1}\) I. I. Eremin, DAN, 160, No. 5, 994 (1965).
\(^{2}\) G. Zoutendijk, Methods of Feasible Directions, IL, 1963.
\(^{3}\) V. A. Bulavskii, DAN, 137, No. 2, 258 (1961).
