Abstract
Full Text
UDC 519.25/26+519.3:330.115
MATHEMATICS
L. M. BREGMAN
A RELAXATION METHOD FOR FINDING A COMMON POINT OF CONVEX SETS AND ITS APPLICATION TO OPTIMIZATION PROBLEMS
(Presented by Academician L. V. Kantorovich on 26 II 1966)
Let closed convex sets (A_i) ((i \in I)) be given in a linear topological space (X). Suppose that (R=\bigcap_{i\in I} A_i) is nonempty.
Let (S) be some convex set in (X) such that (S \cap R \ne \Lambda).
Consider a function (D(x,y)), defined on (S \times S) and possessing the following properties:
(I). (D(x,y)\ge 0); (D(x,y)=0) if and only if (x=y).
(II). For each (i\in I) and (y\in S) there exists a point (x=P_i y \in A_i\cap S) such that (D(x,y)\le D(z,y)) for all (z\in A_i\cap S). We shall call this point (x) the (D)-projection of the point (y) onto the set (A_i).
(III). For each (i\in I) and (y\in S) the function
[
G(z)=D(z,y)-D(z,P_i y)
]
is convex on (A_i\cap S).
(IV). There exists the derivative (D_x'(x,y)) of the function (D(x,y)) at (x=y), and moreover
[
D_x'(y,y)=0
\quad
\left(
\text{i.e. } \lim_{t\to 0}\frac{D(y+tz,y)}{t}=0
\text{ for all } z\in X
\right).
]
(V). For each (z\in R\cap S) and for each real number (L), the set (T={x\mid D(z,x)\le L}) is compact.
(VI). If (D(x_n,y_n)\to 0), (y_n\to y^\in \bar S), and the set of elements of the sequence ({x_n}) is compact, then (x_n\to y^).
Lemma 1. Let (z\in A_i\cap S). Then for any (y\in S) the inequality holds
[
D(P_i y,y)\le D(z,y)-D(z,P_i y).
]
Proof. According to condition (III), we have
[
D(\lambda z+(1-\lambda)P_i y,y)
-
D(\lambda z+(1-\lambda)P_i y,P_i y)
\le
]
[
\le
\lambda\bigl(D(z,y)-D(z,P_i y)\bigr)
+
(1-\lambda)D(P_i y,y)
]
for all (\lambda\in[0,1]). Hence for (\lambda>0) we obtain
[
D(z,y)-D(z,P_i y)-D(P_i y,y)\ge
]
[
\ge
\frac{D(\lambda z+(1-\lambda)P_i y,y)-D(P_i y,y)}{\lambda}
-
\frac{D(\lambda z+(1-\lambda)P_i y,P_i y)}{\lambda}
\ge
]
[
\ge
-
\frac{D(\lambda z+(1-\lambda)P_i y,P_i y)}{\lambda}.
]
Passing to the limit in this inequality as (\lambda\to 0), we obtain
[
D(z,y)-D(z,P_i y)-D(P_i y,y)\ge 0.
]
The lemma is proved.
Consider the following iterative process: take an arbitrary point (x_0 \in S), then choose (i_1(x_0) \in I) and find a point (x_1)—the (D)-projection of the point (x_0) onto the set (A_{i_1(x_0)}); then choose (i_2(x_1) \in I) and find a point (x_2)—the (D)-projection of the point (x_1) onto the set (A_{i_2(x_1)}), and so on. The sequence ({x_n}) obtained as a result of this process will be called a relaxation sequence, and the set of functions ({i_1(x), i_2(x), \ldots}) will be called a relaxation control (cf. (1)).
Lemma 2. For any relaxation control:
- The set of elements of the relaxation sequence ({x_n}) is compact.
- (D(x_{n+1}, x_n) \to 0) as (n \to \infty).
- For any (z \in R \cap S) there exists (\lim\limits_{n \to \infty} D(z, x_n)).
Proof. Take (z \in R \cap S). By Lemma 1,
[
D(x_{n+1}, x_n) \leq D(z, x_n) - D(z, x_{n+1}).
\tag{1}
]
Since (D(x_{n+1}, x_n) \geq 0), it follows that (D(z, x_{n+1}) \leq D(z, x_n)), and consequently (\lim D(z, x_n)) exists; together with (1) this gives (D(x_{n+1}, x_n) \to 0). Since the set of elements of the relaxation sequence is contained in the set
[
T={x \mid D(z,x) \leq D(z,x_0)},
]
and, according to (V), the set (T) is compact, assertion 1 is true.
Let us now consider some relaxation controls.
Theorem 1. Let (I={1,2,\ldots,m}), and let the indices be chosen in cyclic order, i.e.
[
i_1(x)=1,\quad i_2(x)=2,\ldots,\quad i_m(x)=m,\quad i_{m+1}(x)=1,\ldots
]
Then any limit point (x^*) of the relaxation sequence ({x_n}) is a common point of the sets (A_i).
Proof. By Lemma 2, from the sequences ({x_{nm+i}} \subset A_i) one can extract convergent subsequences. Let (x_{n_k m+i} \to x_i^ \in A_i). According to Lemma 2,
[
D(x_{n_k m+2}, x_{n_k m+1}) \to 0.
]
By property (VI), (x_{n_k m+1} \to x_2^). Therefore, (x_1^ = x_2^). Analogously one can show that
[
x_2^ = x_3^,\quad x_3^ = x_4^,\ldots,\quad x_{m-1}^ = x_m^.
]
Consequently, the limit point
[
x^ = x_1^ = \cdots = x_m^* \in R.
]
Theorem 2. Suppose that for each (y \in S) there exists
[
\max_{i \in I} \min_{x \in A_i} D(x,y)=D(P_{i(y)}y,y)
]
and let (i_n(x)=i(x)). Then any limit point (x^*) of the relaxation sequence ({x_n}) is a common point of the sets (A_i).
Proof. Let (x_{n_k} \to x^*). Let (y_{n_k}^i = P_i x_{n_k} \in A_i). Then
[
D(y_{n_k}^i, x_{n_k}) \leq \max_{i \in I} D(y_{n_k}^i, x_{n_k}) = D(x_{n_k+1}, x_{n_k}).
]
Consequently, (D(y_{n_k}^i, x_{n_k}) \to 0) for all (i \in I). Since for any (z \in R \cap S)
[
D(z, y_{n_k}^i) \leq D(z, x_{n_k}) \leq D(z, x_0),
]
then, according to (V), the set ({y_{n_k}^i}) is compact. Consequently, by property (VI), (y_{n_k}^i \to x^) for all (i \in I). Hence it follows that (x^ \in R).
Remark. In many cases the relaxation sequence ({x_n}) has a unique limit point. This is the case, for example, if the following condition is fulfilled:
(A). The set (S) is closed, and for any (z_1,z_2 \in R \cap S) the function
[
H(y)=D(z_1,y)-D(z_2,y)
]
is continuous on (S).
In fact, let (x^) and (x^{}) be limit points of the relaxation sequence ({x_n}) and let (x^, x^{} \in R). Let (x_{n_k} \to x^*), (x_{n_l} \to x^{}). By Lemma 2, there exists
[
\lim H(x_n)=\lim \bigl(D(x^,x_n)-D(x^{*},x_n)\bigr),
]
[
\lim H(x_{n_k})=-D(x^{*},x^)\leq 0,
]
[
\lim H(x_{n_l})=D(x^,x^{*})\geq 0.
]
Since (\lim H(x_{n_k})=\lim H(x_{n_l})=\lim H(x_n)), it follows that (D(x^{},x^*)=0), and hence (x^{}=x^*).
Let us consider some examples of functions satisfying conditions (I)—(VI).
-
Let (X) be a real Hilbert space. The function
[
D(x,y)=(x-y,x-y)
]
satisfies conditions (I)—(VI) and condition (A) for any system of closed convex sets (A_i), if convergence is understood as weak convergence, and compactness as weak compactness. In this case the (D)-projection onto a convex set coincides with the usual projection. The corresponding relaxation sequence ({x_n}) under the conditions of Theorem 1 or 2 will converge weakly to an element (x^*\in R). In this case the relaxation method coincides with the method of successive projection ((^2)). -
Let (f(x)) be a strictly convex twice differentiable function defined on some convex closed set (S\subset E^n). Let (g(x)) be the gradient of this function at the point (x). Consider the function
[
D(x,y)=f(x)-f(y)-(g(y),x-y).
]
If this function satisfies condition (V), then, as is easy to verify, it satisfies conditions (I)—(IV), (VI) and (A) for any system of closed convex sets (A_i).
Suppose the following conditions are fulfilled:
1) The absolute minimum of the function (f(x)) is attained at a point (x_0) lying in the interior of (S).
2) The sets (A_i) are hyperplanes, i.e.
[
A_i=\left{x\in E^n\ \middle|\ \sum_{j=1}^{n} a_{ij}x_j=b_i\right},
]
and the set (I) is finite.
3) If (y) belongs to the interior of (S), then the (D)-projection of the point (y) onto any set (A_i) also belongs to the interior of (S).
In this case one can show that the corresponding relaxation sequence with initial approximation (x_0) converges to the point (x^*), which delivers the minimum of the function (f(x)) under the conditions
[
\sum_{j=1}^{n} a_{ij}x_j=b_i
]
((i\in I)).
- Let
[
D(x,y)=\sum_{j=1}^{n}\bigl(y_j-x_j+x_j(\ln x_j-\ln y_j)\bigr)
]
be given on the set
[
{x\geq 0,\ y>0}.
]
This function satisfies conditions (I)—(VI), but does not satisfy condition (A). Nevertheless, in this case too the relaxation sequence will have a unique limit point.
Indeed, if (x^\in R) is a limit point of the sequence ({x_n}) and (x_{n_k}\to x^), then (\lim D(x^,x_{n_k})=0), and since (\lim D(x^,x_n)) exists, by property (VI) (x_n\to x^*).
If
[
A_i=\left{x \,\middle|\, \sum_{j=1}^{n} a_{ij}x_j=b_i\right},\qquad
\left(\sum_{j=1}^{n} a_{ij}^{2}>0\right),
]
then (x^*) maximizes
[
\sum_{j=1}^{n} x_j \ln \frac{p_j}{x_j}\quad (p_j>0)
]
under the conditions
[
\sum_{j=1}^{n} a_{ij}x_j=b_i,\qquad x_j \geqslant 0,
]
if the vector (x_0={p_j/e}) is taken as the initial approximation. The (D)-projection (x) of the point (y) onto the set (A_i) is found from the formulas (x_j=y_j e^{\lambda a_{ij}}), where (\lambda) is the unique root of the equation
[
\sum_{j=1}^{n} a_{ij}y_j e^{\lambda a_{ij}}=b_i .
]
This equation becomes linear with respect to (e^\lambda) if all (a_{ij}) are equal to 0 or 1.
The latter occurs, for example, in the following problem: maximize
[
\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}\ln \frac{p_{ij}}{x_{ij}}\quad (p_{ij}\geqslant 0)
]
under the conditions
[
\sum_{j=1}^{n} x_{ij}=a_i,\qquad \sum_{i=1}^{m} x_{ij}=b_j;
]
(x_{ij}\geqslant 0); (x_{ij}=0) if (p_{ij}=0). Such problems arise in the calculation of passenger flows in cities, and the corresponding relaxation method for this problem coincides with the method of G. V. Sheleikhovsky (see ((^3))).
Leningrad State University
named after A. A. Zhdanov
Received
13 II 1966
REFERENCES
(^1) D. K. Faddeev, V. N. Faddeeva, Computational Methods of Linear Algebra, Moscow, 1963. (^2) L. M. Bregman, DAN, 162, No. 3, 487 (1965). (^3) A. G. Dynkin, E. G. Movchan, in: Application of Mathematical Methods and Electronic Computers in Urban Planning, Kiev, 1966.