Yu. I. Lyubich
GENERAL THEOREMS ON QUADRATIC RELAXATION
(Presented by Academician L. V. Kantorovich on 20 X 1964)
Mathematics
Consider the equation
\[ Ax=f \tag{1} \]
with a positive definite bounded operator \(A\) in a real Hilbert space. It is known (see, for example, \((^1)\), pp. 12–13) that equation (1) is equivalent to the problem of minimizing the potential function
\[
\varphi(x)=(Ax,x)-2(f,x).
\]
This fact underlies a class of computational processes now commonly called relaxation processes (see, for example, \((^2)\), p. 219). Numerous works have been devoted to the theory of relaxation (see the bibliography in \((^2)\)). Apparently, however, relaxation processes have not yet been investigated in general, that is, independently of the concrete structure of the process. Nevertheless, the fundamental work of A. M. Ostrowski \((^3)\) and some works prompted by it \((^4,^5)\) do implement a general approach to some extent. The present work attempts to outline such an “abstract” theory of relaxation for equation (1) by developing the system of concepts and the technique available in \((^{3-5})\) (see also \((^2)\)). A characteristic feature of our exposition is the separation of the factors responsible for the various properties of the process (the relaxation property, convergence, rate of convergence). The substantially new results are the necessity assertions in Theorems 2 and 3 and their corollaries, and Theorem 4.
We shall regard the operator \(A\) with the properties indicated above and the vector \(f\) in (1) as fixed. Denote the solution of the equation by \(x^0\). A relaxation process (r.p.) for equation (1) is a sequence of vectors \(\{x_k\}_0^\infty\) for which
\[ \varphi(x_k)\geq \varphi(x_{k+1}) \quad (k=0,1,2,3,\ldots). \tag{2} \]
The vector \(x_k\) will be called the \(k\)-th approximation (to the solution \(x^0\)). We shall say that an r.p. converges completely if it converges to the vector \(x^0\).
Let \(\{x_k\}_0^\infty\) be an r.p. Put
\[ x_{k+1}-x_k=-\alpha_k y_k, \]
where \(\alpha_k=\|x_{k+1}-x_k\|\geq 0,\ \|y_k\|=1\). Since
\[ \Delta\varphi(x_k)\equiv \varphi(x_{k+1})-\varphi(x_k) = -2\alpha_k(u_k,y_k)+\alpha_k^2(Ay_k,y_k), \tag{3} \]
where \(u_k=Ax_k-f\) is the residual vector of the \(k\)-th approximation, property (2) is equivalent to the inequality
\[ \alpha_k^2(Ay_k,y_k)-2\alpha_k(u_k,y_k)\leq 0, \]
which, in turn, is equivalent to the representability of \(\alpha_k\) in the form
\[ \alpha_k=q_k\frac{(u_k,y_k)}{(Ay_k,y_k)}, \tag{4} \]
where* \(0 \le q_k \le 2\). The dimensionless coefficient \(q_k\) is called the relaxation multiplier. Next, introduce the angle \(\theta_k\) \((0 \le \theta_k \le \pi/2)\), setting**
\[ (u_k,y_k)=\|u_k\|\cos\theta_k, \tag{5} \]
and call it the relaxation angle. In view of (3)–(5),
\[ \Delta\varphi(x_k)=-\,\frac{q_k(2-q_k)\cos^2\theta_k}{(Ay_k,y_k)}\|u_k\|^2. \]
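As a check, the substitution can be written out in full (a sketch that uses only (3)–(5)). Inserting (4) into (3) gives
\[ \Delta\varphi(x_k) = -2q_k\frac{(u_k,y_k)^2}{(Ay_k,y_k)} + q_k^2\frac{(u_k,y_k)^2}{(Ay_k,y_k)} = -q_k(2-q_k)\frac{(u_k,y_k)^2}{(Ay_k,y_k)}, \]
and by (5) \((u_k,y_k)^2=\|u_k\|^2\cos^2\theta_k\), which yields the formula above.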
Denoting \(\Delta_k \equiv (u_k,A^{-1}u_k)\), we arrive at the formula
\[ \Delta\varphi(x_k)=-q_k(2-q_k)\cos^2\theta_k\Delta_k h_k, \tag{6} \]
where \(h_k=\frac{\|u_k\|^2}{(Ay_k,y_k)\,(u_k,A^{-1}u_k)}\). Introducing the condition number \(h(A)=\|A\|\,\|A^{-1}\|\), we have the obvious estimate
\[ [h(A)]^{-1}\le h_k\le h(A). \tag{7} \]
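Estimate (7) can be verified as follows (a sketch, assuming that \(A\) is self-adjoint as well as positive definite and bounded, so that the standard spectral bounds apply). For every unit vector \(y\) and every vector \(u\),
\[ \|A^{-1}\|^{-1} \le (Ay,y) \le \|A\|, \qquad \|A\|^{-1}\|u\|^2 \le (u,A^{-1}u) \le \|A^{-1}\|\,\|u\|^2 . \]
Hence, for \(u_k\neq 0\), the denominator of \(h_k\) lies between \([h(A)]^{-1}\|u_k\|^2\) and \(h(A)\|u_k\|^2\), and dividing \(\|u_k\|^2\) by these bounds gives (7).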
Since, by (2), there exists
\[ \varphi_\infty \equiv \lim_{k\to\infty}\varphi(x_k) =\varphi(x_0)+\sum_{k=0}^{\infty}\Delta\varphi(x_k), \]
it follows from (6) and (7) that the following theorem holds.
Theorem 1. For any r.p. the following series converges:
\[ \sum_{k=0}^{\infty} q_k(2-q_k)\cos^2\theta_k\Delta_k . \tag{8} \]
For the remainder of the series (8) it is easy to obtain a two-sided estimate. Namely, summing (6) over \(k\ge n\) \((n=0,1,2,\ldots)\) and taking (7) into account, we arrive at the inequality
\[ [h(A)]^{-1}(\varphi(x_n)-\varphi_\infty) \le \sum_{k=n}^{\infty} q_k(2-q_k)\cos^2\theta_k\Delta_k \le h(A)(\varphi(x_n)-\varphi_\infty). \tag{9} \]
We now note that \(\varphi(x)=(A(x-x^0),x-x^0)-(Ax^0,x^0)\), and therefore
\[ \varphi(x_n)=\Delta_n-(Ax^0,x^0). \tag{10} \]
Putting
\[ \delta=\sqrt{\varphi_\infty+(Ax^0,x^0)}, \tag{11} \]
we give estimate (9) the form
\[ [h(A)]^{-1}(\Delta_n-\delta^2) \le \sum_{k=n}^{\infty} q_k(2-q_k)\cos^2\theta_k\Delta_k \le h(A)(\Delta_n-\delta^2). \tag{12} \]
In order that an r.p. converge completely, it is necessary and sufficient that the residual \(u_n\) tend to zero. This is equivalent to the quantity \(\Delta_n\) tending to zero. By (10) and (2), the sequence \(\{\Delta_n\}_0^\infty\) is nonincreasing, and, evidently,
\[ \lim_{n\to\infty}\Delta_n=\delta^2. \tag{13} \]
Thus, an r.p. converges completely if and only if \(\delta=0\). The quantity \(\delta\) is naturally called the residual \(A\)-displacement.
If an r.p. converges completely, then inequality (12) takes the simpler form
\[ [h(A)]^{-1}\Delta_n \le \sum_{k=n}^{\infty} q_k(2-q_k)\cos^2\theta_k\Delta_k \le h(A)\Delta_n. \tag{14} \]
* For \((u_k,y_k)=0\) we take \(q_k=1\).
** For \(u_k=0\) we take \(\theta_k=0\).
With the aid of inequality (14) one obtains a criterion for the complete convergence of the r.p. in terms of the dimensionless parameters \(q_k, \theta_k\).
Theorem 2. In order that the r.p. converge completely, it is necessary and sufficient that
\[ \sum_{k=0}^{\infty} q_k (2-q_k)\cos^2\theta_k=\infty . \tag{15} \]
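Necessity. From (14), by virtue of the monotonicity of \(\{\Delta_n\}_0^\infty\), it follows that
\[ \sum_{k=n}^{\infty} q_k (2-q_k)\cos^2\theta_k \ge [h(A)]^{-1}\quad (n=0,1,2,\ldots). \]
The remainders of the series thus do not tend to zero, and hence the series diverges, i.e., (15) holds.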
Sufficiency. By virtue of Theorem 1 and the monotonicity of \(\{\Delta_n\}_0^\infty\), (15) implies that \(\lim\limits_{n\to\infty}\Delta_n=0\).
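The sufficiency argument can be spelled out by contradiction (a sketch; the shorthand \(a_k=q_k(2-q_k)\cos^2\theta_k\ge 0\) is introduced here for brevity and is not in the original). Since \(\Delta_k\ge\delta^2\) for all \(k\) by monotonicity and (13),
\[ \sum_{k=0}^{\infty} a_k\Delta_k \;\ge\; \delta^2\sum_{k=0}^{\infty} a_k . \]
The left-hand side is finite by Theorem 1, so if (15) holds, the inequality is possible only for \(\delta=0\).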
The corollaries of Theorem 2 listed below are obtained directly.
Corollary 1. If the r.p. converges completely, then
\[ \sum_{k=0}^{\infty} q_k=\infty,\qquad \sum_{k=0}^{\infty} (2-q_k)=\infty,\qquad \sum_{k=0}^{\infty} \cos^2\theta_k=\infty . \tag{16} \]
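Conditions (16) can be read off from termwise bounds (a sketch using only \(0\le q_k\le 2\) and \(0\le\cos^2\theta_k\le 1\)). Since \(q_k(2-q_k)=1-(1-q_k)^2\le 1\),
\[ q_k(2-q_k)\cos^2\theta_k \le 2q_k,\qquad q_k(2-q_k)\cos^2\theta_k \le 2(2-q_k),\qquad q_k(2-q_k)\cos^2\theta_k \le \cos^2\theta_k, \]
so the divergence of the series (15) forces each of the three series in (16) to diverge.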
We shall call an r.p. quasigradient if \(\inf\limits_{k\ge 0}\cos\theta_k>0\) (gradient if \(\cos\theta_k=1\) \((k=0,1,2,\ldots)\)). We shall call an r.p. strictly relaxation (abbreviated s.r.p.) if \(\inf\limits_{k\ge 0} q_k>0,\ \sup\limits_{k\ge 0} q_k<2\).
Corollary 2. In order that a quasigradient r.p. converge completely, it is necessary and sufficient that
\[ \sum_{k=0}^{\infty} q_k(2-q_k)=\infty . \tag{17} \]
Corollary 3. In order that an s.r.p. converge completely, it is necessary and sufficient that
\[ \sum_{k=0}^{\infty} \cos^2\theta_k=\infty . \tag{18} \]
Corollary 4. Every quasigradient s.r.p. converges completely.
Let us investigate the rate of convergence of an r.p. As a characteristic of the rate of convergence, introduce the quantity
\[ \chi=\limsup_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\frac{\Delta_{k+1}}{\Delta_k}, \]
which we shall call the arithmetic denominator of convergence. Clearly, \(0\le \chi\le 1\) always. If \(\chi<1\), then the r.p. converges completely, and we shall say that it converges normally*. Since, obviously,
\[ \eta=\limsup_{n\to\infty}\sqrt[n]{\Delta_n}\le \chi, \]
normal convergence entails that \(\Delta_n\) decreases at least at the rate of a geometric progression with ratio (denominator)** \(\chi\).
* If \(\Delta_k=0\) for some \(k\), then \(\Delta_j=0\) \((j\ge k)\), and we shall set \(\chi=0\).
** The quantity \(\eta\) is naturally called the geometric denominator of convergence.
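The inequality \(\eta\le\chi\) follows from the inequality between the geometric and arithmetic means (a sketch, assuming \(\Delta_k>0\) for all \(k\); the remaining case is covered by the convention \(\chi=0\)). For every \(n\ge 1\),
\[ \sqrt[n]{\frac{\Delta_n}{\Delta_0}} = \Bigl(\prod_{k=0}^{n-1}\frac{\Delta_{k+1}}{\Delta_k}\Bigr)^{1/n} \le \frac{1}{n}\sum_{k=0}^{n-1}\frac{\Delta_{k+1}}{\Delta_k}, \]
and \(\sqrt[n]{\Delta_0}\to 1\), so letting \(n\to\infty\) gives \(\eta\le\chi\).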
Theorem 3. In order that an r.p. converge normally, it is necessary and sufficient that
\[ \sigma \equiv \liminf_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} q_k(2-q_k)\cos^2\theta_k > 0 . \tag{19} \]
This theorem follows immediately from the following general fact.
Theorem 4. For any r.p. the inequality
\[ 1-h(A)\sigma \leq \chi \leq 1-[h(A)]^{-1}\sigma \tag{20} \]
holds.
For the proof of Theorem 4, let us note that, by virtue of (6) and (10),
\[ q_k(2-q_k)\cos^2\theta_k \Delta_k h_k = \Delta_k-\Delta_{k+1}\quad (k=0,1,2,\ldots), \tag{21} \]
whence, using (7), we obtain
\[ [h(A)]^{-1}\left(1-\frac{\Delta_{k+1}}{\Delta_k}\right) \leq q_k(2-q_k)\cos^2\theta_k \leq h(A)\left(1-\frac{\Delta_{k+1}}{\Delta_k}\right). \]
Summation of these inequalities with respect to \(k\) gives, after simple transformations,
\[ 1-\frac{h(A)}{n}\sum_{k=0}^{n-1} q_k(2-q_k)\cos^2\theta_k \leq \frac{1}{n}\sum_{k=0}^{n-1}\frac{\Delta_{k+1}}{\Delta_k} \leq 1-\frac{1}{h(A)n}\sum_{k=0}^{n-1} q_k(2-q_k)\cos^2\theta_k, \]
and hence, as \(n\to\infty\), inequality (20) follows.
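As an illustration (not taken from the original text), consider the gradient r.p. with \(q_k=1\) for all \(k\). By (4) and (5) this is the method of steepest descent, \(x_{k+1}=x_k-\frac{(u_k,u_k)}{(Au_k,u_k)}u_k\). Every term \(q_k(2-q_k)\cos^2\theta_k\) then equals 1, so \(\sigma=1\), and (20) gives
\[ \chi \le 1-\frac{1}{h(A)} . \]
The lower bound in (20) carries no information in this case, since \(1-h(A)\le 0\).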
Theorem 3 may be regarded as a “quantitative” variant of Theorem 2. Accordingly, from Theorem 3 there follow “quantitative” variants of the corollaries of Theorem 2.
Corollary 1. If an r.p. converges normally, then
\[ \liminf_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}q_k>0,\qquad \liminf_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}(2-q_k)>0,\qquad \liminf_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\cos^2\theta_k>0 . \tag{22} \]
Corollary 2. In order that a quasigradient r.p. converge normally, it is necessary and sufficient that
\[ \liminf_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}q_k(2-q_k)>0 . \tag{23} \]
Corollary 3. In order that an s.r.p. converge normally, it is necessary and sufficient that
\[ \liminf_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\cos^2\theta_k>0 . \tag{24} \]
Corollary 4. Every quasigradient s.r.p. converges normally.
Kharkov State University named after A. M. Gorky
Received 20 X 1964
REFERENCES CITED
1. S. G. Mikhlin, The Problem of the Minimum of a Quadratic Functional, 1952.
2. D. K. Faddeev, V. N. Faddeeva, Computational Methods of Linear Algebra, Moscow, 1963.
3. A. M. Ostrowski, Rend. Math. e Appl. (5), 14, 140 (1954).
4. S. Schechter, Comm. Pure and Appl. Math., 12, No. 2, 313 (1959).
5. A. S. Kelbasinskii, Vestn. Mosk. Univ., No. 5, 40 (1960).