MATHEMATICS
L. A. KIVISTIK
ON A MODIFICATION OF THE ITERATIVE METHOD WITH MINIMAL RESIDUALS FOR SOLVING NONLINEAR OPERATOR EQUATIONS
(Presented by Academician S. L. Sobolev on 15 VII 1960)
1. Let \(P(x)\) be a twice differentiable (in the Fréchet sense) operator from the real Hilbert space \(H\) into the same space. For solving the equation
\[ P(x)=0 \tag{1} \]
we consider iterative methods of the type
\[ x_{n+1}=x_n+\varepsilon_n y_n,\qquad n=0,1,\ldots, \tag{2} \]
where \(x_0\in H\) is a known initial approximation to the solution of equation (1), \(\varepsilon_n\) are real numbers, and \(\{y_n\}\) is some sequence of elements of the space \(H\).
Since \(P(x)\) is twice differentiable, we have
\[ \|P(x_{n+1})\|\leq \|P(x_n)+P'(x_n)(x_{n+1}-x_n)\| +\frac{1}{2}\|P''(\bar x_n)\|\,\|x_{n+1}-x_n\|^2, \tag{3} \]
where \(\bar x_n=x_n+\tau_n(x_{n+1}-x_n)\), \(0<\tau_n<1\).
Choose \(\varepsilon_n\) so that, for fixed \(y_n\),
\[ \|P(x_n)+P'(x_n)(x_{n+1}-x_n)\|^2 = \|P(x_n)+\varepsilon_n P'(x_n)y_n\|^2 \]
is minimal. It is easy to verify that the minimum value is attained for
\[ \varepsilon_n = - \frac{(P(x_n),\,P'(x_n)y_n)} {\|P'(x_n)y_n\|^2}. \tag{4} \]
In this case
\[ \|P(x_n)+P'(x_n)(x_{n+1}-x_n)\|^2 = \|P(x_n)\|^2 - \frac{(P(x_n),\,P'(x_n)y_n)^2} {\|P'(x_n)y_n\|^2}. \tag{5} \]
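The verification of (4) and (5) amounts to completing the square. As a sketch, write \(g=P(x_n)\) and \(w=P'(x_n)y_n\) (shorthand introduced only for this remark) and assume \(w\neq 0\). Then for every real \(\varepsilon\)
\[ \|g+\varepsilon w\|^2 = \|g\|^2+2\varepsilon(g,w)+\varepsilon^2\|w\|^2 = \|w\|^2\left(\varepsilon+\frac{(g,w)}{\|w\|^2}\right)^2 +\|g\|^2-\frac{(g,w)^2}{\|w\|^2}. \]
The squared bracket vanishes exactly for \(\varepsilon=-(g,w)/\|w\|^2\), which is (4), and the remaining two terms are the right-hand side of (5).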
2. Choose \(y_n=P(x_n)\). Then, for solving equation (1), we obtain the method
\[ x_{n+1} = x_n - \frac{(P(x_n),\,P'(x_n)P(x_n))} {\|P'(x_n)P(x_n)\|^2} \,P(x_n), \tag{6} \]
which, in the case of a linear operator equation, gives the method with minimal residuals considered by M. A. Krasnosel’skii and S. G. Krein \((^1)\).
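As an illustration only, iteration (6) can be sketched for \(H=\mathbb{R}^2\) in plain Python. The helper names (`dot`, `matvec`, `step_method_6`, `run`) and the test operator `P` with Jacobian `dP` are assumptions made for this example and do not come from the paper; the test operator was chosen so that the symmetric part of its Jacobian is positive definite.

```python
import math

def dot(u, v):
    """Inner product in R^n."""
    return sum(a * b for a, b in zip(u, v))

def matvec(J, v):
    """Apply the matrix J (list of rows) to the vector v."""
    return [dot(row, v) for row in J]

def norm(u):
    return math.sqrt(dot(u, u))

def step_method_6(P, dP, x):
    """One iteration of (6): y_n = P(x_n), eps_n taken from formula (4)."""
    g = P(x)                  # residual P(x_n)
    w = matvec(dP(x), g)      # P'(x_n) P(x_n)
    ww = dot(w, w)
    if ww == 0.0:             # no descent information along g; keep x unchanged
        return x
    eps = -dot(g, w) / ww     # formula (4) with y_n = P(x_n)
    return [xi + eps * gi for xi, gi in zip(x, g)]

def run(step, P, dP, x0, tol=1e-10, max_iter=500):
    """Iterate until ||P(x_n)|| <= tol; return the last iterate and all residual norms."""
    x = list(x0)
    residuals = [norm(P(x))]
    for _ in range(max_iter):
        if residuals[-1] <= tol:
            break
        x = step(P, dP, x)
        residuals.append(norm(P(x)))
    return x, residuals

# Hypothetical test operator (not from the paper) and its Jacobian.
def P(x):
    return [2.0 * x[0] + 0.1 * math.sin(x[1]) - 1.0,
            3.0 * x[1] + 0.1 * math.sin(x[0]) - 2.0]

def dP(x):
    return [[2.0, 0.1 * math.cos(x[1])],
            [0.1 * math.cos(x[0]), 3.0]]

x_star, residuals = run(step_method_6, P, dP, [0.0, 0.0])
```

The list `residuals` records \(\|P(x_n)\|\) at every step, so the decrease of the residual norm can be inspected directly.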
Concerning the convergence of method (6), the following theorem holds (cf. \((^2)\)):
Theorem 1. Suppose the following conditions are satisfied:
\(1^\circ.\ \|P(x_0)\|\leq \delta_0;\)
2°. For all \(x\in S(x_0,r)\)\(^*\), where \(r=\dfrac{M\delta_0}{1-q}\), the following estimates hold:
a) \(\|P'(x)\|\leq A\);
b) \(\|P''(x)\|\leq B\);
c) \(\bigl(P'(x)h,h\bigr)\geq M^{-1}\|h\|^2\) for all \(h\in H\) \((M>0)\).
3°. \(q=\sqrt{1-b^{-1}}+\tfrac12 a_0<1\), where \(b=M^2A^2,\ a_0=M^2B\delta_0\).
Then equation (1) has in the sphere \(S(x_0,r)\) a unique solution \(x^*\), to which the sequence \(\{x_n\}\) obtained by method (6) converges, and the estimates
\[ \|x^*-x_n\|\leq M\|P(x_n)\|\leq M\delta_0 q^n \tag{7} \]
hold.
Proof. Since \(\|P''(\bar x_0)\|\|x_1-x_0\|^2\leq BM^2\delta_0^2=a_0\delta_0\), taking into account (5) and the assumptions of the theorem, we obtain from (3)
\[ \|P(x_1)\|\leq \left(\sqrt{1-b^{-1}}+\tfrac12 a_0\right)\|P(x_0)\| = q\|P(x_0)\|. \]
This means that there exists a constant \(\delta_1\) satisfying the inequalities
\(\|P(x_1)\|\leq \delta_1\leq q\|P(x_0)\|\leq q\delta_0<\delta_0\).
It is easy to verify that \(S(x_1,r_1)\subset S(x_0,r)\), where
\(r_1=\dfrac{M\delta_1}{1-q_1}\), \(q_1=\sqrt{1-b^{-1}}+\tfrac12 a_1<q\), \(a_1=M^2B\delta_1\); indeed, \(\|x_1-x_0\|\leq M\delta_0\) and \(r_1\leq \dfrac{Mq\delta_0}{1-q}\), so that \(M\delta_0+r_1\leq r\). Thus all the assumptions are satisfied at \(x_1\), and we may continue computing successive approximations. By mathematical induction we obtain, for all \(n=0,1,\ldots\),
\[ \|P(x_{n+1})\|\leq \delta_{n+1}\leq q\|P(x_n)\|,\qquad \|x_{n+1}-x_n\|\leq M\|P(x_n)\|. \]
Using these inequalities, we obtain for all \(n\) and \(p\)
\[ \|x_{n+p}-x_n\| \leq M\bigl(\|P(x_{n+p-1})\|+\cdots+\|P(x_n)\|\bigr) \leq \frac{M\delta_0}{1-q}q^n . \]
This proves the existence of the limit \(\lim_{n\to\infty}x_n=x^*\in S(x_0,r)\). Since the operator \(P(x)\) is continuous,
\(\|P(x^*)\|=\lim\|P(x_n)\|\leq \delta_0\lim q^n=0\), i.e. \(x^*\) is a solution of equation (1). By virtue of condition 2° c) this solution is unique in the sphere \(S(x_0,r)\). By the same condition,
\[
\|P(x_n)\|\,\|x_n-x^*\|
\geq |(P(x_n)-P(x^*),x_n-x^*)|
= |(P'(\bar x_n)(x_n-x^*),x_n-x^*)|
\geq M^{-1}\|x_n-x^*\|^2
\]
\[
(\bar x_n=x^*+\tau_n(x_n-x^*),\quad 0<\tau_n<1),
\]
whence (7) follows.
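To make the hypotheses concrete, the quantities \(b\), \(a_0\), \(q\), \(r\) of Theorem 1 and the number of steps predicted by the a priori bound (7) can be evaluated numerically. The following is a minimal sketch; the function names and the sample constants \(M=1/1.9\), \(A=3.1\), \(B=0.1\), \(\delta_0=\sqrt5\) are hypothetical and serve only as an illustration.

```python
import math

def theorem1_constants(M, A, B, delta0):
    """Return (b, a0, q, r) as defined in conditions 2 and 3 of Theorem 1."""
    b = M**2 * A**2
    a0 = M**2 * B * delta0
    if b < 1.0:
        # Bounds a) and c) together force A >= 1/M, i.e. b >= 1.
        raise ValueError("b = M^2 A^2 < 1 contradicts bounds a) and c)")
    q = math.sqrt(1.0 - 1.0 / b) + 0.5 * a0
    r = M * delta0 / (1.0 - q) if q < 1.0 else math.inf
    return b, a0, q, r

def iterations_for(tol, M, delta0, q):
    """Smallest n with M * delta0 * q**n <= tol, read off from estimate (7)."""
    if not 0.0 < q < 1.0:
        raise ValueError("estimate (7) needs 0 < q < 1")
    if M * delta0 <= tol:
        return 0
    return math.ceil(math.log(tol / (M * delta0)) / math.log(q))

# Hypothetical sample constants (not taken from the paper).
M, A, B, delta0 = 1.0 / 1.9, 3.1, 0.1, math.sqrt(5.0)
b, a0, q, r = theorem1_constants(M, A, B, delta0)
n_steps = iterations_for(1e-10, M, delta0, q)
```

Since (7) is a worst-case bound, `n_steps` should be read as an upper estimate; an actual run may need fewer iterations.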
3. If condition 2° c) of Theorem 1 is replaced by a weaker condition (cf. \((^2)\)), then we have:
Theorem 2. Let the following conditions be satisfied:
1°. \(\|P(x_0)\|=\delta_0\leq \bar\delta_0\).
2°. \(\bigl(P'(x_0)h,h\bigr)\geq M_0^{-1}\|h\|^2\) for all \(h\in H\) \((M_0>0)\).
3°. For all \(x\in S(x_0,r)\), where
\[
r=\frac1B\left(\frac1{M_0}-\frac1{M^*}\right)\frac{\delta_0}{\bar\delta_0}
\qquad (M^*=\lim M_n\leq+\infty),
\]
the estimates
\[
\|P'(x)\|\leq A,\qquad \|P''(x)\|\leq B
\]
hold.
4°. The quantities \(a_0=M_0^2B\bar\delta_0\) and \(b_0=M_0^2A^2\) are such that the sequence
\(\{a_n\}=\{M_n^2B\bar\delta_n\}\), computed by means of the recurrence relations
\[ M_{k+1}=\frac{M_k}{1-M_k^2B\bar\delta_k}, \tag{8} \]
\[ \bar\delta_{k+1}=\bar\delta_k \left(\sqrt{1-(M_k^2A^2)^{-1}}+\tfrac12 M_k^2B\bar\delta_k\right) \]
converges (so that \(a_n<1\) for all \(n\)).
\(^*\) The symbol \(S(x_0,r)\) denotes the sphere \(\|x-x_0\|\leq r\).
Then equation (1) has in the sphere \(S(x_0,r)\) a solution \(x^*\), to which the sequence \(\{x_n\}\) obtained by method (6) converges, and the estimates
\[ \|x^*-x_n\|\leq \frac{2M_n\delta_n}{1+\sqrt{1-2M_n^2B\delta_n}} <2M_n\delta_n, \tag{9} \]
hold, where \(\delta_n=\|P(x_n)\|\), and \(M_n\) are defined recursively by formulas (8). If \(M^*<\infty\) or \(\bar\delta_0>\delta_0\), then the solution is unique in the sphere \(S(x_0,r)\).
Theorem 2 is proved essentially in the same way as Theorem 3 in paper (2), taking into account relations (3) and (5). To obtain estimates (9), we use Taylor’s formula
\[ (P(x^*),h)=\bigl(P(x_n)+P'(x_n)(x^*-x_n)+\tfrac12P''(x_n+\tau_n(x^*-x_n))(x^*-x_n)^2,\,h\bigr) \]
\[
(0<\tau_n<1)
\]
in the case
\[
h=\overline{[P'(x_n)]^{-1}}(x^*-x_n).
\]
Hence we obtain the inequality
\[
\tfrac12 M_nB\|x^*-x_n\|^2-\|x^*-x_n\|+M_n\delta_n\geq0,
\]
from which (9) follows.
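The passage from this inequality to (9) can be sketched as follows; here \(t=\|x^*-x_n\|\) is shorthand introduced for this remark, and it is assumed that \(B>0\) and \(2M_n^2B\delta_n\leq1\), so that the square root in (9) is real. The quadratic \(\tfrac12M_nBt^2-t+M_n\delta_n\) has the roots
\[ t_\pm=\frac{1\pm\sqrt{1-2M_n^2B\delta_n}}{M_nB}, \]
so the inequality holds precisely for \(t\leq t_-\) or \(t\geq t_+\). The text does not spell out why \(x^*\) lies on the branch \(t\leq t_-\); granting this, multiplication of numerator and denominator of \(t_-\) by \(1+\sqrt{1-2M_n^2B\delta_n}\) gives
\[ t\leq t_-=\frac{2M_n\delta_n}{1+\sqrt{1-2M_n^2B\delta_n}}, \]
which is the first inequality in (9).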
Verification of the fulfillment of the conditions of Theorem 2 is facilitated by
Theorem 3. If \(a_0b_0\leq\tfrac19\), then condition \(4^\circ\) of Theorem 2 is fulfilled.
Proof. By virtue of the recurrence relations (8) and the condition of the present theorem,
\[
a_nb_n\leq\cdots\leq a_1b_1\leq a_0b_0
\quad (b_k=M_k^2A^2).
\]
The assertion follows from this.
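In terms of \(a_k=M_k^2B\bar\delta_k\) and \(b_k=M_k^2A^2\), relations (8) read \(a_{k+1}=a_k\bigl(\sqrt{1-b_k^{-1}}+\tfrac12a_k\bigr)/(1-a_k)^2\) and \(b_{k+1}=b_k/(1-a_k)^2\). The sketch below iterates these two formulas so that the monotonicity of \(a_kb_k\) can be checked numerically; the function name and the starting values \(a_0=0.04\), \(b_0=2.5\) are hypothetical.

```python
import math

def iterate_ab(a0, b0, steps=50):
    """Iterate relations (8), rewritten for a_k = M_k^2 B dbar_k and b_k = M_k^2 A^2."""
    a, b = a0, b0
    seq = [(a, b)]
    for _ in range(steps):
        if not (0.0 <= a < 1.0 and b >= 1.0):
            raise ValueError("relations (8) need a_k < 1 and b_k >= 1")
        factor = math.sqrt(1.0 - 1.0 / b) + 0.5 * a   # ratio dbar_{k+1} / dbar_k
        # Both right-hand sides use the old value of a (tuple assignment).
        a, b = a * factor / (1.0 - a) ** 2, b / (1.0 - a) ** 2
        seq.append((a, b))
    return seq

# Hypothetical starting values with a0 * b0 = 0.1 <= 1/9.
seq = iterate_ab(a0=0.04, b0=2.5)
products = [a * b for a, b in seq]
```

One way to see where the threshold \(\tfrac19\) comes from: the elementary bounds \(\sqrt{1-s}\leq1-\tfrac12s\) and \((1-a)^4\geq1-4a\) give \(a_{k+1}b_{k+1}\leq a_kb_k\) whenever \(a_kb_k\leq\tfrac19\).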
4. We choose
\[ y_n=\overline{P'(x_n)}P(x_n), \]
where \(\overline{P'(x)}\) is the operator adjoint to the linear operator \(P'(x)\). Then we obtain the method
\[ x_{n+1}=x_n- \frac{\|\overline{P'(x_n)}P(x_n)\|^2} {\|P'(x_n)\overline{P'(x_n)}P(x_n)\|^2} \,\overline{P'(x_n)}P(x_n). \tag{10} \]
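As an illustration only, iteration (10) can be sketched for \(H=\mathbb{R}^2\), where the adjoint of \(P'(x)\) is the transposed Jacobian matrix. All helper names and the test operator below are assumptions made for this example; the operator was chosen so that the symmetric part of its Jacobian need not be positive definite, while the Jacobian itself stays boundedly invertible.

```python
import math

def dot(u, v):
    """Inner product in R^n."""
    return sum(a * b for a, b in zip(u, v))

def matvec(J, v):
    """Apply the matrix J (list of rows) to the vector v."""
    return [dot(row, v) for row in J]

def transpose(J):
    """Adjoint of a real matrix with respect to the Euclidean inner product."""
    return [list(col) for col in zip(*J)]

def norm(u):
    return math.sqrt(dot(u, u))

def step_method_10(P, dP, x):
    """One iteration of (10): y_n = adjoint(P'(x_n)) P(x_n), eps_n from (4)."""
    g = P(x)
    J = dP(x)
    y = matvec(transpose(J), g)   # y_n
    w = matvec(J, y)              # P'(x_n) y_n
    ww = dot(w, w)
    if ww == 0.0:                 # y_n = 0 or P'(x_n) y_n = 0; keep x unchanged
        return x
    eps = -dot(y, y) / ww         # (P(x_n), P'(x_n) y_n) equals ||y_n||^2
    return [xi + eps * yi for xi, yi in zip(x, y)]

# Hypothetical test operator (not from the paper) and its Jacobian.
def P(x):
    return [x[1] + 0.1 * math.sin(x[0]) - 1.0,
            -x[0] + 0.1 * math.sin(x[1]) + 2.0]

def dP(x):
    return [[0.1 * math.cos(x[0]), 1.0],
            [-1.0, 0.1 * math.cos(x[1])]]

x = [0.0, 0.0]
residuals = [norm(P(x))]
for _ in range(200):
    if residuals[-1] <= 1e-10:
        break
    x = step_method_10(P, dP, x)
    residuals.append(norm(P(x)))
```

Compared with (6), each step costs one additional application of the adjoint, which is the price paid for dropping the positivity requirement on \(P'(x)\).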
On the convergence of method (10), the following theorems are valid:
Theorem 4. Suppose the following conditions are fulfilled:
\(1^\circ.\) \(\|P(x_0)\|\leq\delta_0\).
\(2^\circ.\) For all \(x\in S(x_0,r)\), where
\[
r=\frac{M\delta_0}{1-q},
\]
the estimates hold:
a) \(\|P'(x)\|\leq A\);
b) \(\|P''(x)\|\leq B\);
c)
\[
\|P'(x)h\|\geq M^{-1}\|h\|
\]
and
\[
\|\overline{P'(x)}\,h\|\geq M^{-1}\|h\|
\]
for all \(h\in H\) \((M>0)\).
\(3^\circ.\)
\[
q=\frac{b-1}{b+1}+\frac12 a_0<1,
\]
where \(b=M^2A^2,\ a_0=M^2B\delta_0\).
Then equation (1) has in the sphere \(S(x_0,r)\) a solution \(x^*\), to which the sequence \(\{x_n\}\) obtained from (10) converges, and the estimates
\[ \|x^*-x_n\|\leq \frac{M}{1-q}\|P(x_n)\| \leq \frac{M\delta_0}{1-q}q^n \]
hold.
Theorem 5. Suppose the conditions of Theorem 2 are fulfilled, except for condition \(2^\circ\) and relations (8), which are replaced respectively by the conditions:
\[ \|P'(x_0)h\|\geq M_0^{-1}\|h\| \quad\text{and}\quad \|\overline{P'(x_0)}h\|\geq M_0^{-1}\|h\| \quad \text{for all }h\in H\ (M_0>0) \]
and by the relations
\[ M_{k+1}=\frac{M_k}{1-M_k^2B\bar\delta_k}, \qquad \bar\delta_{k+1}=\bar\delta_k\left( \frac{M_k^2A^2-1}{M_k^2A^2+1} +\frac12 M_k^2B\bar\delta_k \right). \tag{11} \]
Then equation (1) has in the sphere \(S(x_0,r)\) a solution \(x^*\), to which the sequence \(\{x_n\}\) obtained from (10) converges, and the estimates (9) hold, where \(\delta_n=\|P(x_n)\|\) and \(M_n\) are defined recursively by formulas (11).
The weakenings in the hypotheses of Theorems 4 and 5, as compared with the hypotheses of Theorems 1 and 2, are obtained by virtue of the self-adjointness of the operator \(P'(x)\overline{P'(x)}\), since now, in order to estimate the last term in (5), we may use Theorem 2 of \((^3)\).*
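The constant \((b-1)/(b+1)\) in condition 3° of Theorem 4 can be traced as follows. This is a sketch which assumes that the inequality from \((^3)\) is used in the Kantorovich-type form written below; the symbols \(T\), \(g\), \(\mu\), \(\nu\) are shorthand introduced only for this remark. Let \(T=P'(x_n)\overline{P'(x_n)}\) and \(g=P(x_n)\). Conditions 2° a) and c) give \(\mu\|h\|^2\leq(Th,h)\leq\nu\|h\|^2\) with \(\mu=M^{-2}\), \(\nu=A^2\), and for a self-adjoint operator with such bounds
\[ (Tg,g)^2\geq\frac{4\mu\nu}{(\mu+\nu)^2}\,\|Tg\|^2\,\|g\|^2 . \]
Since \(y_n=\overline{P'(x_n)}g\) implies \((g,P'(x_n)y_n)=(Tg,g)\) and \(P'(x_n)y_n=Tg\), relation (5) yields
\[ \|g\|^2-\frac{(Tg,g)^2}{\|Tg\|^2} \leq\left(\frac{\nu-\mu}{\nu+\mu}\right)^2\|g\|^2 =\left(\frac{b-1}{b+1}\right)^2\|P(x_n)\|^2 . \]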
Verification of the fulfillment of the conditions of Theorem 5 is facilitated by
Theorem 6. If
\[ (b_0+1)(9-12a_0+8a_0^2-2a_0^3)a_0 \leq 4 \quad\text{and}\quad a_0 \leq \tfrac49, \]
then condition \(4^\circ\) of Theorem 5 is satisfied.
Finally, let us note that other choices of the elements \(y_n\) may also be of some interest. For example, with the choices
\(y_n=P'(x_n)\overline{P'(x_n)}P(x_n)\),
\(y_n=\overline{P'(x_n)}P'(x_n)\overline{P'(x_n)}P(x_n)\), etc., theorems analogous to those given above hold. In particular, if \(y_n=[P'(x_n)]^{-1}P(x_n)\), then we obtain Newton’s method.
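The last remark can be checked directly from (4) and (5), assuming that \(P'(x_n)\) is invertible and \(P(x_n)\neq0\). For \(y_n=[P'(x_n)]^{-1}P(x_n)\) we have \(P'(x_n)y_n=P(x_n)\), hence
\[ \varepsilon_n=-\frac{(P(x_n),P(x_n))}{\|P(x_n)\|^2}=-1, \qquad x_{n+1}=x_n-[P'(x_n)]^{-1}P(x_n), \]
and the right-hand side of (5) vanishes, so that only the second-order term remains in estimate (3).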
Institute of Power Engineering
Academy of Sciences of the Estonian SSR
Received 14 VI 1960
REFERENCES
1. M. A. Krasnosel’skii, S. G. Krein, Matem. sborn., 31, No. 2, 315 (1952).
2. L. A. Kivistik, Izv. AN EstSSR, ser. fiz.-matem. i tekhn. nauk, 9, No. 2, 145 (1960).
3. W. Greub, W. Rheinboldt, Proc. Am. Math. Soc., 10, No. 3, 407 (1959).
* In the case of a finite-dimensional space, the same estimate already follows from (4.10) in (1).