MATHEMATICS
M. N. YAKOVLEV
Submitted 1964-01-01 | RussiaRxiv: ru-196401.18199 | Translated from Russian

Full Text

MATHEMATICS

M. N. YAKOVLEV

ON THE SOLUTION OF NONLINEAR EQUATIONS BY THE METHOD OF ITERATIONS

(Presented by Academician V. I. Smirnov on 22 I 1964)

Let in a real Hilbert space \(H\) there be given a twice continuously differentiable functional \(f(x)\) and let \(F(x)\) be the gradient of the functional \(f(x)\), i.e., the operator defined by the formula

\[ \lim_{t\to 0}\frac{f(x+th)-f(x)}{t}=(F(x),h) \]

for arbitrary \(h\in H\) (see, for example, (1)).

In this note we consider iterative methods for solving the equation

\[ F(x)=0. \tag{1} \]

1. Theorem 1. Let there exist \(x_0\in H\) and a number \(r>0\) such that:

a) \((F'(x)h,h)\geq m(h,h)\), \(m>0\) for \(h\in H\) and \(x\in S(x_0,r)\);

b) \(r\geq \|F(x_0)\|/m\);

c) \((F'(x)h,h)\leq M(h,h)\) for \(h\in H\) and \(x\in S(x_0,\|F(x_0)\|/m)\).

Then in the sphere \(S(x_0,\|F(x_0)\|/m)\) there exists a solution \(x^*\) of equation (1), unique in the sphere \(S(x_0,r)\), and the iterative process starting from \(x_0\) converges to it:

\[ x_{n+1}=x_n-\frac{2}{M+m}F(x_n)\quad (n=0,1,2,\ldots), \]

moreover

\[ \|x_n-x^*\|\leq \left(\frac{M-m}{M+m}\right)^n\|x_0-x^*\|. \]

Remark. Under the conditions of Theorem 1, the solution \(x^*\) of the equation \(F(x)=0\), unique in \(S(x_0,r)\), is also the unique point of minimum of the functional \(f(x)\) in \(S(x_0,r)\).

In the case when the computations are carried out with errors, the following holds:

Theorem 2. Let there exist \(x_0\in H\) and numbers \(r>0\) and \(\bar h>0\) such that:

a) \((F'(x)h,h)\geq m(h,h)\), \(m>0\) for \(h\in H\) and \(x\in S(x_0,r)\);

b)

\[ r\geq \frac{\|F(x_0)\|+\bar h}{m}; \]

c) \((F'(x)h,h)\leq M_1(h,h)\) for \(h\in H\) and

\[ x\in S\left(x_0,\frac{1}{m}\bigl[\|F(x_0)\|+\bar h\bigr]\right). \]

Then for any sequence \(h_n\in H\) with \(\|h_n\|\leq \bar h\), the iterative process starting from \(x_0\),

\[ x_{n+1}=x_n-\frac{2}{M_1+m}\,[F(x_n)+h_n]\quad (n=0,1,2,\ldots) \]

gives a sequence \(x_n\) such that

\[ \|x_n-x^*\|\leq \left(\frac{M_1-m}{M_1+m}\right)^n\|x_0-x^*\|+\frac{\bar h}{m}. \]

  1. Theorem 3. Suppose

\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H; \]

\(A_n\) is a sequence of symmetric, bounded, positive definite operators such that

\[ \overline m(A_n^{-1}h,h)\leq (F'(x)h,h)\leq \overline M(h,h), \]

\[ 0<\overline m\leq \overline M<\infty \quad\text{for } h\in H \text{ and } x\in S\left(x^*,\sqrt{\frac{2}{m}\,[f(x_0)-f(x^*)]}\right), \]

where \(x_0\) is some element of \(H\).
Then the function \(\psi(\alpha)=f(x_n-\alpha A_nF(x_n))\) has a unique minimum \(\alpha_n\), and the iterative process, starting from \(x_0\),

\[ x_{n+1}=x_n-\gamma_n A_nF(x_n)\qquad (n=0,1,2,\ldots), \]

where \(\gamma_n\) are arbitrary numbers satisfying the condition
\(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression with ratios respectively \(\tau\) and \(\sqrt{\tau}\), where

\[ \tau=1-\varepsilon_1\left(1-\frac{\varepsilon_1\overline M}{2}\right)\frac{2\overline m^2}{M}, \qquad \varepsilon_1=\min\left\{\varepsilon,\frac{1}{\overline M}\right\}. \]

In particular, the iterative process

\[ x_{n+1}=x_n-\alpha_n A_nF(x_n)\qquad (n=0,1,2,\ldots) \]

converges, starting from any \(x_0\), at the rate of a geometric progression with ratio

\[ \sqrt{1-m^2/M^2}. \]

  1. Corollary 1. Suppose

\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H \]

and \(A_n\) is a sequence of symmetric, bounded, positive definite operators, with

\[ m_A(h,h)\leq (A_nh,h)\leq M_A(h,h),\qquad h\in H,\quad 0<m_A\leq M_A<+\infty. \]

Then the function \(\psi(\alpha)=f(x_n-\alpha A_nF(x_n))\) has a unique minimum \(\alpha_n\), and the iterative process

\[ x_{n+1}=x_n-\gamma_n A_nF(x_n)\qquad (n=0,1,2,\ldots), \]

where \(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), starting from any \(x_0\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression.

Corollary 2. Suppose

\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H. \]

Then the iterative process

\[ x_{n+1}=x_n-\gamma_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots) \]

with \(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), where \(\alpha_n\) is the unique minimum of the function
\(f(x_n-\alpha [F'(x_n)]^{-1}F(x_n))\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression.

Theorem 4. Suppose the conditions of Theorem 3 are fulfilled and

\[ \|A_n-[F'(x^*)]^{-1}\|\to 0\qquad \text{as } n\to\infty. \]

Then the iterative process

\[ x_{n+1}=x_n-\alpha_n A_nF(x_n)\qquad (n=0,1,2,\ldots) \]

converges, starting from any \(x_0\), to \(x^*\) at a rate higher than a geometric progression, i.e.,
\[ \|x_n-x^*\|\leq Cq_1q_2\ldots q_n,\qquad q_k<1,\qquad q_k\to 0\quad \text{as } k\to\infty . \]

Corollary 3. Let the condition
\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\qquad x,h\in H \]
be satisfied. Then the iterative process
\[ x_{n+1}=x_n-\alpha_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots), \]
where \(\alpha_n\) is determined from the condition that the function
\[ \psi(\alpha)=f\bigl(x_n-\alpha [F'(x_n)]^{-1}F(x_n)\bigr) \]
attain its minimum, converges to \(x^*\) at a rate greater than that of a geometric progression.

The equation considered in this paper is not the general equation \(\Phi(x)=0\), where \(\Phi(x)\) is an operator from \(\mathcal H\) into itself. The proofs of the assertions indicated above essentially use the fact that \(F(x)\) is the gradient of a functional. In particular, it is essential that \(F'(x)\) be a symmetric operator. In the finite-dimensional case of \(n\) dimensions, the problem considered is the problem of finding a stationary point.

The iterative process
\[ x_{n+1}=x_n-\alpha_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots), \]
unlike Newton’s method, which for its convergence requires an initial approximation close to the solution \(x^*\), converges from any initial approximation.

Unlike the method of steepest descent, the rate of convergence increases as the solution is approached.

Received 17 I 1964

CITED LITERATURE

  1. M. M. Vainberg, Variational Methods for the Study of Nonlinear Operators, Moscow, 1956.

Submission history

MATHEMATICS