UDC 518:517.948
MATHEMATICS
Submitted 1970-01-01 | RussiaRxiv: ru-197001.29040 | Translated from Russian


Ya. I. Alber

SOLUTION OF NONLINEAR OPERATOR EQUATIONS BY METHODS OF THE STEEPEST-DESCENT TYPE

(Presented by Academician L. V. Kantorovich, 22 IX 1970)

In a domain \(\Omega\) of a (real) Hilbert space \(H\), consider the equation

\[ F(x)=0 \tag{1} \]

with an operator \(F:\Omega\to H\) twice continuously differentiable (in the Fréchet sense). Let \(D_x=F'(x)\); \(D_x^*\) is the operator adjoint to \(D_x\); \(P(x)=\operatorname{grad}\|F(x)\|^2=2D_x^*F(x)\).

Beginning with the works of L. V. Kantorovich \((^{1,2})\), methods of the steepest-descent type for solving linear and nonlinear equations in function spaces have been the subject of numerous investigations (see the bibliography in \((^{3,4})\)). It should be noted, however, that these investigations usually assumed a condition of the form

\[ (D_xh,h)\geq \gamma(h,h),\qquad x\in\Omega,\quad h\in H,\quad \gamma>0, \tag{2} \]

which guarantees existence of a solution, its uniqueness, simplicity, and nondegeneracy.

In the present paper, restrictions weaker than (2) are imposed, which has made it possible to substantially broaden the class of problems to which the indicated methods can be applied (cf. \((^{5-7})\)).

First of all, note that the operator \(D_x\) is not assumed, in the general case, to be positive or self-adjoint, nor is the operator \(F\) assumed to be potential.

  1. To solve problem (1), consider the trajectories of the differential equation

\[ \frac{dx}{dt}=-\frac{2\|F(x)\|^2}{(\operatorname{grad}\|F(x)\|^2,F(x))}F(x). \tag{3} \]

Theorem 1. For any solution \(x(t)\) of equation (3), the identity

\[ \|F(x(\tau))\|=\|F(x(t))\|e^{-(\tau-t)},\qquad \tau\geq t. \tag{4} \]

holds.
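The identity can be checked by a one-line computation. The following is a sketch of that step, assuming that \((\operatorname{grad}\|F(x)\|^2,F(x))\neq 0\) along the trajectory, so that the right-hand side of (3) is defined:

\[ \frac{d}{dt}\|F(x(t))\|^2=\Bigl(P(x),\frac{dx}{dt}\Bigr) =-\frac{2\|F(x)\|^2}{(P(x),F(x))}\,(P(x),F(x))=-2\|F(x(t))\|^2 . \]

Integrating from \(t\) to \(\tau\) gives \(\|F(x(\tau))\|^2=\|F(x(t))\|^2e^{-2(\tau-t)}\), which is (4).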

Theorem 2. Suppose that at all points of the sphere \(S_{\rho,x^0}:\|x-x^0\|\leq \rho\) belonging to \(\Omega\), the inequality

\[ \left|(\operatorname{grad}\|F(x)\|^2,F(x))\right|^m\geq c\|F(x)\|^2,\qquad m>2/3,\quad c=\operatorname{const}>0, \tag{5} \]

is satisfied, and, moreover,

\[ \|F(x^0)\|\leq \varkappa\rho^{1/\beta},\qquad \varkappa=(\tfrac12\gamma\beta c^{1/m})^{1/\beta},\qquad \beta=(3m-2)/m, \]

where \(\gamma\) is an arbitrary constant, \(0<\gamma<1\). Then:

1) equation (1) has in \(S_{\rho,x^0}\) a solution \(\xi\) (possibly nonunique);

2) the differential equation (3) has a solution \(x(t)\in S_{\rho,x^0}\) for \(t\geq0\), for which \(x(0)=x^0\);

3) \(\displaystyle \lim_{t\to\infty}x(t)=\xi\);

4) \(\|F(x(t))\|=\|F(x^0)\|e^{-t}\);

5) \[ \|x(t)-\xi\|\leq \gamma \rho e^{-\beta t}. \tag{6} \]
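Assertion 4) lends itself to a numerical check. The sketch below integrates equation (3) with a classical Runge-Kutta scheme and compares the residual ratio with \(e^{-t}\). The operator `F` on \(\mathbb{R}^2\), the starting point, and the step count are illustrative assumptions and are not taken from the paper; the operator is chosen so that it is not potential and its derivative is not self-adjoint.

```python
import math

def F(x, y):
    # Hypothetical test operator on R^2 (not from the paper). It is not a
    # gradient field, and its Jacobian is not symmetric.
    return (x + x**3 - y, x + y + y**3)

def jac_apply(x, y, h):
    # D_x h, with D_x = [[1 + 3x^2, -1], [1, 1 + 3y^2]] the Jacobian of F.
    return ((1.0 + 3.0 * x * x) * h[0] - h[1],
            h[0] + (1.0 + 3.0 * y * y) * h[1])

def residual_norm(p):
    f = F(p[0], p[1])
    return math.hypot(f[0], f[1])

def rhs(p):
    # Right-hand side of (3), using
    # (grad||F||^2, F) = 2 (D_x^* F, F) = 2 (F, D_x F).
    f = F(p[0], p[1])
    df = jac_apply(p[0], p[1], f)
    norm_sq = f[0] ** 2 + f[1] ** 2
    g = 2.0 * (f[0] * df[0] + f[1] * df[1])
    lam = 2.0 * norm_sq / g
    return (-lam * f[0], -lam * f[1])

def rk4_step(p, h):
    # One classical fourth-order Runge-Kutta step for dx/dt = rhs(x).
    def shift(q, a, v):
        return (q[0] + a * v[0], q[1] + a * v[1])
    k1 = rhs(p)
    k2 = rhs(shift(p, 0.5 * h, k1))
    k3 = rhs(shift(p, 0.5 * h, k2))
    k4 = rhs(shift(p, h, k3))
    return (p[0] + h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
            p[1] + h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]))

def integrate(p0, t_end, n_steps):
    # Integrate (3) from t = 0 to t = t_end with a fixed step.
    h = t_end / n_steps
    p = p0
    for _ in range(n_steps):
        p = rk4_step(p, h)
    return p

if __name__ == "__main__":
    x0 = (1.0, 0.5)
    xT = integrate(x0, 2.0, 200)
    # Theorem 1 predicts this ratio to equal exp(-2) up to discretization error.
    print(residual_norm(xT) / residual_norm(x0), math.exp(-2.0))
```

The exponential decay of the residual does not depend on the particular operator, so the same check can be repeated with any smooth `F` for which the denominator in (3) stays away from zero.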

  2. Definition 1. A solution \(\xi\) of equation (1) will be called a strictly nondegenerate simple solution if the condition
    \[ \left|(D_\xi h,h)\right|\geq \mu\|h\|^2,\qquad \mu>0,\ h\in H \]
    is satisfied.

Theorem 3. If \(\xi\) is a strictly nondegenerate simple solution of the equation \(F(x)=0\), then there exists a neighborhood \(S_{\rho,\xi}\) of the solution in which all trajectories (3) stabilize to \(\xi\), and estimate (6) holds with exponent \(\beta=m=1\).

  3. Consider the case of multiple solutions. Let \(F\in C^{k+1}\). Then for \(F(x)\), in a neighborhood of the solution \(\xi\), there exists an expansion
    \[ F(x)=D_\xi^0+D_\xi^1+\ldots+D_\xi^{k-1} +\left(\int_0^1 \frac{(1-t)^{k-1}}{(k-1)!}\,F^{(k)}(\xi+t\zeta)\,dt\right)\zeta^{(k)}, \tag{7} \]
    where \(\zeta=x-\xi\), \(x\in\Omega\), \(D_\xi^n=\dfrac{1}{n!}F^{(n)}(\xi)\zeta^{(n)}\) is the \(n\)-th strong differential of \(F(x)\) \((^{11})\), and in the remainder term the operator is continuously differentiable with respect to \(x\).

Definition 2. A solution \(\xi\) of equation (1) will be called a strictly nondegenerate multiple solution of multiplicity \(k\) if, in the expansion (7), \(D_\xi^n=0\), \((n=0,1,\ldots,k-1)\), \(\|D_\xi^k\|\geq \alpha\|\zeta\|^k\), \(\left|([D_\xi^k]'h,h)\right|\geq \beta\|\zeta\|^{k-1}\|h\|^2\), \(\alpha,\beta=\mathrm{const}>0\), \(h\in H\). For \(k=1\) this definition coincides with the preceding one.

Theorem 4. Let \(\xi\) be a strictly nondegenerate multiple solution of multiplicity \(k\).

Then there exists a circular neighborhood \(S_{\rho,\xi}\) of the solution in which all trajectories (3) stabilize to \(\xi\), and estimate (6) holds with exponent \(\beta=1/k\).

The proof follows from Theorem 2; here \(m=2k/(3k-1)\). If \(k=1\), then \(m=1\). If \(k\to\infty\), then \(m\to 2/3\), i.e. \(2/3<m\leq 1\) (see condition (5) and Theorem 3).
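The relation between \(k\), \(m\) and \(\beta\) can be seen on a scalar model problem. This is an illustration under the assumption \(H=\mathbb{R}\), \(F(x)=x^k\), \(\xi=0\), which is not an example from the paper. Here

\[ \operatorname{grad}\|F(x)\|^2=2kx^{2k-1},\qquad \bigl|(\operatorname{grad}\|F(x)\|^2,F(x))\bigr|^m=(2k)^m|x|^{(3k-1)m}, \]

so the exponents in (5) match precisely for \(m=2k/(3k-1)\), with \(c=(2k)^m\). Equation (3) reduces to

\[ \frac{dx}{dt}=-\frac{2x^{2k}}{2kx^{3k-1}}\,x^k=-\frac{x}{k},\qquad x(t)=x^0e^{-t/k}, \]

in agreement with \(\beta=(3m-2)/m=1/k\).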

  4. Suppose now that the equation \(F(x)=0\) has a smooth, class \(C^2\), manifold of solutions \(U^s\), \(s=\dim U^s\), and, for simplicity, assume that \(D_\xi\) is a positive (negative) operator.

Definition 3. The manifold of solutions \(U^s\) will be called nondegenerate if, at each of its points \(\xi\), \(\lambda=0\) is an isolated eigenvalue of finite multiplicity \(s\) of the operator \(D_\xi\).

In \((^{7})\) it was shown that if \(U^s\) is a compact manifold, then its \(\varepsilon\)-neighborhood is a normal tubular neighborhood, i.e. a bundle \(\mathscr{B}_\rho=(B_\rho,U^s,\rho,K_\rho,O_\infty)\) with an orthogonal transformation group. Similarly to \((^{6,7})\), in the local neighborhood \(B_{\rho,\xi_0}\) of an arbitrary point \(\xi_0\in U^s\) we estimate \(\left|(D_x^*F(x),F(x))\right|\) in normal coordinates \((\xi)\in U_{\rho,\xi_1}\subset U\), \((\eta)\in K_\rho\), and obtain the following theorem.

Theorem 5. Let the equation \(F(x)=0\) have a nondegenerate smooth, class \(C^2\), manifold of solutions \(U^s\), and let \(\xi_0\in U^s\).

Then there exists a normal tubular neighborhood of the solution \(\xi_0\), in which all trajectories of the differential equation (3) stabilize to the manifold of solutions, and estimate (6) holds with exponent \(\beta=m=1\).

  5. For an approximate solution of equation (3) we apply the Euler scheme
    \[ x^{n+1}=x^n-\tau\,\frac{\|F(x^n)\|^2}{(\operatorname{grad}\|F(x^n)\|^2,F(x^n))}\,F(x^n). \tag{8} \]

Theorem 6. If at all points of the sphere \(S_{\rho,x^0}\) the conditions

\[ \bigl|(\operatorname{grad}\|F(x)\|^2,F(x))\bigr| \ge c\|F(x)\|^2,\qquad c>0; \tag{9} \]

\[ (P'(x)h,h)\le K\|h\|^2,\qquad h\in H, \tag{10} \]

are satisfied, and, moreover,

\[ \|F(x^0)\|\le \gamma\rho\,\frac{c}{\tau}(1-\sqrt{q}),\quad 0<\gamma<1,\quad \widetilde q=\frac{\tau K}{2c^2}<1,\quad c^2<2K, \]

\[ q=1-\tau+\frac{K}{2c^2}\tau^2, \]

then equation (1) has at least one solution \(\xi\), and the iterative process (8) converges monotonically and strongly to \(\xi\) with the rate

\[ \|x^n-\xi\|\le \gamma\rho q^{n/2}. \]

Proof.

\[ \|F(x^{n+1})\|^2\le \left(1-\tau+\frac{K}{2c^2}\tau^2\right)\|F(x^n)\|^2 =q(\tau)\|F(x^n)\|^2, \tag{11} \]

\[ 0<q(\tau)<1\qquad \text{for }0<\tau<2c^2/K<4. \]

Further,

\[ \|x^{p+m}-x^m\|\le \frac{\tau\|F(x^0)\|}{c}\, \frac{q^{m/2}(1-q^{p/2})}{1-\sqrt q} \le \gamma\rho q^{m/2}. \tag{12} \]

The assertion of the theorem now follows from (11), (12).
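The behaviour described by (11) can be observed on a small example. The sketch below implements the iteration with step \(\tau\|F(x^n)\|^2/(\operatorname{grad}\|F(x^n)\|^2,F(x^n))\), the form consistent with (3) and (11), and records the residual norms. The operator `F`, the starting point, and the value \(\tau=0.5\) are illustrative assumptions, not data from the paper.

```python
import math

def F(x, y):
    # Hypothetical test operator on R^2 (not from the paper), with the
    # single zero (0, 0) and a non-symmetric Jacobian.
    return (x + x**3 - y, x + y + y**3)

def jac_apply(x, y, h):
    # D_x h, with D_x = [[1 + 3x^2, -1], [1, 1 + 3y^2]] the Jacobian of F.
    return ((1.0 + 3.0 * x * x) * h[0] - h[1],
            h[0] + (1.0 + 3.0 * y * y) * h[1])

def euler_descent(p0, tau, n_steps):
    # Euler discretization of (3):
    #   x_{n+1} = x_n - tau * ||F||^2 / (grad||F||^2, F) * F,
    # where (grad||F||^2, F) = 2 (F, D_x F).
    p = p0
    residuals = []  # ||F(x_n)|| recorded before each step
    for _ in range(n_steps):
        f = F(p[0], p[1])
        norm_sq = f[0] ** 2 + f[1] ** 2
        residuals.append(math.sqrt(norm_sq))
        if norm_sq == 0.0:
            break  # exact solution reached, nothing left to do
        df = jac_apply(p[0], p[1], f)
        g = 2.0 * (f[0] * df[0] + f[1] * df[1])
        step = tau * norm_sq / g
        p = (p[0] - step * f[0], p[1] - step * f[1])
    return p, residuals

if __name__ == "__main__":
    point, res = euler_descent((1.0, 0.5), 0.5, 60)
    print(point, res[0], res[-1])
```

The ratios `res[n + 1] / res[n]` can be compared with \(\sqrt{q(\tau)}\) once constants \(c\) and \(K\) for the chosen operator are available.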

By choosing \(\tau\) in a suitable way, one can obtain convergence for arbitrary \(K\) and \(c\). In particular, for an unbounded operator \(P'\) it is necessary to consider the continuous analogue (3) of the iterative process (8). If \(K<2c^2\), then in (8) one may put \(\tau=1\), and then \(\|F(x^n)\|^2\le \bigl(K/(2c^2)\bigr)^n\|F(x^0)\|^2\). If, moreover, \(K<c^2/2\), then \(0<q(\tau)<1\) for \(\tau\in(0,\tau_1)\) and for \(\tau\in(\tau_2,2c^2/K)\), where \(\tau_1<\tau_2\) are the positive roots of the equation \(q(\tau)=0\), and always \(\tau_1\ge 1\).
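The bound on \(\tau_1\) can be read off from the explicit roots. A short check, assuming \(K\le c^2/2\) so that the roots of \(q(\tau)=1-\tau+\frac{K}{2c^2}\tau^2\) are real:

\[ \tau_{1,2}=\frac{c^2}{K}\Bigl(1\mp\sqrt{1-\frac{2K}{c^2}}\Bigr),\qquad \tau_1=\frac{2}{1+\sqrt{1-2K/c^2}}\in(1,2],\qquad \tau_1+\tau_2=\frac{2c^2}{K}. \]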

Remark 1. For the linear equation \(Ax=f\) with bounded self-adjoint operator \(A\), method (8) for \(\tau=2\) gives the formula of steepest descent of L. V. Kantorovich

\[ x^{n+1}=x^n-\frac{(Ax^n-f,Ax^n-f)}{(A(Ax^n-f),Ax^n-f)}(Ax^n-f). \]

In \(({}^1,{}^2)\) convergence of this process is proved for a positive definite operator \(A\).
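Remark 1 can be illustrated on a \(2\times2\) system. The sketch below implements the steepest-descent formula and, separately, one step of the Euler-type scheme with step \(\tau\|F\|^2/(\operatorname{grad}\|F\|^2,F)\) for \(F(x)=Ax-f\), so that the two can be compared at \(\tau=2\). The matrix, right-hand side, and helper names are illustrative assumptions.

```python
def matvec(A, v):
    # Product of a 2x2 matrix (tuple of rows) with a 2-vector.
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def residual(A, f, x):
    Ax = matvec(A, x)
    return (Ax[0] - f[0], Ax[1] - f[1])

def kantorovich_step(A, f, x):
    # Steepest descent: x - (r, r) / (A r, r) * r with r = A x - f.
    r = residual(A, f, x)
    denom = dot(matvec(A, r), r)
    if denom == 0.0:
        return x  # r = 0 for positive definite A: x already solves A x = f
    a = dot(r, r) / denom
    return (x[0] - a * r[0], x[1] - a * r[1])

def euler_type_step(A, f, x, tau):
    # Step tau * ||F||^2 / (grad||F||^2, F) * F for F(x) = A x - f.
    # For self-adjoint A, grad||F||^2 = 2 A F.
    r = residual(A, f, x)
    Ar = matvec(A, r)
    g = dot((2.0 * Ar[0], 2.0 * Ar[1]), r)
    if g == 0.0:
        return x
    s = tau * dot(r, r) / g
    return (x[0] - s * r[0], x[1] - s * r[1])

if __name__ == "__main__":
    A = ((4.0, 1.0), (1.0, 3.0))  # symmetric positive definite (assumed data)
    f = (1.0, 2.0)
    x = (0.0, 0.0)
    for _ in range(50):
        x = kantorovich_step(A, f, x)
    print(x)  # compare with the exact solution (1/11, 7/11)
```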

  6. Let us now consider the general method

\[ dx/dt=-\alpha(x)F(x)\operatorname{sign} g(x), \tag{13} \]

where \(g(x)=(P(x),F(x))\).

Analogously to Theorem 2, one proves

Theorem 7. Let the operator \(F\in C^1\) satisfy condition (9) in the sphere \(S_{\rho,x^0}\), where

\[ \rho>\frac{M}{p\gamma}\|F(x^0)\|,\qquad M>0,\qquad p=\frac{mc}{2},\qquad 0<\gamma<1. \]

Let

\[ m\le \alpha(x)\le M. \]

Then:

1) equation (1) has in \(S_{\rho,x^0}\) a solution \(\xi\) (possibly nonunique);

2) the differential equation (13) has a solution \(x(t)\) on the half-infinite interval \([0,\infty)\), for which \(x(0)=x^0\);

3) \(\displaystyle \lim_{t\to\infty}x(t)=\xi\);

4) \(\|F(x(t))\|\le \|F(x^0)\|e^{-pt}\);

5) \(\|x(t)-\xi\|\le \gamma\rho e^{-pt}\).
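The rate \(p=mc/2\) comes from a short computation. The following is a sketch of that step, assuming \(g(x(t))\neq0\) along the trajectory of (13):

\[ \frac{d}{dt}\|F(x(t))\|^2=\Bigl(P(x),\frac{dx}{dt}\Bigr)=-\alpha(x)\,|g(x)|\le -mc\,\|F(x(t))\|^2 , \]

where the inequality uses \(\alpha(x)\ge m\) and (9). Hence \(\|F(x(t))\|\le\|F(x^0)\|e^{-pt}\), and since \(\|dx/dt\|\le M\|F(x(t))\|\), the length of the trajectory does not exceed \(M\|F(x^0)\|/p<\gamma\rho\).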

Remark 2. A special case of this theorem, the convergence of the trajectories \(dx/dt=-F(x)\) under condition (2), is considered in \(({}^9)\).

With respect to the difference approximation of equation (13)

\[ x^{n+1}=x^n-\alpha_n F(x^n)\operatorname{sign} g(x^n) \tag{14} \]

the following is valid.

Theorem 8. If at all points of the sphere \(S_{\rho,x^0}\) the inequalities (9), (10) are satisfied,

\[ \|F(x^0)\|\leq \gamma\rho K(1-\sqrt q)/2c,\qquad 0<\gamma<1,\qquad q=1-K\varepsilon_1\varepsilon_2/2, \]

then the equation \(F(x)=0\) has in \(S_{\rho,x^0}\) at least one solution \(\xi\), and the iterative process (14), where

\[ \varepsilon_1\leq \alpha(x^n)\leq 2c/K-\varepsilon_2,\qquad \varepsilon_1\varepsilon_2>0,\qquad c^2<2K, \]

converges monotonically and strongly to \(\xi\); moreover,

\[ \|x^n-\xi\|\leq \gamma\rho q^{n/2},\qquad 0<q<1. \tag{15} \]

For \(\alpha_n=\alpha=\mathrm{const}\), (14) gives the method of simple iteration. In this case

\[ \|F(x^{n+1})\|^2\leq S(\alpha)\|F(x^n)\|^2,\qquad S(\alpha)=\tfrac12 K\alpha^2-c\alpha+1. \]

It is easy to see that the quadratic trinomial \(S(\alpha)\) assumes its minimal value, equal to \(1-c^2/2K\), at \(\alpha=c/K\), is symmetric with respect to \(\alpha=c/K\), and is equal to one at \(\alpha=0\) and \(\alpha=2c/K\). Hence it follows that the sequence

\[ x^{n+1}=x^n-\frac{c}{K}F(x^n)\operatorname{sign} g(x^n) \]

converges to \(\xi\) with rate (15), where \(q=1-c^2/2K\).
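The stated properties of \(S(\alpha)\) follow by completing the square, and the same rearrangement gives the value of \(q\) used in Theorem 8 (a worked check, assuming \(\varepsilon_1,\varepsilon_2>0\)):

\[ S(\alpha)=\frac K2\Bigl(\alpha-\frac cK\Bigr)^2+1-\frac{c^2}{2K}, \]

\[ S(\alpha)=1-\frac K2\,\alpha\Bigl(\frac{2c}{K}-\alpha\Bigr)\le 1-\frac K2\,\varepsilon_1\varepsilon_2 \qquad\text{for }\ \varepsilon_1\le\alpha\le\frac{2c}{K}-\varepsilon_2 . \]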

If \(\alpha(x)\) and \(g(x)\) change sign simultaneously, then \(\operatorname{sign} g(x^n)\) must be omitted in expression (14). Along with (3) and (8), the method of minimal residuals possesses this property:

\[ x^{n+1}=x^n- \frac{2\bigl(\operatorname{grad}\|F(x^n)\|^2,F(x^n)\bigr)} {\|\operatorname{grad}\|F(x^n)\|^2\|^2}\,F(x^n), \tag{16} \]

previously investigated under assumption (2) for linear and nonlinear equations. Theorems 7 and 8 give new convergence conditions for this method and for its continuous variant.
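For a linear equation \(Ax=f\) with self-adjoint \(A\), the step in (16) reduces to \((Ar,r)/\|Ar\|^2\) with \(r=Ax-f\). The sketch below applies (16) in this special case only; the matrix, right-hand side, starting point, and function names are illustrative assumptions.

```python
import math

def matvec(A, v):
    # Product of a 2x2 matrix (tuple of rows) with a 2-vector.
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def minimal_residual_solve(A, f, x, n_steps):
    # Iteration (16) for F(x) = A x - f with self-adjoint A:
    #   grad||F||^2 = 2 A F,  step = 2 (grad, F) / ||grad||^2.
    residuals = []  # ||A x_n - f|| recorded before each step
    for _ in range(n_steps):
        Ax = matvec(A, x)
        r = (Ax[0] - f[0], Ax[1] - f[1])
        residuals.append(math.hypot(r[0], r[1]))
        Ar = matvec(A, r)
        grad = (2.0 * Ar[0], 2.0 * Ar[1])
        grad_sq = dot(grad, grad)
        if grad_sq == 0.0:
            break  # gradient vanishes; for nonsingular A this means r = 0
        a = 2.0 * dot(grad, r) / grad_sq
        x = (x[0] - a * r[0], x[1] - a * r[1])
    return x, residuals

if __name__ == "__main__":
    A = ((4.0, 1.0), (1.0, 3.0))  # symmetric positive definite (assumed data)
    f = (1.0, 2.0)
    x, res = minimal_residual_solve(A, f, (0.0, 0.0), 10)
    print(x, res[0], res[-1])
```

Each step minimizes the residual norm along the direction \(F(x^n)\), which is why the recorded residuals decrease from one iteration to the next in this example.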

  7. All theorems of the present paper carry over to complex Hilbert spaces. In this case inequality (5), for example, should be replaced by

\[ \left|\operatorname{Re}\bigl(\operatorname{grad}\|F(x)\|^2,F(x)\bigr)\right|^m \geq c\|F(x)\|^2. \]

Verification of the assumptions of the theorems is simplified in an obvious way if it is known that the operator \(D_x\) is sign-constant, i.e. \(\operatorname{Re}(D_xh,h)\geq 0\) \((\leq 0)\).

Scientific Research Radiophysics Institute
at N. I. Lobachevsky Gorky University

Received 10 IX 1969

REFERENCES

  1. L. V. Kantorovich, DAN, 48, No. 7, 483 (1945).
  2. L. V. Kantorovich, UMN, 3, 6, 89 (1948).
  3. M. M. Vainberg, Sibirsk. matem. zhurn., 2, No. 2, 201 (1961).
  4. M. N. Yakovlev, Tr. Matem. inst. im. V. A. Steklova AN SSSR, 84, 8 (1965).
  5. B. T. Polyak, Zhurn. vychislit. matem. i matem. fiz., 3, No. 4, 643 (1963).
  6. S. I. Alber, Ya. I. Alber, DAN, 171, No. 6, 1247 (1966).
  7. Ya. I. Alber, Zhurn. vychislit. matem. i matem. fiz., 9, No. 1, 42 (1969).
  8. G. Temple, Proc. Roy. Soc., 169, 476 (1939).
  9. I. Peterson, Izv. AN EstSSR, ser. fiz.-matem. i tekhn. nauk, 12, No. 2, 123 (1963).
  10. A. M. Ostrowski, Arch. Rational Mech. and Analysis, 26, No. 4, 257 (1967).
  11. L. A. Lyusternik, V. I. Sobolev, Elements of Functional Analysis, “Nauka,” 1965.
