MATHEMATICS
S. G. MIKHLIN
ON THE STABILITY OF THE RITZ METHOD
(Presented by Academician V. I. Smirnov on 23 V 1960)
In the present note we shall use, without reservations or explanations, the terminology, notation, and results of the monograph \((^1)\).
\(1^\circ\). Consider the equation
\[ Au=f, \]
where \(A\) is a positive-definite operator acting in some Hilbert space \(H\). In approximately solving this equation by the Ritz method, one chooses a system of coordinate elements (below we shall call it a coordinate system) \(\varphi_k \in H_A,\ k=1,2,\ldots,\) subject to two conditions: 1) the coordinate elements, taken in any finite number, are linearly independent; 2) the coordinate system is complete in \(H_A\). The Ritz method gave satisfactory results as long as computers were restricted to a small number of coordinate elements. The transition to the use of a large number of coordinate elements (which is necessary for increasing the accuracy of the approximate solution) revealed a certain “instability” of the Ritz method: as the number of coordinate elements used increased, the accuracy of determining the coefficients of the approximate solution began to decrease sharply. The author, who investigated this phenomenon in the article \((^2)\), discovered a certain class of coordinate systems for which the Ritz method remains stable. The author called these systems “reliable”; M. G. Krein \((^3)\), who had studied these systems earlier and from another point of view, called them “Bari bases.”
In the present note some additional considerations concerning reliable systems will be given, and a new, considerably broader class of coordinate systems will be indicated for which the Ritz method preserves stability.
\(2^\circ\). Theorem 1. Let \(A\) and \(B\) be two positive-definite operators with a common domain of definition, and let the operator \(T=A^{-1}(B-A)\) have finite absolute norm in \(H_A\). Then a system orthonormalized in \(H_B\) is reliable in \(H_A\).
The proof is based on Heinz’s theorem \((^4)\) on inequalities between fractional powers of operators.
The conditions of Theorem 1 are satisfied if, for example: 1) \(A\) and \(B\) are positive-definite differential operators of some order \(2l\), defined on functions given in some finite domain of \(m\)-dimensional space and satisfying one and the same boundary conditions; 2) the order of growth of the Green’s functions of both operators and of the derivatives of these functions coincides with the order of growth of the fundamental solution of the equation \(\Delta^l u=0\) and of the derivatives of this solution; 3) \(B-A\) is a differential operator of order \(s,\ s<2l-m/2\).
\(3^\circ\). If the coordinate system \(\{\varphi_k\}\) is reliable in \(H_A\), then the infinite Ritz system
\[ \sum_{k=1}^{\infty} [\varphi_k,\varphi_j]_A a_k = (f,\varphi_j), \quad j=1,2,\ldots, \tag{1} \]
is an equation of the Riesz–Schauder type* in \(l_2\). It is uniquely solvable, and
\[ \sum_{k=1}^{\infty} a_k \varphi_k = A^{-1} f . \]
If \(a_1^{(n)}, a_2^{(n)}, \ldots, a_n^{(n)}\) is the solution of the Ritz system of order \(n\), and we put
\[ a=(a_1,a_2,\ldots,a_n,a_{n+1},\ldots), \qquad a^{(n)}=(a_1^{(n)},a_2^{(n)},\ldots,a_n^{(n)},0,0,\ldots), \]
then
\[ \|a^{(n)}-a\|_{l_2}\to 0. \]
\(4^\circ\). Let us pass to the consideration of a class of coordinate systems more general than the class of reliable systems.
Let the coordinate system be minimal \(({}^5,{}^6)\) in \(H_A\). In this case the limits
\[ a_k=\lim_{n\to\infty} a_k^{(n)}, \tag{2} \]
exist, where \(a_k^{(n)}\), \(k=1,2,\ldots,n\), is the solution of the Ritz system of the \(n\)-th order. The convergence to the limit in formula (2) is uniform with respect to \(k\), if the coordinate system is strongly minimal \(({}^7)\) in \(H_A\). Strong minimality means the following: the coordinate system is minimal in \(H_A\), and, if \(\varphi_k\), \(k=1,2,\ldots\), are the coordinate elements, then the least eigenvalue of the Ritz matrix of order \(n\)
\[ R_n=\bigl\|[\varphi_k,\varphi_j]_A\bigr\|_{j,k=1}^{j,k=n} \tag{3} \]
is bounded below by a positive constant independent of \(n\).
We note that if the system \(\{\varphi_k\}\) is strongly minimal, then the elements of the system biorthogonal to it are uniformly bounded in norm. A proof of this assertion is essentially contained in \(({}^8)\).
\(5^\circ\). Theorem 2. A system reliable in the space \(H_A\) is strongly minimal in this space.
Theorem 3. Let \(A\) and \(B\) be positive-definite operators and \(H_A\subset H_B\). Let \(\varphi_k\in H_A\), \(k=1,2,\ldots\). If the system \(\{\varphi_k\}\) is minimal (strongly minimal) in \(H_B\), then it is minimal (strongly minimal) also in \(H_A\).
In particular, a system \(\{\varphi_k\}\subset H_A\), orthonormalized in \(H_B\), is strongly minimal in \(H_A\). Taking \(B=I\) (\(I\) is the identity operator), we obtain from Theorem 3 that a system minimal (strongly minimal) in \(H\) is minimal (strongly minimal) also in \(H_A\). If \(\varphi_k\in H_A\) and the system \(\{\varphi_k\}\) is orthonormalized in \(H\), then it is strongly minimal in \(H_A\).
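As a simple illustration, not taken from the source, let \(H=L_2(0,1)\), let \(Au=-d^2u/dx^2\) with \(u(0)=u(1)=0\), and take the system \(\varphi_k(x)=\sqrt{2}\,\sin k\pi x\), which is orthonormalized in \(H\). Then
\[ [\varphi_k,\varphi_j]_A=\int_0^1\varphi_k'(x)\,\varphi_j'(x)\,dx= \begin{cases} k^2\pi^2, & j=k,\\ 0, & j\ne k, \end{cases} \]
so that the Ritz matrix of order \(n\) is diagonal with diagonal elements \(\pi^2, 4\pi^2, \ldots, n^2\pi^2\). Its least eigenvalue equals \(\pi^2\) for every \(n\), in agreement with the last assertion.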
Theorem 4. Let the coordinate system be strongly minimal in \(H_A\), and let \(a_k\) be the limits (2). The sequence
\[ a=(a_1,a_2,\ldots,a_n,a_{n+1},\ldots) \]
is an element of the space \(l_2\); if we put
\[ a^{(n)}=(a_1^{(n)},a_2^{(n)},\ldots,a_n^{(n)},0,0,\ldots), \]
then
\[ \lim_{n\to\infty}\|a-a^{(n)}\|_{l_2}=0. \]
* Thus we call an equation of the form \(x+tx=y\), where \(t\) is a completely continuous operator in the space under consideration, \(y\) is a given element and \(x\) the sought element of the same space.
Let us give the proof of Theorem 4. Let \(\lambda_1^{(n)}\) be the smallest eigenvalue of the Ritz matrix of order \(n\), \(R_n\). The coordinate system is strongly minimal in \(H_A\); therefore \(\lambda_1^{(n)} \geqslant \tilde{\lambda}\), where \(\tilde{\lambda}\) is a positive constant. If \(u_n\) is the approximate solution of the Ritz problem, and \(u_0\) its exact solution, then
\[ \|u_n\|_A^2 = \sum_{j,k=1}^{n}[\varphi_k,\varphi_j]_A a_k^{(n)}\overline{a_j^{(n)}} \geqslant \lambda_1^{(n)}\sum_{k=1}^{n}|a_k^{(n)}|^2 \geqslant \tilde{\lambda}\sum_{k=1}^{n}|a_k^{(n)}|^2, \]
and, since (see \((^1)\), p. 94) \(\|u_n\|_A \leqslant \|u_0\|_A\), it follows that
\[ \sum_{k=1}^{n}|a_k^{(n)}|^2 \leqslant \tilde{\lambda}^{-1}\|u_0\|_A^2 . \tag{4} \]
All the more,
\[ \sum_{k=1}^{p}|a_k^{(n)}|^2 \leqslant \tilde{\lambda}^{-1}\|u_0\|_A^2,\qquad p\leqslant n . \]
Letting first \(n\to\infty\) and then \(p\to\infty\), we find that
\[ \sum_{k=1}^{\infty}|a_k|^2 \leqslant \tilde{\lambda}^{-1}\|u_0\|_A^2, \tag{5} \]
and, consequently, \(a\in l_2\). Replacing \(u_0\) in inequality (5) by \(u_0-u_n\), we obtain
\[ \|a-a^{(n)}\|_{l_2}^2 \leqslant \tilde{\lambda}^{-1}\|u_0-u_n\|_A^2 \xrightarrow[n\to\infty]{}0. \tag{6} \]
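The replacement of \(u_0\) by \(u_0-u_n\) can be spelled out as follows; this is a sketch of the omitted step, and the auxiliary index \(m\) is introduced here only for this purpose. Fix \(n\) and let \(m>n\). Since \(u_n\) lies in the span of \(\varphi_1,\ldots,\varphi_n\), its Ritz approximation of order \(m\) is \(u_n\) itself, with the coefficients \(a_k^{(n)}\) (taken equal to zero for \(k>n\)). By linearity, the Ritz coefficients of order \(m\) of the element \(u_0-u_n\) are \(a_k^{(m)}-a_k^{(n)}\), and inequality (4), applied to this element, gives
\[ \sum_{k=1}^{m}\bigl|a_k^{(m)}-a_k^{(n)}\bigr|^2 \leqslant \tilde{\lambda}^{-1}\|u_0-u_n\|_A^2 . \]
Restricting the sum to \(k\leqslant p\leqslant m\) and letting first \(m\to\infty\) and then \(p\to\infty\), exactly as in the derivation of (5), one arrives at the estimate for \(\|a-a^{(n)}\|_{l_2}\).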
Theorem 5. Let the coordinate system be strongly minimal in \(H_A\), and let the quantities \([\varphi_k,\varphi_j]_A\) and \((f,\varphi_j)\) be computed, respectively, with small errors \(\gamma_{kj}=\gamma_{jk}\) and \(\delta_j\). Let \(a^{(n)}=(a_1^{(n)},a_2^{(n)},\ldots,a_n^{(n)})\) and \(a^{(n)'}=(a_1^{(n)'},a_2^{(n)'},\ldots,a_n^{(n)'})\) be, respectively, the solutions of the exact and approximate Ritz systems. Then
\[ \bigl\|a^{(n)'}-a^{(n)}\bigr\| \leqslant \frac{\tilde{\lambda}^{-3/2}\gamma\|u_0\|_A+\tilde{\lambda}^{-1}\delta} {1-\tilde{\lambda}^{-1}\gamma}, \]
where
\[ \gamma^2=\sum_{j,k=1}^{n}|\gamma_{jk}|^2,\qquad \delta^2=\sum_{j=1}^{n}|\delta_j|^2 . \]
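The note states Theorem 5 without proof. One possible route to the estimate is sketched here under the additional assumption \(\tilde{\lambda}^{-1}\gamma<1\); the symbols \(\Gamma_n\), \(b\), and \(d\) are auxiliary notation introduced only for this sketch. Let \(\Gamma_n\) be the matrix of the errors \(\gamma_{kj}\), \(b\) the vector with the components \((f,\varphi_j)\), and \(d\) the vector with the components \(\delta_j\). In matrix form, \(R_n a^{(n)}=b\) and \((R_n+\Gamma_n)a^{(n)'}=b+d\), whence
\[ R_n\bigl(a^{(n)'}-a^{(n)}\bigr)=d-\Gamma_n a^{(n)'} . \]
Since \(\|R_n^{-1}\|\leqslant\tilde{\lambda}^{-1}\), \(\|\Gamma_n\|\leqslant\gamma\), and \(\|d\|=\delta\), it follows that
\[ \bigl\|a^{(n)'}-a^{(n)}\bigr\| \leqslant \tilde{\lambda}^{-1}\Bigl(\delta+\gamma\bigl\|a^{(n)}\bigr\|+\gamma\bigl\|a^{(n)'}-a^{(n)}\bigr\|\Bigr). \]
By (4), \(\|a^{(n)}\|\leqslant\tilde{\lambda}^{-1/2}\|u_0\|_A\); substituting this and solving the inequality for \(\|a^{(n)'}-a^{(n)}\|\) gives the estimate of the theorem.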
Theorem 5 may be interpreted as a theorem on the stability of the Ritz method under the condition of strong minimality of the coordinate system, since the constants \(\tilde{\lambda}\) and \(\|u_0\|_A\) in its estimate do not depend on \(n\).
Let us also note that if the coordinate system is strongly minimal in \(H_A\), then
\[ \sum_{k=1}^{n}|\sigma_{kj}^{(n)}|^2 \leqslant \tilde{\lambda}^{-2}, \qquad j=1,2,\ldots,n, \tag{7} \]
where \(\sigma_{kj}^{(n)}\) are the elements of the matrix \(R_n^{-1}\).
From the theorems given above it follows that, in addition to conditions 1) and 2) mentioned at the beginning of this note, it is expedient to impose on the coordinate system the following additional condition: 3) the coordinate system is strongly minimal in \(H_A\). We remark that condition 1) is a consequence of condition 3). Condition 3) is, of course, not necessary if one intends only to construct a crude approximation using a small number of coordinate elements.
Example. In solving the first boundary-value problem for an ordinary nondegenerate differential equation of the second order, one often uses the coordinate system \(\varphi_k(x)=x^k(1-x)\). From the well-known theorem of G. Müntz (see, for example, \((^6)\)), it follows that this system is nonminimal in \(H=L_2(0,1)\). We shall show that it is not strongly minimal in the corresponding space \(H_A\); for this it suffices to show that it does not satisfy an inequality of the form (7). It is also sufficient to restrict ourselves to the simplest case, when
\[ Au=-\frac{d^2u}{dx^2}, \qquad 0\le x\le 1, \qquad u(0)=u(1)=0. \]
In this case
\[ [\varphi_k,\varphi_j]_A=\frac{2jk}{(j+k)[(j+k)^2-1]}. \]
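The energy product in this case can be computed by direct integration; the abbreviation \(s=j+k\) is introduced here only for brevity. For the operator under consideration \([u,v]_A=\int_0^1 u'v'\,dx\), and \(\varphi_k'(x)=kx^{k-1}-(k+1)x^k\), so that
\[ [\varphi_k,\varphi_j]_A=\frac{jk}{s-1}-\frac{2jk+s}{s}+\frac{jk+s+1}{s+1} =jk\left(\frac{1}{s-1}-\frac{2}{s}+\frac{1}{s+1}\right)=\frac{2jk}{s(s^2-1)}. \]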
The elements \(\sigma_{kj}^{(n)}\) of the matrix \(R_n^{-1}\) satisfy the relation
\[ \sum_{k=1}^{n}\frac{2jk\,\sigma_{kj}^{(n)}}{(j+k)[(j+k)^2-1]}=1. \]
Hence
\[ \sum_{k=1}^{n}\left|\sigma_{kj}^{(n)}\right|^2\ge cn;\qquad c=\mathrm{const}>0, \]
and an inequality of the form (7) does not hold.
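The growth of the sums \(\sum_k|\sigma_{kj}^{(n)}|^2\) can be observed directly for small \(n\). The following Python sketch is an illustration and not part of the original note; the function names `ritz_matrix`, `invert`, `column_square_sums`, and `ritz_coefficients` are hypothetical, and the right-hand side \(f=1\) is an assumption made for the demonstration. It builds \(R_n\) for \(\varphi_k(x)=x^k(1-x)\) in exact rational arithmetic, inverts it, and solves the Ritz system with an optional error in the right-hand side. Exact fractions are used so that the observed growth is not contaminated by rounding.

```python
from fractions import Fraction


def ritz_matrix(n):
    """Exact Ritz matrix R_n for phi_k(x) = x^k (1 - x) and A = -d^2/dx^2."""
    return [[Fraction(2 * j * k, (j + k) * ((j + k) ** 2 - 1))
             for k in range(1, n + 1)]
            for j in range(1, n + 1)]


def invert(matrix):
    """Invert a nonsingular matrix of Fractions by Gauss-Jordan elimination."""
    n = len(matrix)
    # Work on the augmented matrix [R | I]; it ends up as [I | R^{-1}].
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Any nonzero pivot will do, since the arithmetic is exact.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def column_square_sums(n):
    """Return sum_k |sigma_kj|^2 for j = 1..n, sigma_kj being the entries of R_n^{-1}."""
    inv = invert(ritz_matrix(n))
    return [sum(inv[k][j] ** 2 for k in range(n)) for j in range(n)]


def ritz_coefficients(n, rhs_error=None):
    """Solve the order-n Ritz system for f = 1, where (f, phi_j) = 1/((j+1)(j+2)).

    rhs_error, if given, is a list of n errors added to the right-hand side.
    """
    rhs = [Fraction(1, (j + 1) * (j + 2)) for j in range(1, n + 1)]
    if rhs_error is not None:
        rhs = [b + e for b, e in zip(rhs, rhs_error)]
    inv = invert(ritz_matrix(n))
    return [sum(inv[k][j] * rhs[j] for j in range(n)) for k in range(n)]


if __name__ == "__main__":
    # Print the largest column sum of squares of R_n^{-1} for small n.
    for n in range(1, 9):
        print(n, float(max(column_square_sums(n))))
```

For \(n=2\) the inverse matrix has the rows \((8,-10)\) and \((-10,20)\), so the two sums equal 164 and 500. With \(f=1\) the exact solution is \(u_0=x(1-x)/2\), and the unperturbed Ritz coefficients are \((1/2,0,\ldots,0)\) for every \(n\). An error \(\varepsilon\) in the \(j\)-th right-hand side changes the coefficient vector by \(\varepsilon\) times the \(j\)-th column of \(R_n^{-1}\), which is why the growth of these sums signals the loss of stability.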
Leningrad State University
named after A. A. Zhdanov
Received 17 V 1960
References
\(^1\) S. G. Mikhlin, Variational Methods in Mathematical Physics, 1957.
\(^2\) S. G. Mikhlin, Izv. Vyssh. Uchebn. Zaved., Matematika, No. 5, 91 (1958).
\(^3\) M. G. Krein, UMN, 12, issue 3 (75), 333 (1957).
\(^4\) E. Heinz, Math. Ann., 123, Heft 4, 415 (1951).
\(^5\) S. Lewin, Math. Zs., 32, Heft 4, 491 (1930).
\(^6\) S. Kaczmarz, H. Steinhaus, Theory of Orthogonal Series, IL, 1958.
\(^7\) A. T. Taldykin, Matem. sborn., 29 (71), No. 1, 79 (1951).
\(^8\) A. T. Taldykin, DAN, 26, No. 4, 540 (1940).