S. N. SLUGIN
THE METHOD OF STEEPEST DESCENT IN A HILBERT MODULE OVER A FINITE-DIMENSIONAL COMPLEX \(K\)-SPACE
(Presented by Academician S. L. Sobolev on 16 IV 1963)
- The set \(V\) of all \(m\)-dimensional vectors \(v=\{\lambda_n\}=\sum_{n=1}^{m}\lambda_n e_n\) with complex coordinates is naturally converted into a \(K^2\)-space \((^1)\), where by convergence in the \(K\)-space \((^2)\) \(Z\) of all \(m\)-dimensional vectors with real coordinates one means \((o)\)-convergence, coinciding with coordinatewise convergence; the unit \(1=\{1\}=\sum_{n=1}^{m} e_n\). For \(v=\{\lambda_n\}\), \(w=\{\mu_n\}\) the product \(vw=\{\lambda_n\mu_n\}\), and the quotient \(v/w=\{\nu_n\}\), where \(\nu_n=\lambda_n/\mu_n\) for \(\mu_n\ne 0\) and \(\nu_n=0\) for \(\mu_n=0\). The functional \(J\), defined by the equality
\[ J\left(\sum_{n=1}^{m}\lambda_n e_n\right)=\sum_{n=1}^{m}\lambda_n, \]
is \((o)\)-linear and strictly positive in \(Z\).
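The arithmetic of \(V\) described above can be sketched in a few lines of Python. This is only an illustrative model: the names `v_mul`, `v_div`, `unit` and `J` are hypothetical and do not come from the paper, and a vector of \(V\) is represented as a plain list of \(m\) numbers.

```python
from typing import List

Vector = List[complex]  # an element of V: m complex coordinates


def v_mul(v: Vector, w: Vector) -> Vector:
    """Coordinatewise product vw = {lambda_n * mu_n}."""
    return [a * b for a, b in zip(v, w)]


def v_div(v: Vector, w: Vector) -> Vector:
    """Quotient v/w with the convention nu_n = 0 wherever mu_n = 0."""
    return [a / b if b != 0 else 0 for a, b in zip(v, w)]


def unit(m: int) -> Vector:
    """The unit 1 = {1} = e_1 + ... + e_m."""
    return [1] * m


def J(v: Vector) -> complex:
    """The functional J: the sum of the coordinates."""
    return sum(v)
```

With this convention `v_mul(v_div(v, w), w)` returns `v` with the coordinates zeroed wherever `w` vanishes, which is the usual behaviour of the quotient in a space of bounded elements.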
- Let \(X\) be an \(M_V\)-module \((^{1,3})\). Since in \(X\), by definition, products \(vx\) always exist for \(|v|\le \lambda 1\), and \(Z\) is an extended \(K\)-space of bounded elements, \(X\) is an absolute \(M_V\)-module \((^{1,3})\). Let \(X\) simultaneously be a Banach space, and suppose that for all \(n\) the inequalities
\[ \|xe_n\|\le \|x\| \tag{1} \]
hold. Define the structural norm
\[ |x|=\sum_{n=1}^{m}\|xe_n\|e_n. \]
Then the inequalities
\[ |x|\le \|x\|1,\qquad \|x\|\le J|x|, \]
hold, whence follows the equivalence of boundedness of a set (or convergence of a direction \((^4)\)) in \(X\) in the sense of a Banach space and boundedness of a set (or, respectively, convergence of a direction) in the structural norm. From the coincidence of the meanings of convergence of directions follows the fact that \(X\) is an absolute module of type \(B_{K^2} (^{1,3})\). We shall call \(X\) a Banach module over \(V\).
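The two inequalities for the structural norm can be checked directly. The following short verification assumes that \(x=\sum_{n=1}^{m}xe_n\), which is what \(1=\sum_{n=1}^{m}e_n\) suggests; it uses only (1) and the triangle inequality:
\[ |x|=\sum_{n=1}^{m}\|xe_n\|e_n\le \sum_{n=1}^{m}\|x\|e_n=\|x\|1,\qquad \|x\|=\Bigl\|\sum_{n=1}^{m}xe_n\Bigr\|\le \sum_{n=1}^{m}\|xe_n\|=J|x|. \]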
- If \(X\) is an \(M_V\)-module and a Hilbert space (complex) with scalar product \([x,y]\), and if for all \(n\) the equalities
\[ [x,ye_n]=[e_nx,ye_n], \]
hold, then, introducing the structural product (previously called in \((^{1,3})\) “scalar”)
\[ (x,y)=\sum_{n=1}^{m}[x,ye_n]\,e_n, \]
we embed \(X\) in an absolute module of type \(H_{K^2}\) \((^{1,3})\) and call it a Hilbert module over \(V\). The following equalities hold:
\[ [x,y]=J(x,y),\qquad [e_nx,y]=[x,ye_n]. \]
Since \(|xe_n|\le |x|\) and \(\|y\|^2=J(|y|^2)\), the inequalities (1) hold, and the Hilbert module is a Banach module.
In what follows, depending on the meaning of the norm or product, we shall also denote \(X\) by one of the symbols \(B, B_V\) or \(H, H_V\).
If \(x\) is disjoint from \(y\) in \(H_V\), i.e. \(|x|\,|y|=0\) in \(Z\), then \((^3)\) \(x\perp y\) in \(H_V\), and, consequently, \(x\perp y\) in \(H\).
- Let \(U\) be an operation taking \(X=B_V\) into the Banach module \(Y\) over \(V\). Introduce the \(kn\)-section of the operation \(U\):
\[ U_{kn}(x)=e_kU(e_nx) \]
and the “projector with renumbering” in the space \(V\):
\[ \Pi_{kn}\left(\sum_i \lambda_i e_i\right)=\lambda_n e_k. \]
A linear transformation \(A\sim (u_{kn})\) in the space \(V\) can be written as
\[ A=\sum_{kn} u_{kn}\Pi_{kn}. \]
Let \(U\) be a linear (bounded) operation taking \(X=B\) into the given Banach space \(Y\). Then all sections \(U_{kn}\) are also linear (bounded) in the same sense,
\[ u_{kn}=\|U_{kn}\|\leq \|U\|, \]
and the operation \(U\) is regular, taking \(X=B_V\) into the Banach module \(Y\), with \((o)\)-linear majorant \(A\):
\[ U\ll A=\sum_{kn}u_{kn}\Pi_{kn}. \]
The estimate for \(U\) is preserved also in the case when \(Y\) is a Banach module over a finite-dimensional \(K^2\)-space of another dimension.
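As an illustration only, consider the simplest hypothetical case \(m=2\), \(X=Y=V\) with the Euclidean norm, so that (1) holds, and let \(U\) be given by a matrix \((\alpha_{kn})\) (the symbols \(\alpha_{kn}\) are introduced here and are not in the original). For \(x=\lambda_1e_1+\lambda_2e_2\) the sections and the majorant then take the explicit form
\[ U_{kn}(x)=\alpha_{kn}\lambda_n e_k,\qquad u_{kn}=\|U_{kn}\|=|\alpha_{kn}|\le \|U\|, \]
\[ |Ux|=\sum_{k}\Bigl|\sum_{n}\alpha_{kn}\lambda_n\Bigr|e_k\le \sum_{k}\sum_{n}|\alpha_{kn}|\,|\lambda_n|\,e_k=A|x|, \]
so that in this case the estimate \(U\ll A\) reduces to the triangle inequality applied in each coordinate.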
- An operation \(U\) for which
\[ U(vx)=vU(x) \]
for all \(x\) in the domain of definition of the given operation and all \(v\in V\), will be called \((v)\)-homogeneous. For its sections the equalities hold
\[ U_{kn}(x)=\delta_{kn}e_kU(x)=\delta_{kn}U(e_kx),\qquad \delta_{kn}=0\ (k\ne n),\quad \delta_{nn}=1. \]
In particular, for the identity operation \(I\),
\[ I_{kn}x=\delta_{kn}e_kx. \]
If \(U\) is invertible and \((v)\)-homogeneous, then \(U^{-1}\) is also \((v)\)-homogeneous. If the operation \(U\) is linear (bounded) in \(B\) and \((v)\)-homogeneous, then a majorant for \(U\) is the operation of multiplication by an element of \(Z\):
\[ |U|(z)\leq u\cdot z,\qquad u=\sum_{n=1}^{m}u_{nn}e_n\in Z. \]
- If \(X=Y=H_V\) and the operator \(U\) is self-adjoint in \(H\), then \(U_{kn}^{*}=U_{nk}\) in \(H\), and the matrix \((b_{kn})\), where \(b_{kn}=[U_{kn}x,x]\), is self-adjoint for all \(x\). Construct the matrix \((c_{kn})\) by deleting from \((b_{kn})\) all rows and columns with those numbers \(l_i\) for which the projections \(xe_{l_i}=0\) and, consequently, \(b_{l_i n}=b_{k l_i}=0\) for all \(k,n\). If \(U\) is positive definite in \(H\), then \((c_{kn})\) corresponds to a positive definite quadratic form and \(\Delta=\det(c_{kn})\ne 0\), since \(\Delta\) is the Gram determinant of the system of elements \(Ce_nx\) \((x\ne 0,\ n\ne l_i)\), where \(C=\sqrt{U}\) in \(H\). The system \(\{Ce_nx\}\) \((n\ne l_i)\) is linearly independent by virtue of the invertibility of the operator \(C\) and the pairwise disjointness in \(H_V\) of the elements \(e_nx\ne 0\) for \(n\ne l_i\).
- If \(U=U^*\) in \(H\) and \(U\) is \((v)\)-homogeneous, then \(U=U^*\) also in \(H_V\), and the matrix \((b_{kn})\) becomes diagonal:
\[ b_{nn}=[Ux,xe_n],\qquad (Ux,x)=\sum_n b_{nn}e_n\in Z. \]
- Let, in the equation \(Ux=f\), the operator \(U\) be bounded and positive definite in \(H\). We shall construct a minimizing sequence \(\{x_p\}\) for the quadratic functional \(F(x)=[Ux,x]-2\operatorname{Re}[x,f]\) in the form
\[ x_{p+1}=x_p-v_p y_p,\qquad y_p=Ux_p-f, \]
where \(v_p=\{\varepsilon_{np}\}_n\). The expression \(F(x_0-vy_0)-F(x_0)\) is a function of the coordinates \(\lambda_j\) \((j\ne l_i)\) of the vector \(v\):
\[ \Phi=\sum_{kn} c_{kn}\lambda_n\overline{\lambda}_k-\sum_k d_k\overline{\lambda}_k-\sum_k \overline{d}_k\lambda_k, \]
where \(k,n\ne l_i,\ d_k=[y_0,y_0e_k],\ c_{kn}=[Ue_ny_0,y_0e_k]\). The point of minimum is determined by the equations
\[ \frac{\partial \Phi}{\partial \overline{\lambda}_j}\equiv \sum_n c_{jn}\lambda_n-d_j=0\qquad (j,n\ne l_i); \]
this system is solvable, since \(\Delta=\det(c_{kn})\ne 0\). Thus, an algorithm of the method of steepest descent in the Hilbert module over \(V\) is constructed:
\[ x_{p+1}=x_p-\sum_{n=1}^{m}\varepsilon_{np}e_ny_p,\qquad y_p=Ux_p-f\qquad (p=0,1,\ldots); \tag{2} \]
the coefficients \(\varepsilon_{np}\) are found from the system of equations:
\[ \varepsilon_{l_i p}=0\quad \text{when } y_p e_{l_i}=0 \]
(if such \(l_i\) exist),
\[ \sum_n [Ue_ny_p,y_pe_k]\varepsilon_{np}=[y_p,y_pe_k]\qquad (k,n\ne l_i). \tag{3} \]
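Algorithm (2), (3) can be tried out numerically. The sketch below is a minimal illustration under stated assumptions, not the author's implementation: \(H\) is modelled as a finite-dimensional coordinate space split into \(m\) index blocks, \(e_n\) acts as the coordinate projection onto block \(n\), and \(U\) is a Hermitian positive definite matrix. All function names (`inner`, `matvec`, `project`, `solve`, `descent_step`, `descend`) and the stopping tolerance are hypothetical.

```python
from typing import List, Sequence

Vec = List[complex]


def inner(x: Vec, y: Vec) -> complex:
    """Scalar product [x, y]: linear in x, conjugate-linear in y."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))


def matvec(U: Sequence[Vec], x: Vec) -> Vec:
    return [sum(u * a for u, a in zip(row, x)) for row in U]


def project(x: Vec, block: Sequence[int]) -> Vec:
    """e_n x: keep the coordinates of one block, zero the others."""
    keep = set(block)
    return [a if i in keep else 0 for i, a in enumerate(x)]


def solve(A: List[Vec], b: Vec) -> Vec:
    """Gaussian elimination with partial pivoting (A assumed nonsingular)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            M[r] = [a - t * q for a, q in zip(M[r], M[c])]
    x = [0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def descent_step(U, f, x, blocks):
    """One step of (2): x - sum_n eps_n e_n y, with eps_n taken from system (3)."""
    y = [a - b for a, b in zip(matvec(U, x), f)]
    ey = [project(y, blk) for blk in blocks]
    # Indices n != l_i: blocks on which the residual does not vanish.
    active = [n for n, p in enumerate(ey) if any(a != 0 for a in p)]
    if not active:
        return list(x)  # y = 0, so x already solves Ux = f
    C = [[inner(matvec(U, ey[n]), ey[k]) for n in active] for k in active]
    d = [inner(y, ey[k]) for k in active]
    eps = solve(C, d)
    for e, n in zip(eps, active):
        x = [a - e * b for a, b in zip(x, ey[n])]
    return x


def descend(U, f, x0, blocks, steps, tol=1e-12):
    """Repeat descent_step until the residual norm drops below tol."""
    x = list(x0)
    for _ in range(steps):
        y = [a - b for a, b in zip(matvec(U, x), f)]
        if abs(inner(y, y)) ** 0.5 <= tol:
            break
        x = descent_step(U, f, x, blocks)
    return x
```

For example, with a \(4\times 4\) matrix and `blocks = [[0, 1], [2, 3]]` each step solves a \(2\times 2\) system (3); with a single block containing all indices the step reduces to the classical steepest descent in \(H\).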
Since any product of the form \(\varepsilon y\equiv \varepsilon 1y\), where \(\varepsilon\) is a number, is one of the products of the form \(vy\), where \(v\in V\), it follows that \(F(x_1)\leqslant F(x_0-\varepsilon y_0)\) for every number \(\varepsilon\), and the first approximation \(x_1\), obtained by formula (2), is energetically \((^5)\) closer to the solution \(x^*\) of the equation \(Ux=f\):
\[ \|C(x_1-x^*)\|\leqslant \|C(\bar x_1-x^*)\|\qquad (C=\sqrt{U}) \]
in comparison with the first approximation \(\bar x_1=x_0-\varepsilon_0y_0\) (where \([Uy_0,y_0]\varepsilon_0=[y_0,y_0]\)), which is obtained by the method of steepest descent \((^4)\) in the Hilbert space \(H\). Using this (cf. \((^4)\), Ch. 15, § 1), we establish that \(x_p\to x^*\),
\[ \|C(x_p-x^*)\|\leqslant q^p\|C(x_0-x^*)\|,\qquad \|x_p-x^*\|\leqslant q^p\|y_0\|/m, \tag{4} \]
where \(q=(M-m)/(M+m)\); \(M\geqslant m\) are the bounds of the operator \(U\) in \(H\).
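The estimate of \(\|x_p-x^*\|\) by \(q^p\|y_0\|/m\) in (4) follows from the energy estimate in a few steps. The chain below is a reconstruction under the stated hypotheses, with \(m\) denoting the lower bound of \(U\) (not the dimension); it uses \(m\|x\|^2\le [Ux,x]=\|Cx\|^2\), the identity \(C(x_0-x^*)=C^{-1}y_0\) and \(\|C^{-1}\|\le m^{-1/2}\):
\[ \|x_p-x^*\|\le m^{-1/2}\|C(x_p-x^*)\|\le m^{-1/2}q^p\|C(x_0-x^*)\|=m^{-1/2}q^p\|C^{-1}y_0\|\le q^p\|y_0\|/m. \]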
- If, moreover, the operator \(U\) is \((v)\)-homogeneous, then system (3) assumes the simplest form:
\[ [Uy_p,y_pe_n]\varepsilon_{np}=[y_p,y_pe_n]\qquad (n\ne l_i), \]
the coefficients \(\varepsilon_{np}\) here are real. In this case the estimates (4) of convergence of the method can be strengthened:
\[ \|e_n C(x_p-x^*)\|\leqslant q_n^p\|e_n C(x_0-x^*)\|, \]
\[ \|(x_p-x^*)e_n\|\leqslant q_n^p\|y_0e_n\|/m_n\qquad (n=1,\ldots,m), \tag{5} \]
where \(q_n=(M_n-m_n)/(M_n+m_n)\); \(M_n\geqslant m_n\) are the bounds of the operator \(U\) in the subspace \(H_n\subset H\) of all \(x\) that coincide with their projections \(xe_n\). Obviously, \(M\geqslant M_n\geqslant m_n\geqslant m,\ q_n\leqslant q\).
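In the \((v)\)-homogeneous case each coefficient \(\varepsilon_{np}\) is a single quotient, and the estimate (5) can be observed numerically. The following sketch assumes real data and a block-diagonal symmetric matrix \(U\), which is one simple model of a \((v)\)-homogeneous operator; the names and the test data (`U`, `BLOCKS`, `X_STAR`, `BOUNDS`) are hypothetical, with the bounds \(m_n, M_n\) of each block known in advance.

```python
from typing import List, Sequence


def matvec(U: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    return [sum(u * a for u, a in zip(row, x)) for row in U]


def block_norm(x: Sequence[float], blk: Sequence[int]) -> float:
    """Norm of the projection x e_n onto one block."""
    return sum(abs(x[i]) ** 2 for i in blk) ** 0.5


def homogeneous_step(U, f, x, blocks):
    """One step of (2) with eps_n = [y, y e_n] / [U y, y e_n] on each block."""
    y = [a - b for a, b in zip(matvec(U, x), f)]
    Uy = matvec(U, y)
    new_x = list(x)
    for blk in blocks:
        num = sum(y[i] * y[i] for i in blk)   # [y, y e_n]
        den = sum(Uy[i] * y[i] for i in blk)  # [U y, y e_n]
        if num == 0:
            continue  # y e_n = 0, so eps_n = 0
        eps = num / den
        for i in blk:
            new_x[i] = x[i] - eps * y[i]
    return new_x


def run(U, f, x0, blocks, steps):
    """Return the list of iterates x_0, x_1, ..., x_steps."""
    xs = [list(x0)]
    for _ in range(steps):
        xs.append(homogeneous_step(U, f, xs[-1], blocks))
    return xs


# Hypothetical test data: two 2x2 blocks with eigenvalues {1, 3} and {2, 10}.
U = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 6.0, 4.0],
     [0.0, 0.0, 4.0, 6.0]]
BLOCKS = [[0, 1], [2, 3]]
X_STAR = [1.0, 2.0, 3.0, -1.0]
F_RHS = matvec(U, X_STAR)
BOUNDS = [(1.0, 3.0), (2.0, 10.0)]  # (m_n, M_n) for each block
```

Here \(q_1=1/2\) and \(q_2=2/3\), so the first block converges faster than the global rate \(q=(10-1)/(10+1)\) would suggest; this is the sense in which (5) strengthens (4).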
Below we shall establish conditions sufficient for carrying out a stationary process (7) of approximations, which is a modification of algorithm (2).
- If the additive operation \(A\) maps some \(K\)-space \(Z\) into itself, is positive (in the sense of the structure) in it, and there also exists a positive operation \((I-A)^{-1}\), defined on all of \(Z\), then the series \(\sum_{p=0}^{\infty} A^{p}z\) \((o)\)-converges for every \(z\).
- If in some space \(X\) of type \(B_K\) with norms \(|x| \in Z\) (in particular, in \(X=B_V\)) a regular operation \(T\) has an \((o)\)-linear majorant \(A\), \(T\ll A\), and moreover \((I-A)^{-1} > 0\) in \(Z\), then the algorithm \(x_{p+1}=Tx_p+\psi\) \((bk)\)-converges to the solution \(x^*\) of the equation \(x=Tx+\psi\) with rate
\[ |x_p-x^*| \leq A^p (I-A)^{-1}a \to 0, \tag{6} \]
where \(a=|x_0-Tx_0-\psi|\).
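The rate (6), read with structural norms on both sides, can be illustrated in the smallest nontrivial setting. The sketch below assumes \(m=2\), \(X=Z\) the real plane with \(|x|\) the vector of absolute values of the coordinates, \(T\) a real \(2\times 2\) matrix and \(A\) the matrix of the absolute values of its entries; all names and numbers are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matvec(A: Matrix, x: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def inv2(M: Matrix) -> Matrix:
    """Inverse of a 2x2 matrix by the explicit formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def iterate(T: Matrix, psi: List[float], x0: List[float], steps: int):
    """Iterates of x_{p+1} = T x_p + psi, starting from x0."""
    xs = [list(x0)]
    for _ in range(steps):
        xs.append([t + s for t, s in zip(matvec(T, xs[-1]), psi)])
    return xs


def majorant_bounds(A: Matrix, a: List[float], steps: int):
    """The vectors A^p (I - A)^{-1} a for p = 0, ..., steps."""
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)]
                 for i in range(2)]
    b = matvec(inv2(i_minus_a), a)
    out = [b]
    for _ in range(steps):
        b = matvec(A, b)
        out.append(b)
    return out


# Hypothetical data: T with entrywise majorant A, and (I - A)^{-1} > 0.
T = [[0.2, -0.3], [0.1, 0.4]]
A = [[abs(t) for t in row] for row in T]
PSI = [1.0, 2.0]
X0 = [5.0, -4.0]
X_STAR = [0.0, 10.0 / 3.0]  # solves x = T x + psi
```

Comparing `abs(iterate(...)[p][i] - X_STAR[i])` with `majorant_bounds(...)[p][i]` coordinate by coordinate gives a bound for each component separately, which is finer information than a single scalar norm estimate.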
- In what follows \(X=H_V\) and the operator \(U=U^*\) in \(X=H\). Hence follows the symmetry of the matrix \((u_{kn})\) (see above) and the self-adjointness of the operators \(U_{nn}\). Denote \(e_{kn}=e_k+(1-\delta_{kn})e_n\). Compose the sets \(S_{kn}\) of all \(x=xe_{kn}\) with norm \(\|x\|=1\). Introduce the notation:
\[ m'_{kn}=\inf_{S_{kn}} [Ux,x], \qquad m''_{kn}=\sup_{S_{kn}} [Ux,x], \qquad m_n=m'_{nn}, \qquad M_n=m''_{nn}; \]
\[ q_{kn}=\max_{\nu=1,2} |m^{(\nu)}_{kn}|=\sup_{S_{kn}} |[Ux,x]|, \qquad q_n=q_{nn}. \]
Then
\[ u_{kn}\leq q_{kn}=q_{nk}, \qquad u_{nn}=q_n \quad (u_{kn}=\|U_{kn}\|). \]
- Introduce the operators
\[ (zU)x=\sum_{n=1}^{m}\varepsilon_n e_n Ux, \qquad T=I-zU, \qquad T_n=T_{nn}, \]
where the \(\varepsilon_n\) are nonzero real numbers. Then the equations \(Ux=f\) and \(x=Tx+\psi\) (where \(\psi=zf,\ z=\{\varepsilon_n\}\)) are equivalent. The following estimates are valid:
\[ \|T_n\|=\max_{\nu}|1-\varepsilon_n m_{nn}^{(\nu)}|, \qquad \|T_{kn}\|=|\varepsilon_k|u_{kn}\leq |\varepsilon_k|q_{kn}\quad (k\ne n). \]
- Suppose that for the operator \(U\) it is possible to choose real coefficients \(\varepsilon_n\ne 0\) and numbers \(w_{kn}\geq u_{kn}\) for \(k\ne n\) (for example, \(w_{kn}=q_{kn}\)) such that \((I-A)^{-1}>0\), where \(A\sim(a_{kn})\),
\[ a_{nn}=\max_{\nu}|1-\varepsilon_n m_{nn}^{(\nu)}|, \qquad a_{kn}=|\varepsilon_k|w_{kn}\quad (k\ne n), \]
for example, \(a_{kn}=\max_{\nu}|\delta_{kn}-\varepsilon_k m_{kn}^{(\nu)}|\). Then the algorithm
\[ x_{p+1}=x_p-\sum_{n=1}^{m}\varepsilon_n e_n(Ux_p-f) \tag{7} \]
converges to \(x^*\) with rate (6), where
\[ a=\sum_{n=1}^{m}|\varepsilon_n|\|(Ux_0-f)e_n\|e_n. \]
- If the operator \(U\) is positive definite in \(H\) and \((v)\)-homogeneous, then all \(T_{kn}=0\) for \(k\ne n\), and therefore the choice of the indicated numbers \(\varepsilon_n\) becomes possible, for example, \(\varepsilon_n=2/(M_n+m_n)\). In this case \(T(z)\leq Q\cdot z\), \(Q=\{q_n\}\). We thereby establish that the estimates (5) are valid also for the process (7) in the case under consideration.
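The stationary process (7) with \(\varepsilon_n=2/(M_n+m_n)\) is easy to try on a small example. The sketch below again assumes real data and a block-diagonal symmetric matrix as a model of a \((v)\)-homogeneous operator, with the bounds \(m_n, M_n\) supplied by hand; the names and the data are hypothetical.

```python
from typing import List, Sequence, Tuple


def matvec(U: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    return [sum(u * a for u, a in zip(row, x)) for row in U]


def block_norm(x: Sequence[float], blk: Sequence[int]) -> float:
    """Norm of the projection x e_n onto one block."""
    return sum(abs(x[i]) ** 2 for i in blk) ** 0.5


def stationary_eps(bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """The fixed coefficients eps_n = 2 / (M_n + m_n)."""
    return [2.0 / (hi + lo) for lo, hi in bounds]


def stationary_run(U, f, x0, blocks, eps, steps):
    """Iterates of process (7): x - sum_n eps_n e_n (U x - f)."""
    xs = [list(x0)]
    for _ in range(steps):
        x = xs[-1]
        y = [a - b for a, b in zip(matvec(U, x), f)]
        new_x = list(x)
        for e, blk in zip(eps, blocks):
            for i in blk:
                new_x[i] = x[i] - e * y[i]
        xs.append(new_x)
    return xs


# Hypothetical test data: two 2x2 blocks with eigenvalues {1, 3} and {2, 10}.
U = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 6.0, 4.0],
     [0.0, 0.0, 4.0, 6.0]]
BLOCKS = [[0, 1], [2, 3]]
BOUNDS = [(1.0, 3.0), (2.0, 10.0)]  # (m_n, M_n) for each block
X_STAR = [1.0, 2.0, 3.0, -1.0]
F_RHS = matvec(U, X_STAR)
```

Unlike algorithm (2), no scalar products are computed at each step, at the price of needing the bounds \(m_n, M_n\) beforehand.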
Gorky State University named after N. I. Lobachevsky
Received 5 IV 1963
CITED LITERATURE
¹ S. N. Slugin, DAN, 147, No. 2, 306 (1962). ² L. V. Kantorovich, B. Z. Vulikh, A. G. Pinsker, Functional Analysis in Semi-Ordered Spaces, 1950. ³ S. N. Slugin, DAN, 139, No. 5, 1059 (1961). ⁴ L. V. Kantorovich, G. P. Akilov, Functional Analysis in Normed Spaces, 1959. ⁵ S. G. Mikhlin, Variational Methods in Mathematical Physics, 1957.