UDC 518:512.25
MATHEMATICS
Submitted 1968-01-01 | RussiaRxiv: ru-196801.48108 | Translated from Russian

Corresponding Member of the Academy of Sciences of the USSR G. I. Marchuk, Yu. A. Kuznetsov

ON THE QUESTION OF OPTIMAL ITERATIVE PROCESSES

We shall denote by \(V_n\) the space of \(n\)-dimensional real vectors over the field of real numbers with scalar product
\[ (\varphi,\psi)=\sum_{i=1}^{n}\varphi_i\psi_i\quad(\varphi,\psi\in V_n). \]
Let \(Q\) be some subspace of \(V_n\). A matrix \(B\)* is called positive definite on \(Q\), written \(B\underset{Q}{>}0\), if \((B\varphi,\varphi)>0\) for every \(\varphi\in Q\), \(\varphi\ne\Theta\) (here \(\Theta\) is the zero vector), and positive semidefinite on \(Q\), written \(B\underset{Q}{\ge}0\), if \((B\varphi,\varphi)\ge0\) for all \(\varphi\in Q\). The spectral radius of the matrix \(B\) is denoted by \(\rho(B)\). All matrices in the present work are assumed to be real.
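As an illustration of the definition just given, the following minimal sketch tests positive definiteness on a subspace numerically. It assumes that \(Q\) is supplied as the column span of a full-rank matrix `Z`; the function name, the argument names, and the tolerance are hypothetical and do not come from the paper.

```python
import numpy as np

def is_positive_definite_on(Bm, Z, tol=1e-12):
    """Check (B phi, phi) > 0 for all nonzero phi in span(Z).

    Assumption: the columns of Z are linearly independent and span Q.
    """
    Qb, _ = np.linalg.qr(Z)          # orthonormal basis of Q
    M = Qb.T @ Bm @ Qb               # restriction of the quadratic form to Q
    sym = 0.5 * (M + M.T)            # only the symmetric part enters (B phi, phi)
    return bool(np.linalg.eigvalsh(sym).min() > tol)

# B is indefinite on V_2 but positive definite on the first coordinate axis.
Bm = np.diag([1.0, -1.0])
on_axis = is_positive_definite_on(Bm, np.array([[1.0], [0.0]]))
on_whole_space = is_positive_definite_on(Bm, np.eye(2))
```

Only the symmetric part of the restricted matrix matters here, since \((B\varphi,\varphi)\) does not change when \(B\) is replaced by \((B+B^*)/2\).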

For the solution of the system of linear algebraic equations
\[ A\varphi=f, \tag{1} \]
where \(A\) is a nonsingular matrix and \(f\in V_n\), we propose the iterative process
\[ \varphi^{k+1}=\varphi^k-B\left[\sum_{i=1}^{p}\gamma_{ik}(AB)^{i-1}\right](A\varphi^k-f), \quad \varphi^k\in U^0 \quad (k=0,1,\ldots). \tag{2} \]

Here \(B\) is a matrix such that \(B(A\varphi-f)\ne\Theta\) for every \(\varphi\in U^0\), \(\varphi\ne A^{-1}f\); \(\gamma_{ik}=\gamma_i(\xi^k)\), where \(\xi^k=A\varphi^k-f\); and the \(\gamma_i\) \((i=1,\ldots,p)\) belong to a set \(G\) of real functionals defined for all \(\xi\in V_n\), \(\xi\ne\Theta\). We assume that \(U^0\) is a closed subset of the space \(V_n\) such that \(U_A=\{\psi:\psi=A\varphi-f,\ \varphi\in U^0\}\) is a linear subspace of \(V_n\) and, for any \(\varphi\in U^0\) and any real numbers \(\alpha_i\) \((i=1,\ldots,p)\), the vector
\[ \psi=\varphi-B\left[\sum_{i=1}^{p}\alpha_i(AB)^{i-1}\right](A\varphi-f) \]
belongs to \(U^0\).

By convergence of the iterative process (2) in \(U^0\) we shall mean convergence of the sequence \(\{\varphi^k\}\) to the vector \(\varphi^*=A^{-1}f\), which is the unique solution of system (1), for any initial approximation \(\varphi^0\in U^0\). The operator
\[ R_p(\gamma)=I-B\left[\sum_{i=1}^{p}\gamma_i(AB)^{i-1}\right]A \]
(\(\gamma_i\in G\), \(i=1,\ldots,p\); \(I\) is the identity matrix) is called the transition operator of the iterative process (2). We introduce in \(U_A\) the norm \(\|\varphi\|_D=(D\varphi,\varphi)^{1/2}\), where \(D=D^* \underset{U_A}{>}0\) is some matrix, and in the space \(U=\{\psi:\psi=\varphi-\varphi^*,\ \varphi\in U^0\}\) the norm \(\|\psi\|_{DA}=\|A\psi\|_D\).

* Here and below, any matrix whose dimensions are not specified is assumed to be square of order \(n\).

For given matrices \(A\), \(B\), and \(D\), the problem of optimizing the iterative process (2) in the space \(U\) with the norm \(\|\ \|_{DA}\) consists in finding functionals \(\overset{0}{\gamma_i}\in G\) \((i=1,\ldots,p)\) such that

\[ \left\| R_p(\overset{0}{\gamma}) \right\|_{DA}^{U} = \inf_{\substack{\gamma_i \in G\\ 1 \le i \le p}} \left\| R_p(\gamma) \right\|_{DA}^{U}, \tag{3} \]

where

\[ \left\| R_p(\gamma) \right\|_{DA}^{U} = \sup_{\psi \in U} \left( \left\| R_p(\gamma)\psi \right\|_{DA}/\|\psi\|_{DA} \right). \]

It is not difficult to see that the problem of constructing such functionals is equivalent to finding, for any \(\xi \in U_A\), a vector \(\bar{\gamma}=\gamma(\xi)\in V_p\) for which

\[ J_D(\bar{\gamma},\xi)=\inf_{\gamma\in V_p} J_D(\gamma,\xi). \tag{4} \]

Here

\[ J_D(\gamma,\xi)= \left\| \left( I-\sum_{i=1}^{p}\gamma_i S^i \right)\xi \right\|_D \]

and \(S=AB\). Problem (4) is solvable for any \(\xi\in U_A\) \((\xi\ne \Theta)\), and the set of its solutions coincides with the set of solutions of the system

\[ C^{(p)}\gamma=b^{(p)}, \tag{5} \]

where \(C^{(m)}\) is an \(m\times m\) matrix with elements

\[ c_{ij}=(DS^i\xi,S^j\xi)\quad (1\le i,j\le m) \]

and \(b^{(m)}\in V_m\) is the vector with components

\[ b_i=(DS^i\xi,\xi)\quad (i=1,\ldots,m;\ m=1,\ldots,p), \]

which is always consistent.
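The construction of system (5) and one step of process (2) can be sketched as follows. This is only an illustrative sketch: the function name `optimal_step`, the random test matrices, and the use of a least-squares solver (which returns one solution of (5) even when \(C^{(p)}\) is singular) are assumptions, and \(D\) is taken symmetric as in the text.

```python
import numpy as np

def optimal_step(A, B, D, f, phi, p):
    """One step of process (2) with gamma taken as a solution of system (5)."""
    xi = A @ phi - f                       # residual xi^k
    S = A @ B
    K = np.empty((len(xi), p))             # columns S xi, S^2 xi, ..., S^p xi
    v = xi
    for i in range(p):
        v = S @ v
        K[:, i] = v
    C = K.T @ D @ K                        # c_ij = (D S^i xi, S^j xi)
    b = K.T @ D @ xi                       # b_i  = (D S^i xi, xi)
    # Assumption: lstsq is used so that a solution is returned even if C is singular.
    gamma = np.linalg.lstsq(C, b, rcond=None)[0]
    corr = np.zeros_like(phi)
    w = xi
    for i in range(p):                     # sum_i gamma_i B S^{i-1} xi
        corr += gamma[i] * (B @ w)
        w = S @ w
    return phi - corr, gamma

# Hypothetical test problem: symmetric positive definite A, B = D = I.
rng = np.random.default_rng(1)
n, p = 5, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
B = np.eye(n)
D = np.eye(n)
f = rng.standard_normal(n)
phi0 = np.zeros(n)
phi1, gamma = optimal_step(A, B, D, f, phi0, p)
xi0 = A @ phi0 - f
xi1 = A @ phi1 - f
```

For this test problem the new residual `xi1` is orthogonal in the \(D\) scalar product to each \(S^i\xi^0\), which is exactly the content of system (5).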

Since the matrix \(C^{(p)}\) may be singular, problem (3) has, in general, a nonunique solution. Let \(Q_m\) be the set of vectors \(\xi\in U_A\) for which the vectors \(\{S^i\xi\}_{i=1}^{m}\) are linearly independent and

\[ S^{m+1}\xi=\sum_{i=1}^{m}a_iS^i\xi \quad (m=1,\ldots,p-1) \]

and

\[ Q_p=\{\xi:\ \xi\notin Q_m\ (m=1,\ldots,p-1),\ \xi\in U_A,\ \xi\ne\Theta\}. \]

Then the vector \(\bar{\gamma}\) with components

\[ \bar{\gamma}_i=\sum_{j=1}^{p}\tilde c_{ij} b_j^{(p)} \quad (i=1,\ldots,p), \]

where \(\tilde c_{ij}\) are the elements of the \(p\times p\) matrix \(\widetilde C\), defined by the relation

\[ \widetilde C= \begin{pmatrix} [C^{(m)}]^{-1} & 0\\ 0 & 0 \end{pmatrix}, \]

if \(\xi\in Q_m\) \((1\le m\le p)\), is a solution of system (5).
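The block structure of \(\widetilde C\) can be illustrated on a small example in which \(\xi\in Q_2\) while \(p=3\). The sketch below is hypothetical in its details: the name `gamma_bar`, the rank tolerance, and the diagonal test matrix are assumptions, and the index \(m\) is found numerically as the rank of the matrix with columns \(S^i\xi\).

```python
import numpy as np

def gamma_bar(S, D, xi, p, tol=1e-10):
    """Solution of system (5) built from the zero-padded inverse of C^(m)."""
    K = np.column_stack(
        [np.linalg.matrix_power(S, i) @ xi for i in range(1, p + 1)]
    )
    # Assumption: m is detected as the numerical rank of [S xi, ..., S^p xi].
    m = int(np.linalg.matrix_rank(K, tol=tol))
    C = K.T @ D @ K
    b = K.T @ D @ xi
    C_tilde = np.zeros((p, p))
    C_tilde[:m, :m] = np.linalg.inv(C[:m, :m])   # [C^(m)]^{-1} in the leading block
    return C_tilde @ b, m

# S xi, S^2 xi, S^3 xi all lie in a two-dimensional subspace, so m = 2.
S = np.diag([1.0, 2.0, 3.0])
D = np.eye(3)
xi = np.array([1.0, 1.0, 0.0])
g, m = gamma_bar(S, D, xi, p=3)
residual = xi - sum(g[i] * (np.linalg.matrix_power(S, i + 1) @ xi) for i in range(3))
```

Here \(C^{(2)}\) has rows \((5,\,9)\) and \((9,\,17)\) and \(b^{(2)}=(3,\,5)\), so the nonzero components of the solution are \(3/2\) and \(-1/2\), and the residual \(\bigl(I-\sum_i\bar\gamma_iS^i\bigr)\xi\) vanishes.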

Thus, defining the values of the functionals \(\overset{0}{\gamma_i}\) by the relations

\[ \overset{0}{\gamma_i}(\xi)=\bar{\gamma}_i(\xi) \]

for all \(\xi\in U_A\) \((\xi\ne\Theta)\) and

\[ \overset{0}{\gamma_i}(\xi)=0 \]

for all \(\xi\notin U_A\) and for \(\xi=\Theta\) \((i=1,\ldots,p)\), we obtain a solution of problem (3). Process (2) with functionals \(\overset{0}{\gamma_i}\) \((i=1,\ldots,p)\) constructed in this way will be called optimal.

It is not difficult to show that \(J_D(\gamma,\xi)\) is a continuous functional on the set

\[ Q_A=\{\psi:\ \psi=\varphi/\|\varphi\|,\ \varphi\in U_A\} \]

and that \(J_D(\overset{0}{\gamma},\xi)<1\) \((\xi\in U_A,\ \xi\ne\Theta)\) if and only if

\[ \max_{1\le i\le p} |(DS^i\xi,\xi)|>0. \]

Then, since

\[ \left\|R_p(\overset{0}{\gamma})\right\|_{DA}^{U} = \sup_{\psi\in Q_A} J_D(\overset{0}{\gamma},\psi), \]

the following is valid.

Theorem 1. Let

\[ \alpha_D(S)= \inf_{\psi\in Q_A}\left[\max_{1\le i\le p}|(DS^i\psi,\psi)|\right]. \]

Then \(\alpha_D(S)>0\) is a necessary and sufficient condition for convergence in \(U^0\) of the optimal iterative process (2); moreover, if \(\alpha_D(S)>0\), then

\[ \left\|R_p(\overset{0}{\gamma})\right\|_{DA}^{U}<1. \]

Concerning the implementation of the optimal iterative process (2), let us note that if \(\xi^k \in Q_p\) for some \(k \ge 0\), then the matrix \(C^{(p)}\) is nonsingular and \(\gamma(\xi^k)=[C^{(p)}]^{-1}b^{(p)}\); if instead \(\xi^k \notin Q_p\) \((\xi^k \in U_A,\ \xi^k \ne \Theta)\) and \(\widetilde{\gamma}\) is some solution of system (5), then

\[ \varphi^*=\varphi^k-B\left[\sum_{i=1}^{p}\widetilde{\gamma}_i S^{i-1}\right]\xi^k, \]

and the iterative process terminates, converging to the exact solution in a finite number of steps.

Since for \(p_2>p_1\ge 1\) we have
\[ \|R_{p_2}(\overset{0}{\gamma})\|_{DA}^{U}\le \|R_{p_1}(\overset{0}{\gamma})\|_{DA}^{U}, \]
it follows that

Lemma 1. If the \(p_1\)-step optimal process (2) converges in \(U^0\), then the \(p_2\)-step process also converges for any \(p_2>p_1\).

Let \(U_A \perp \Phi_D\), where \(\Phi_D\) is the null space of the matrix \(D\) (this, for example, always holds if \(D\) is nonsingular). Then \(\xi=[D^+]^{1/2}D^{1/2}\xi\) and
\[ (DS^i\xi,\xi)=(\Lambda^i\psi,\psi)\quad (i=1,\ldots,p) \]
for any \(\xi\in U_A\). Here \([D^+]^{1/2}\) is the pseudoinverse of the matrix \(D^{1/2}\) (see [2]), \(\psi=D^{1/2}\xi\), and \(\Lambda=D^{1/2}AB[D^+]^{1/2}\). Since in this case

\[ \alpha_D(S)=\inf_{\psi\in U_D}\left[\max_{1\le i\le p}|(\Lambda^i\psi,\psi)|/(\psi,\psi)\right], \]

where \(U_D=\{\psi:\psi=D^{1/2}\varphi,\ \varphi\in U_A\}\), we have

Lemma 2. \(\Lambda^{i_0}>0\) for some \(1\le i_0\le p\) is a sufficient condition for convergence in \(U^0\) of the optimal iterative process (2).

Let us consider some special cases.

  1. Let \(U_A\perp \Phi_D\) and \(\Lambda=\Lambda^*\). Then for any \(p\ge 1\) the condition \(U_D \cap \Phi_\Lambda=\{\Theta\}\), where \(\Phi_\Lambda\) is the null space of the matrix \(\Lambda\), is necessary for convergence in \(U^0\) of the optimal iterative process (2). This condition is also sufficient for its convergence in \(U^0\) for any \(p\ge 2\) in the case of an arbitrary Hermitian matrix \(\Lambda\), and for any \(p\ge 1\) when, in addition,

\[ \Lambda>0. \]

If \(\Lambda>0\) and \(U_D\perp \Phi_\Lambda\), then for the norm of the operator \(R_p(\overset{0}{\gamma})\) in \(U\) the estimate

\[ \|R_p(\overset{0}{\gamma})\|_{DA}^{U}\le 1/|T_p(\tau_0)| \tag{6} \]

is valid, where
\[ T_p(\tau)=\left[(\tau+\sqrt{\tau^2-1})^p+(\tau-\sqrt{\tau^2-1})^p\right]/2 \]
is the Chebyshev polynomial of degree \(p\), \(\tau_0=(M+m)/(M-m)\), \(M=\rho(\Lambda)\), and \(m=1/\rho(\Lambda^+)\). For the case of an arbitrary Hermitian matrix \(\Lambda\), \(U_D\perp\Phi_\Lambda\), and \(p\ge 2\), the estimate has the form

\[ \|R_p(\overset{0}{\gamma})\|_{DA}^{U}\le 1/|T_{\bar p}(\bar\tau_0)|. \]

Here \(\bar p=[p/2]\) and \(\bar\tau_0=(\bar M+\bar m)/(\bar M-\bar m)\), where \(\bar M=\rho(\Lambda^2)\) and \(\bar m=1/\rho([\Lambda^+]^2)\). If \(A=A^*>0\), \(B=I\), and \(D=A^{-1}\), then we arrive at the \(p\)-step method of steepest descent proposed by L. V. Kantorovich [7], and estimate (6) is given, for example, in the monograph [1].
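The right-hand side of estimate (6) is easy to evaluate numerically. The sketch below uses the closed form of \(T_p\) quoted above; the function names `chebyshev_T` and `bound_6` and the sample spectral bounds are hypothetical.

```python
import numpy as np

def chebyshev_T(p, tau):
    """T_p(tau) for tau >= 1, from the closed form used in estimate (6)."""
    r = np.sqrt(tau * tau - 1.0)
    return 0.5 * ((tau + r) ** p + (tau - r) ** p)

def bound_6(p, m, M):
    """Right-hand side 1/|T_p(tau_0)| of (6), with tau_0 = (M + m)/(M - m)."""
    tau0 = (M + m) / (M - m)
    return 1.0 / abs(chebyshev_T(p, tau0))

# Hypothetical spectral bounds m = 1, M = 3, for which tau_0 = 2.
bounds = [bound_6(p, 1.0, 3.0) for p in (1, 2, 3)]
```

With these sample values \(T_1(2)=2\), \(T_2(2)=7\), and \(T_3(2)=26\), so the bound decreases from \(1/2\) to \(1/7\) to \(1/26\) as \(p\) grows.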

Under the assumptions made at the beginning of this section, one step of the optimal iterative process (2) can be implemented by the following modification of the conjugate-gradient method (see [1, 6]). Suppose that, for some \(k\ge 0\), the vectors \(\varphi^k\) and \(\xi^k=A\varphi^k-f\) are given. Then the vector \(\varphi^{k+1}\) is obtained from the formulas

\[ \varphi^{k+i/p}=\varphi^{k+(i-1)/p}-\gamma_i r_{i-1}^k,\qquad \gamma_i=\frac{(D(A\varphi^{k+(i-1)/p}-f),\,Ar_{i-1}^k)}{(DAr_{i-1}^k,\,Ar_{i-1}^k)} \qquad (i=1,\ldots,p), \]

where

\[ r_0^k=B\xi^k,\qquad r_i^k=B(A\varphi^{k+i/p}-f)-\beta_i r_{i-1}^k, \]

\[ \beta_i=\frac{(DAB(A\varphi^{k+i/p}-f),\,Ar_{i-1}^k)}{(DAr_{i-1}^k,\,Ar_{i-1}^k)},\qquad (i=1,\ldots,p-1). \]

The latter is not difficult to show, since
\[ (DAr_i^k,Ar_j^k)=0\quad \text{for } i\ne j \qquad (Ar_i^k,\ Ar_j^k\in U_A,\ 1\le i,j\le p). \]
The implementation scheme proposed above ... makes it possible to substantially increase the efficiency of the iterative processes under consideration when they are applied in practice.
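The recurrences for \(\gamma_i\), \(\beta_i\), and \(r_i^k\) can be sketched in code and compared with a direct minimization over \(\gamma\). The sketch is illustrative only: the function name `inner_cg_step`, the random symmetric positive definite \(A\), and the choice \(B=D=I\) (so that \(\Lambda=A\) is symmetric) are assumptions, and no safeguard is included for a vanishing denominator.

```python
import numpy as np

def inner_cg_step(A, B, D, f, phi, p):
    """p inner steps of the recurrences above; returns phi^{k+1}."""
    r = B @ (A @ phi - f)                      # r_0^k = B xi^k
    for i in range(1, p + 1):
        g = A @ phi - f                        # residual at phi^{k+(i-1)/p}
        Ar = A @ r
        den = Ar @ (D @ Ar)                    # (D A r, A r)
        gamma = (Ar @ (D @ g)) / den           # gamma_i
        phi = phi - gamma * r                  # phi^{k+i/p}
        if i < p:
            g_new = A @ phi - f
            beta = (Ar @ (D @ (A @ (B @ g_new)))) / den   # beta_i
            r = B @ g_new - beta * r           # r_i^k
    return phi

# Hypothetical test problem with Lambda = A symmetric (B = D = I).
rng = np.random.default_rng(0)
n, p = 6, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
B = np.eye(n)
D = np.eye(n)
f = rng.standard_normal(n)
phi0 = np.zeros(n)
xi0 = A @ phi0 - f
xi_cg = A @ inner_cg_step(A, B, D, f, phi0, p) - f

# Reference: minimize ||xi0 - sum_i gamma_i S^i xi0|| directly (D = I here).
S = A @ B
K = np.column_stack([np.linalg.matrix_power(S, i) @ xi0 for i in range(1, p + 1)])
gamma_ls = np.linalg.lstsq(K, xi0, rcond=None)[0]
xi_ls = xi0 - K @ gamma_ls
```

In this test the residual produced by the recurrences agrees with the least-squares residual to rounding error, while the recurrences never form the matrix \(C^{(p)}\) explicitly.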

Example. Let \(A=\widetilde B-C,\ A \underset{V_n}{>} 0\), and \(U_A=\{\psi:\psi=C\varphi,\ \varphi\in V_n\}\), where

\[ \widetilde B= \begin{pmatrix} F_1 & 0\\ -F_2 & F_1^* \end{pmatrix} \underset{V_n}{>}0, \qquad C= \begin{pmatrix} L & L\\ L & L \end{pmatrix} =C^* \underset{V_n}{\geq} 0, \]

and \(F_1\), \(F_2=F_2^*\), and \(L=L^*\) are \(n/2\times n/2\) matrices. (Systems of equations with such matrices arise in the numerical solution of kinetic equations [4].) Choose for process (2) \(B=\widetilde B^{-1}\) and \(D=C^+\). Since \(U_A\perp \Phi_D\) and \(\rho(BC)=\|BC\|_{DA}^{U}<1\), it is easy to see that \(\Lambda=D^{1/2}AB[D^+]^{1/2}=\Lambda^* \underset{U_D}{>}0\), and, consequently, the optimal iterative process (2) converges in

\[ U^0=\{\varphi:\varphi=A^{-1}\psi+A^{-1}f,\ \psi\in U_A\} \]

for any \(p\geq 1\). Note that for any \(f\in V_n\) the vector \(Bf\in U^0\), since \(ABf-f=-CBf\in U_A\). Using (6), it is not difficult to show that

\[ \lim_{\rho(BC)\to 1}\frac{\ln \rho(BC)}{\ln \left\|R_p(\overset{0}{\gamma})\right\|_{DA}^{U}} \leq \frac{1}{2p^2} \]

for any fixed \(p\geq 1\). In other words, in this case asymptotically one step of the optimal iterative process (2) is at least equivalent to \(2p^2\) steps of the simple iteration method

\[ \widetilde B\varphi^{k+1}=C\varphi^k+f,\qquad \varphi^k\in V_n\quad (k=0,1,\ldots). \]
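The asymptotic factor \(2p^2\) can be checked numerically from the Chebyshev bound (6). The sketch below rests on an assumption that is not stated in the text: the spectral bounds of \(\Lambda\) are taken to be \(M=1\) and \(m=1-\rho(BC)\), so that \(\tau_0=(2-\rho)/\rho\). The function names are hypothetical.

```python
import numpy as np

def chebyshev_T(p, tau):
    """T_p(tau) for tau >= 1, closed form as in estimate (6)."""
    r = np.sqrt(tau * tau - 1.0)
    return 0.5 * ((tau + r) ** p + (tau - r) ** p)

def log_ratio(rho, p):
    """ln(rho) / ln(bound), where bound = 1/T_p(tau_0) is the right side of (6).

    Assumption (illustration only): M = 1 and m = 1 - rho,
    hence tau_0 = (M + m)/(M - m) = (2 - rho)/rho.
    """
    tau0 = (2.0 - rho) / rho
    bound = 1.0 / chebyshev_T(p, tau0)
    return np.log(rho) / np.log(bound)

# Ratios for rho close to 1; compare with 1/(2 p^2).
ratios = {p: log_ratio(0.9999, p) for p in (1, 2, 3)}
```

Under this assumption the ratio approaches \(1/(2p^2)\) as \(\rho\to1\); since the actual norm does not exceed the bound, the ratio formed with the norm itself is no larger.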

Let us note that, to implement the iterative process (2), knowledge of the matrix \(C^+\) is not necessary if the vector \(Bf\) is chosen as the initial one.

  2. Let \(B_\tau=\prod_{i=1}^{l}(I+\tau B_i)\) be a nonsingular matrix, \(\tau\) a real parameter, \(B=B_\tau^{-1}\), and \(D=I\). Iterative processes with matrices \(B\) of this form arise when solving system (1) by the splitting method (see [4, 5]).

Lemma 3. If \(A \underset{V_n}{>} 0\), then there exists \(\tau_0>0\) such that for any \(\tau\in[0,\tau_0]\) the matrix \(B_\tau\) is nonsingular and the optimal iterative process (2) converges in \(V_n\) for all \(p\geq 1\).

Suppose that \(l=2\), \(A=B_1+B_2\), \(B_1 \underset{V_n}{>}0\), \(B_2 \underset{V_n}{>}0\), and choose

\[ D=[(I+\tau B_1)^{-1}]^*(I+\tau B_1)^{-1}. \]

Lemma 4. Under the assumptions made above, the optimal iterative process (2) converges in \(V_n\) for any \(\tau>0\) and \(p\geq 1\).
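The setting of Lemma 4 can be tried out numerically for \(p=1\). The following sketch is an illustration under stated assumptions, not a proof: the matrices \(B_1\), \(B_2\) are hypothetical random nonsymmetric matrices with positive definite symmetric part, and \(\tau=0.5\), the dimension, and the iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau = 4, 0.5

def random_pd(n):
    """Nonsymmetric matrix with (B phi, phi) > 0 for phi != 0 (test data only)."""
    M = rng.standard_normal((n, n))
    K = rng.standard_normal((n, n))
    return M @ M.T + np.eye(n) + (K - K.T)

B1, B2 = random_pd(n), random_pd(n)
A = B1 + B2
I = np.eye(n)
E1 = I + tau * B1
B = np.linalg.inv(E1 @ (I + tau * B2))     # B = B_tau^{-1}, l = 2
E1inv = np.linalg.inv(E1)
D = E1inv.T @ E1inv                        # D = [(I + tau B1)^{-1}]^* (I + tau B1)^{-1}
S = A @ B

f = rng.standard_normal(n)
phi = np.zeros(n)
hist = []                                  # D-norms of the residuals
for k in range(20):
    xi = A @ phi - f
    hist.append(float(np.sqrt(xi @ D @ xi)))
    if hist[-1] < 1e-14:
        break
    Sxi = S @ xi
    gamma = (Sxi @ D @ xi) / (Sxi @ D @ Sxi)   # system (5) for p = 1
    phi = phi - gamma * (B @ xi)
```

In this experiment the \(D\)-norm of the residual does not increase from step to step, in agreement with the optimality of \(\gamma\) at each step.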

In conclusion, the authors express their gratitude to S. G. Godunov and V. I. Lebedev for valuable comments.

Computing Center of the Siberian Branch of the Academy of Sciences of the USSR

Received 27 V 1968

REFERENCES

  1. D. K. Faddeev, V. N. Faddeeva, Computational Methods of Linear Algebra, Moscow, 1963.
  2. F. R. Gantmacher, The Theory of Matrices, Moscow, 1966.
  3. G. I. Marchuk, N. N. Yanenko, Proceedings of the Symposium on Applied and Computational Mathematics, Novosibirsk, 1965.
  4. G. I. Marchuk, Uspekhi Mat. Nauk, 161, No. 1, 66 (1965).
  5. A. A. Samarskii, Zh. Vychisl. Mat. i Mat. Fiz., 6, No. 6, 665 (1966).
  6. J. W. Daniel, SIAM J. Numer. Anal., 4, No. 1, 10 (1967).
  7. L. V. Kantorovich, Uspekhi Mat. Nauk, 3, No. 2, 89 (1948).
