MATHEMATICS
N. P. SOKOLOV
ON OPERATIONS ON MULTIDIMENSIONAL MATRICES
(Presented by Academician Yu. V. Linnik on 4 II 1965)
The operations of addition of multidimensional matrices and of multiplying them by a scalar are well known. The operation of matrix multiplication is defined analogously to Cayley’s rule for multiplication of multidimensional determinants and is extended to several pairs of indices of the matrices being multiplied, which, however, introduces certain restrictions into the notions of the identity and inverse matrices (see \((^1)\), p. 624; \((^2)\), p. 434). In the present work a broader definition is given of the operation of multiplication of multidimensional matrices* in accordance with Cayley’s and Scott’s rules (see \((^3)\), pp. 57–61) for multiplying determinants. In connection with this, a more general definition is given of the identity and inverse matrices, eliminating the restrictions mentioned above and making it possible to consider identity and inverse matrices of both even and odd numbers of dimensions. An application of these generalizations to the solution of matrix equations is indicated, in particular to finding proper matrices for any multidimensional matrix. At the same time the general Hamilton–Cayley theorem is proved and normal forms of multidimensional matrices are established.
1. Let \(p, q\) be any positive integers and let \(\varkappa, \lambda, \mu, \nu\) be nonnegative integers satisfying the conditions
\[ \varkappa+\lambda+\mu=p,\qquad \lambda+\mu+\nu=q. \tag{1} \]
Take a \(p\)-dimensional matrix of order \(n\),
\(A=\|A_{i_1\ldots i_p}\|\) \((i_1,\ldots,i_p=1,\ldots,n)\). Denoting by
\(l=(l_1,\ldots,l_{\varkappa})\), \(s=(s_1,\ldots,s_\lambda)\),
\(c=(c_1,\ldots,c_\mu)\) partitions of the set of indices \(i_1,\ldots,i_p\) of the matrix \(A\), we can represent this matrix in the form
\(A=\|A_{lsc}\|\). Associate with the matrix \(A\) the \(p\)-linear form
\[ F=\sum_1^n A_{l_1\ldots l_{\varkappa}s_1\ldots s_\lambda c_1\ldots c_\mu} x_{l_1}^{(1)}\ldots x_{l_{\varkappa}}^{(\varkappa)} y_{s_1}^{(1)}\ldots y_{s_\lambda}^{(\lambda)} z_{c_1}^{(1)}\ldots z_{c_\mu}^{(\mu)}, \]
and subject it to a homogeneous transformation
\[ y_{s_1}^{(1)}\ldots y_{s_\lambda}^{(\lambda)} z_{c_1}^{(1)}\ldots z_{c_\mu}^{(\mu)} = \sum_{m_1,\ldots,m_\nu=1}^{n} B_{c_1\ldots c_\mu s_1\ldots s_\lambda m_1\ldots m_\nu} Y_{s_1}^{(1)}\ldots Y_{s_\lambda}^{(\lambda)} Z_{m_1}^{(1)}\ldots Z_{m_\nu}^{(\nu)} \]
with a \(q\)-dimensional matrix of order \(n\),
\(B=\|B_{c_1\ldots c_\mu s_1\ldots s_\lambda m_1\ldots m_\nu}\|\), which, with the aid of the partitions \(c, s\), and \(m=(m_1,\ldots,m_\nu)\), we write in the form
\(B=\|B_{csm}\|\). As a result of the transformation, taking into account the equality
\(\varkappa+\lambda+\nu=p+q-\tau\), where \(\tau=\lambda+2\mu\), we obtain the \((p+q-\tau)\)-linear form
\[ \Phi=\sum_1^n C_{l_1\ldots l_{\varkappa}s_1\ldots s_\lambda m_1\ldots m_\nu} x_{l_1}^{(1)}\ldots x_{l_{\varkappa}}^{(\varkappa)} Y_{s_1}^{(1)}\ldots Y_{s_\lambda}^{(\lambda)} Z_{m_1}^{(1)}\ldots Z_{m_\nu}^{(\nu)}, \]
where
\[ C_{l_1\ldots l_{\varkappa} s_1\ldots s_\lambda m_1\ldots m_\nu} = \sum_{c_1,\ldots,c_\mu=1}^{n} A_{l_1\ldots l_{\varkappa} s_1\ldots s_\lambda c_1\ldots c_\mu} B_{c_1\ldots c_\mu s_1\ldots s_\lambda m_1\ldots m_\nu}. \]
* Matrices over the field of complex or real numbers are considered. Other number fields are possible, however.
It corresponds to a \((p+q-\tau)\)-dimensional matrix of order \(n\)
\[ C=\|C_{lsm}\|,\qquad C_{lsm}=\sum_{(c)} A_{lsc} B_{csm}. \]
We shall call the matrix \(C\) the \((\lambda,\mu)\)-contracted product of the matrix \(A\) by \(B\). Denoting it by the symbol \({}^{\lambda,\mu}(AB)\), we obtain
\[ {}^{\lambda,\mu}(AB)=\left\|\sum_{(c)} A_{lsc} B_{csm}\right\|. \tag{2} \]
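As an illustration of formula (2), the following minimal Python sketch computes the \((\lambda,\mu)\)-contracted product directly from the definition. The dictionary representation (0-based index tuples mapped to elements) and the function name `contracted_product` are assumptions made for this example only; they do not come from the paper.

```python
from itertools import product


def contracted_product(A, B, p, q, n, lam, mu):
    """(lam, mu)-contracted product of A (p-dimensional) by B (q-dimensional).

    A is keyed by index tuples l + s + c and B by c + s + m (0-based),
    following the partitions used in formula (2).
    """
    kappa = p - lam - mu  # number of free indices l of A
    nu = q - lam - mu     # number of free indices m of B
    if kappa < 0 or nu < 0:
        raise ValueError("lam + mu must not exceed p or q")

    def tuples(k):
        # All index tuples of length k; for k = 0 this is the single tuple ().
        return product(range(n), repeat=k)

    C = {}
    for l in tuples(kappa):
        for s in tuples(lam):
            for m in tuples(nu):
                C[l + s + m] = sum(A[l + s + c] * B[c + s + m] for c in tuples(mu))
    return C


# Two ordinary (2-dimensional) matrices of order 2.
A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
B = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}

# With lam = 0 and mu = 1 the sum runs over one index c: the usual matrix product.
C = contracted_product(A, B, p=2, q=2, n=2, lam=0, mu=1)
print(C)
```

For two-dimensional matrices with \(\lambda=0\), \(\mu=1\) the sketch reduces to ordinary row-by-column multiplication, which gives a quick sanity check of the index conventions.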
In what follows, the indices of the partition \(c\), over which summation is performed, are called Cayley indices, and the connecting indices of the partition \(s\) are called Scott indices (see \((^3)\), p. 60).
In the absence of Scott indices, when \(\lambda=0\) and \(\tau=2\mu\), the product (2) becomes the \((0,\mu)\)-contracted product \({}^{0,\mu}(AB)\), considered by Oldenburger \((^1)\) and by Gouarné and Samuel \((^2)\).
From the equalities (1) it follows that \(0\le \tau\le p+q\). If \(\tau=0\), then \(\lambda=0\), \(\mu=0\), and we have the product \({}^{0,0}(AB)\) without contraction.* If \(\tau=p+q\), then \(\lambda=0\), \(\mu=p=q\), and we have a completely contracted product \({}^{0,p}(AB)\) in the form of a scalar.
Let us also note the case when \(\tau=p=q\) and \({}^{\lambda,\mu}(AB)\) is represented by a \(p\)-dimensional matrix. If in addition \(\mu=0\), then we obtain the \((p,0)\)-contracted product \({}^{p,0}(AB)\).**
Multiplication of matrices is associative provided that the contracted sets of Cayley indices intersect neither one another nor the sets of Scott indices.
Remark. In defining the \((\lambda,\mu)\)-contracted product of two matrices, in order to simplify the notation the contraction was carried out with respect to the extreme indices of these matrices. With equal success, however, contraction may be carried out with respect to any indices.
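The special cases of item 1 (\(\tau=0\), \(\tau=p+q\), and \(\tau=p=q\) with \(\mu=0\)) can be checked numerically with a short sketch. The dictionary representation and the helper `contracted_product` are assumptions of this example, not notation from the paper.

```python
from itertools import product


def contracted_product(A, B, p, q, n, lam, mu):
    """Formula (2): C[l+s+m] = sum over c of A[l+s+c] * B[c+s+m]."""
    kappa, nu = p - lam - mu, q - lam - mu

    def tuples(k):
        # Index tuples of length k; a single empty tuple when k = 0.
        return product(range(n), repeat=k)

    return {
        l + s + m: sum(A[l + s + c] * B[c + s + m] for c in tuples(mu))
        for l in tuples(kappa)
        for s in tuples(lam)
        for m in tuples(nu)
    }


A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
B = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}

# tau = 0: no contraction, a 4-dimensional matrix with elements A[l] * B[m].
outer = contracted_product(A, B, 2, 2, 2, lam=0, mu=0)

# tau = p = q with mu = 0: the (p,0)-contracted product, taken element by element.
elementwise = contracted_product(A, B, 2, 2, 2, lam=2, mu=0)

# tau = p + q: complete contraction, a single element keyed by the empty tuple.
scalar = contracted_product(A, B, 2, 2, 2, lam=0, mu=2)

print(len(outer), elementwise, scalar)
```

Here the completely contracted product is stored under the empty index tuple `()`, which is one convenient way to represent a zero-dimensional result in this representation.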
2. Define, in the product (2), the matrix \(B=\|B_{j_1\ldots j_q}\|\) \((j_1,\ldots,j_q=1,\ldots,n)\) so that \({}^{\lambda,\mu}(AB)=A\). For this it is necessary that \(q=\tau\), where \(\tau\), in view of the inequality \(\lambda+\mu\le p\) following from (1), satisfies the condition \(0\le\tau\le2p\). Denote in this case the matrix \(B\) by \(E=\|E_{csm}\|\), where \(m=(m_1,\ldots,m_\mu)\).
Let, further, \(E_{csm}=\delta_{cm}\), where \(\delta_{cm}\) is the Kronecker symbol. Then
\[ {}^{\lambda,\mu}(AE)=\left\|\sum_{(c)} A_{lsc}E_{csm}\right\|=A. \tag{3} \]
The matrix \(E\), satisfying condition (3), is called the \((\lambda,\mu)\)-identity matrix and, more fully, is denoted by \(E(\lambda,\mu)\). As is easily verified, the equality \({}^{\lambda,\mu}(EA)=A\) also holds. The number of ones in the matrix \(E(\lambda,\mu)\) is \(n^{\lambda+\mu}\). In particular, in the matrix \(E(\tau,0)\) every element is equal to 1.
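The properties of \(E(\lambda,\mu)\) stated above can be verified on a small case. The sketch below builds \(E(\lambda,\mu)\) from \(E_{csm}=\delta_{cm}\) and checks relation (3) for a 3-dimensional matrix of order 2 with \(\lambda=\mu=1\). The helper names `index_tuples`, `contracted_product`, and `identity`, and the sample matrix, are illustrative assumptions.

```python
from itertools import product


def index_tuples(n, k):
    """All 0-based index tuples of length k; a single empty tuple when k = 0."""
    return product(range(n), repeat=k)


def contracted_product(A, B, p, q, n, lam, mu):
    """Formula (2): C[l+s+m] = sum over c of A[l+s+c] * B[c+s+m]."""
    kappa, nu = p - lam - mu, q - lam - mu
    return {
        l + s + m: sum(A[l + s + c] * B[c + s + m] for c in index_tuples(n, mu))
        for l in index_tuples(n, kappa)
        for s in index_tuples(n, lam)
        for m in index_tuples(n, nu)
    }


def identity(n, lam, mu):
    """E(lam, mu): element E[c+s+m] is 1 when c == m and 0 otherwise."""
    return {
        c + s + m: 1 if c == m else 0
        for c in index_tuples(n, mu)
        for s in index_tuples(n, lam)
        for m in index_tuples(n, mu)
    }


# A 3-dimensional matrix of order 2 with pairwise distinct elements.
A = {idx: 4 * idx[0] + 2 * idx[1] + idx[2] + 1 for idx in index_tuples(2, 3)}

E = identity(2, lam=1, mu=1)          # tau = 3, so E is 3-dimensional
AE = contracted_product(A, E, 3, 3, 2, lam=1, mu=1)
EA = contracted_product(E, A, 3, 3, 2, lam=1, mu=1)
ones = sum(E.values())                # expected n**(lam + mu)
all_ones = identity(2, lam=3, mu=0)   # E(tau, 0) with tau = 3

print(AE == A, EA == A, ones)
```

The case `identity(2, lam=3, mu=0)` has no summation indices at all, so the Kronecker symbol compares two empty tuples and every element equals 1.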
3. A matrix of order \(n\), \(A^{-1}\), is called the \((\lambda,\mu)\)-inverse matrix for \(A\) relative to \(E(\Lambda,\mathrm{M})\), if it satisfies the condition
\[ {}^{\lambda,\mu}(AA^{-1})=E(\Lambda,\mathrm{M}), \]
where \(\lambda+2\mu=\tau\), \(\Lambda+2\mathrm{M}=T\), and each of the sums \(\lambda+\mu\), \(\Lambda+\mathrm{M}\) does not exceed \(p\). More fully this matrix is written in the form \(A^{-1}(\lambda,\mu)\). The number of its dimensions is \(q=T+\tau-p\).
* Or \(A\times B\), according to H. Weyl’s notation for square matrices.
** Or \(A\circ B\), according to Hadamard’s notation for square matrices.
The existence of the matrix \(A^{-1}(\lambda,\mu)\) is conditioned by the existence of solutions of the system of \(n^T\) linear equations with \(n^{T+\tau-p}\) unknowns
\[ \sum_{(c)} A_{lsc}X_{csk}=E_{lsk}, \tag{4} \]
where the \(T\) indices of the aggregate \(lsk\) form the aggregate of indices \(c_1\ldots c_{\mathrm M}s_1\ldots s_\Lambda m_1\ldots m_{\mathrm M}\) of the matrix \(E(\Lambda,\mathrm M)\). The number of equations of the system (4) is equal to the number of unknowns if \(\tau=p\). Then \(q=T\), with \(p/2\le T\le 2p\). In this case the partitions \(l\) and \(c\) contain the same number \(\mu\) of indices. Denote by \(N_1,\ldots,N_{n^\mu}\) the aggregates, arranged in normal order, of \(\mu\) values of the indices of each of these partitions, and by \(L_1,\ldots,L_{n^\lambda}\) the aggregates of \(\lambda\) values of the indices of the partition \(s\). In the aggregate of indices \(k=(k_1,\ldots,k_\nu)\), the integer \(\nu\) satisfies the condition \(\lambda+\mu+\nu=T\). The determinant of the system of equations (4) will then be equal to \(\Delta^{n^\nu}(\lambda,\mu)\) (for each of the \(n^\nu\) values of \(k\) the system splits into the same \(n^\lambda\) independent blocks), where
\[ \Delta(\lambda,\mu)=\prod_{h=1}^{n^\lambda}\left|A_{\rho_0L_h\rho_h}\right| \quad (\rho_0,\rho_h=N_1,\ldots,N_{n^\mu}). \]
If \(\Delta(\lambda,\mu)\ne0\), then the system of equations (4) has a unique solution, determined by Cramer’s formulas. It gives the sought elements \(X_{csk}\) of the matrix \(A^{-1}(\lambda,\mu)\).
The matrix \(A^{-1}(\lambda,\mu)\) considered above is the right \((\lambda,\mu)\)-inverse matrix for \(A\) relative to \(E(\Lambda,\mathrm M)\). A left matrix of the same kind is defined analogously and, in the general case, obviously differs from the right one.
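A minimal sketch of the construction of item 3, restricted to one assumed case: a 3-dimensional matrix with \(\lambda=\mu=1\) (so \(\tau=p=3\)) and the inverse taken relative to \(E(1,1)\). For each fixed \(s\), system (4) then reads \(\sum_c A_{lsc}X_{csk}=\delta_{lk}\), so each section of \(X\) is the ordinary inverse of the corresponding section of \(A\). The function names and the sample matrix are hypothetical; exact arithmetic with `Fraction` is a choice of this example.

```python
from fractions import Fraction
from itertools import product


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return Fraction(1)
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def cramer_inverse(M):
    """Inverse of a square block via cofactors (Cramer's formulas)."""
    size, d = len(M), det(M)
    if d == 0:
        raise ValueError("singular section: Delta(lam, mu) = 0")

    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

    return [[(-1) ** (i + j) * det(minor(j, i)) / d for j in range(size)]
            for i in range(size)]


def right_inverse(A, n):
    """Right (1,1)-inverse of a 3-dimensional A relative to E(1,1), and Delta(1,1)."""
    X, delta = {}, Fraction(1)
    for s in range(n):
        block = [[Fraction(A[(l, s, c)]) for c in range(n)] for l in range(n)]
        delta *= det(block)           # Delta(1,1) is the product of section determinants
        inv = cramer_inverse(block)
        for c, k in product(range(n), repeat=2):
            X[(c, s, k)] = inv[c][k]
    return X, delta


n = 2
# Sections of orientation (s): [[1, 2], [3, 4]] for s = 0 and [[2, 0], [1, 1]] for s = 1.
A = {(0, 0, 0): 1, (0, 0, 1): 2, (1, 0, 0): 3, (1, 0, 1): 4,
     (0, 1, 0): 2, (0, 1, 1): 0, (1, 1, 0): 1, (1, 1, 1): 1}

X, delta = right_inverse(A, n)

# Left-hand side of system (4) for every l, s, k.
check = {
    (l, s, k): sum(A[(l, s, c)] * X[(c, s, k)] for c in range(n))
    for l, s, k in product(range(n), repeat=3)
}
print(delta, check)
```

If any section is singular the sketch raises an error, which corresponds to the case \(\Delta(\lambda,\mu)=0\) where Cramer's formulas do not apply.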
4. Consider the matrix equation
\[ {}^{\lambda,\mu}(AX)=B, \tag{5} \]
where \(A,X,B\) are matrices of order \(n\) and of \(p,q,r\) dimensions, while \(\lambda,\mu\) are nonnegative integers whose sum does not exceed the numbers \(p,q\). From equation (5) it follows that \(q=r-p+\tau\), where \(0\le\tau=\lambda+2\mu\le2p\). Hence we obtain the inequality
\[ r\ge p-\mu, \tag{6} \]
which is a necessary condition for the existence of a solution of equation (5). Assuming this condition to be fulfilled, suppose that we know a left \((\lambda',\mu')\)-inverse matrix \(A^{-1}(\lambda',\mu')\) for \(A\) relative to \(E(\lambda,\mu)\), satisfying the condition
\[ {}^{\lambda',\mu'}(A^{-1}A)=E(\lambda,\mu). \]
Then
\[ {}^{\lambda,\mu}\!\left[{}^{\lambda',\mu'}(A^{-1}A)X\right] = {}^{\lambda,\mu}(EX)=X, \]
whence
\[ X={}^{\lambda',\mu'}(A^{-1}B). \tag{7} \]
Among all possible solutions of equation (5), represented by formula (7), we single out those which correspond to the \(1+E\!\left(\frac p2\right)\) pairs of values \(\lambda',\mu'\) satisfying the condition \(\lambda'+2\mu'=p\), where \(E\!\left(\frac p2\right)\) here denotes the integer part of \(\frac p2\) and not an identity matrix. For each such pair there can exist only one matrix \(A^{-1}(\lambda',\mu')\) appearing in formula (7), and if this matrix exists, then under condition (6) the solution under consideration will be unique.
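The index bookkeeping of item 4 can be sketched as follows: the number of dimensions \(q=r-p+\tau\) of the unknown matrix, the necessary condition (6), and the list of pairs \(\lambda',\mu'\) with \(\lambda'+2\mu'=p\). The function names are hypothetical and the sketch only counts dimensions; it does not solve equation (5).

```python
def solution_dimensions(p, r, lam, mu):
    """Number of dimensions q of X in equation (5), from q = r - p + tau."""
    if lam < 0 or mu < 0 or lam + mu > p:
        raise ValueError("lam and mu must be nonnegative with lam + mu <= p")
    if r < p - mu:
        raise ValueError("necessary condition (6) fails: r < p - mu")
    tau = lam + 2 * mu
    return r - p + tau


def inverse_index_pairs(p):
    """All pairs (lam', mu') of nonnegative integers with lam' + 2 * mu' = p."""
    return [(p - 2 * m, m) for m in range(p // 2 + 1)]


# Ordinary matrices: p = r = 2 with lam = 0, mu = 1 gives a 2-dimensional X.
q_ordinary = solution_dimensions(p=2, r=2, lam=0, mu=1)

# A 3-dimensional A with lam = mu = 1 and a 2-dimensional right-hand side.
q_spatial = solution_dimensions(p=3, r=2, lam=1, mu=1)

pairs_for_3 = inverse_index_pairs(3)
pairs_for_4 = inverse_index_pairs(4)
print(q_ordinary, q_spatial, pairs_for_3, pairs_for_4)
```

The length of each list equals one plus the integer part of \(p/2\), since \(\mu'\) may take the values \(0,1,\ldots\) up to that integer part.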
5. A matrix \(X\) of order \(n\), proportional to the \((\lambda,\mu)\)-contracted product of a \(p\)-dimensional matrix \(A\) of order \(n\) by \(X\), is called a \((\lambda,\mu)\)-proper matrix for \(A\). We may therefore write
\[ {}^{\lambda,\mu}(AX)=\alpha X, \tag{8} \]
where \(\alpha\) is a \((\lambda,\mu)\)-characteristic number of the matrix \(A\). From relation (8), denoting by \(q\) the number of dimensions of the matrix \(X\), we find that \(p+q-\tau=q\), i.e. \(p=\tau\), where \(\tau=\lambda+2\mu\), and \(q\) is an arbitrary integer not less than \(\lambda+\mu\). Relation (8) can be rewritten in the form
\[ {}^{\lambda,\mu}\!\left[(A-\alpha E(\lambda,\mu))X\right]=0, \tag{9} \]
where \(0\) is the zero \(q\)-dimensional matrix of order \(n\). The elements of the matrix \(X\) satisfying equation (9) are found by solving the system of \(n^{\lambda+\mu}\) linear homogeneous equations with \(n^{\lambda+\mu}\) unknowns \(Y_{cs}=X_{csk}\)
\[ \sum_{(c)}(A_{msc}-\alpha E_{msc})Y_{cs}=0. \tag{10} \]
Using the notation of item 3, we rewrite the system of equations (10) in the form
\[ \sum_{\sigma=N_1}^{N_{n^\mu}} (A_{\rho L_h\sigma}-\alpha\delta_{\rho\sigma})Y_{\sigma L_h}=0 \quad (\rho=N_1,\ldots,N_{n^\mu};\ h=1,\ldots,n^\lambda). \tag{11} \]
The determinant \(\Delta_{\lambda,\mu}(\alpha)\) of the system of equations (11) has the quasidiagonal form
\(\Delta_{\lambda,\mu}(\alpha)=|\{\mathfrak A_1(\alpha),\ldots,\mathfrak A_{n^\lambda}(\alpha)\}|\), where
\(\mathfrak A_h(\alpha)=\|A_{\rho L_h\sigma}-\alpha\delta_{\rho\sigma}\|\)
\((h=1,\ldots,n^\lambda)\), and is a polynomial in \(\alpha\) of degree \(n^{\lambda+\mu}\). For the existence of a nonzero solution of the system of equations (11), it is necessary and sufficient that
\[ \Delta_{\lambda,\mu}(\alpha)=0. \tag{12} \]
Equation (12) is called the \((\lambda,\mu)\)-characteristic equation of the matrix \(A\). Its roots are the \((\lambda,\mu)\)-characteristic numbers of this matrix. To each of them there corresponds a nonzero \(q\)-dimensional \((\lambda,\mu)\)-proper matrix \(X\) for \(A\), whose elements \(X_{csk}\) with the same indices \(cs\) are identical, i.e., all sections of orientation \((k)\) in the matrix \(X\) are identical.
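A small numerical check of item 5, under the assumption \(p=3\), \(\lambda=\mu=1\), \(n=2\), and \(q=2\) (no indices \(k\)). The determinant of system (11) is then the product of the determinants of the two sections \(\|A_{\rho L_h\sigma}-\alpha\delta_{\rho\sigma}\|\). The sample matrix, the trial matrix \(X\), and all function names are illustrative assumptions.

```python
from itertools import product


def contracted_product(A, B, p, q, n, lam, mu):
    """Formula (2): C[l+s+m] = sum over c of A[l+s+c] * B[c+s+m]."""
    kappa, nu = p - lam - mu, q - lam - mu

    def tuples(k):
        return product(range(n), repeat=k)

    return {
        l + s + m: sum(A[l + s + c] * B[c + s + m] for c in tuples(mu))
        for l in tuples(kappa)
        for s in tuples(lam)
        for m in tuples(nu)
    }


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def characteristic_value(A, n, alpha):
    """Value at alpha of the determinant of system (11) for lam = mu = 1."""
    value = 1
    for s in range(n):
        block = [[A[(m, s, c)] - (alpha if m == c else 0) for c in range(n)]
                 for m in range(n)]
        value *= det(block)
    return value


# Sections of orientation (s): [[2, 0], [0, 3]] for s = 0 and [[2, 1], [0, 5]] for s = 1.
A = {(0, 0, 0): 2, (0, 0, 1): 0, (1, 0, 0): 0, (1, 0, 1): 3,
     (0, 1, 0): 2, (0, 1, 1): 1, (1, 1, 0): 0, (1, 1, 1): 5}

# Trial 2-dimensional matrix X, keyed by (c, s).
X = {(0, 0): 1, (1, 0): 0, (0, 1): 1, (1, 1): 0}

alpha = 2
AX = contracted_product(A, X, p=3, q=2, n=2, lam=1, mu=1)
is_proper = AX == {key: alpha * value for key, value in X.items()}
print(characteristic_value(A, 2, alpha), characteristic_value(A, 2, 4), is_proper)
```

In this example \(\alpha=2\) annihilates the determinant while \(\alpha=4\) does not, and the trial matrix satisfies relation (8) with \(\alpha=2\).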
6. From the elements of the \(p\)-dimensional \((p=\lambda+2\mu)\) matrix of order \(n\),
\(A=\|A_{msc}\|\), using the notation of item 5, we form a square matrix of order \(n^{\lambda+\mu}\) (quasidiagonal for \(0<\lambda<p\) and diagonal for \(\lambda=p\))
\(\mathfrak A(\lambda,\mu)=\{\mathfrak A_1,\ldots,\mathfrak A_{n^\lambda}\}\), where
\(\mathfrak A_h=\|A_{\rho L_h\sigma}\|\)
\((\rho,\sigma=N_1,\ldots,N_{n^\mu};\ h=1,\ldots,n^\lambda)\).
The matrix \(\mathfrak A(\lambda,\mu)\) will be called \((\lambda,\mu)\)-associated with \(A\). If the rule for forming the \((\lambda,\mu)\)-contracted product of matrices is observed, the polynomial
\(\Delta_{\lambda,\mu}(\mathfrak A(\lambda,\mu))\), for all possible values of \(\lambda,\mu\), is a matrix \((\lambda,\mu)\)-associated with the \(p\)-dimensional matrix of order \(n\) representing the polynomial \(\Delta_{\lambda,\mu}(A)\), and since the equation
\(\Delta_{\lambda,\mu}(\alpha)=0\) is also the characteristic equation of the matrix \(\mathfrak A(\lambda,\mu)\), it follows that
\(\Delta_{\lambda,\mu}(\mathfrak A(\lambda,\mu))=0\), and therefore also
\(\Delta_{\lambda,\mu}(A)=0\). In a similar way we are convinced that
\(\Delta_{\lambda,\mu}(A_h)=0\) \((h=1,\ldots,n^\lambda)\), where
\(A_h=\|A_{mL_hc}\|\) is a \(2\mu\)-dimensional matrix representing the \(h\)-th section of orientation \((s)\) of the matrix \(A\). Thus the general Hamilton–Cayley theorem holds: every matrix
\(A=\|A_{msc}\|\) of any number of dimensions \(p=\lambda+2\mu\) and all its sections of orientation \((s)\) satisfy the \((\lambda,\mu)\)-characteristic equation of the matrix \(A\).
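The statement for the associated matrix and for the sections can be checked numerically in one assumed case: \(p=3\), \(\lambda=\mu=1\), \(n=2\), where the sections are ordinary \(2\times2\) matrices and the contracted product of sections is the usual matrix product. The sample sections and all helper names are illustrative; the sketch verifies one instance and is not a proof.

```python
def mat_mul(P, Q):
    """Ordinary product of two square matrices of the same size."""
    size = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def char_poly_2x2(M):
    """Coefficients of det(M - alpha * I) for a 2x2 block, lowest degree first."""
    trace = M[0][0] + M[1][1]
    determinant = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [determinant, -trace, 1]


def poly_at_matrix(coeffs, M):
    """Evaluate a polynomial at a square matrix by Horner's scheme."""
    size = len(M)
    result = [[0] * size for _ in range(size)]
    for coef in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(size):
            result[i][i] += coef
    return result


def block_diagonal(blocks):
    """Quasidiagonal matrix {blocks[0], blocks[1], ...}."""
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


# Sections of orientation (s) of a 3-dimensional matrix of order 2.
sections = [[[1, 2], [3, 4]], [[2, 0], [1, 1]]]

# Characteristic polynomial: product of the section polynomials.
delta = poly_mul(char_poly_2x2(sections[0]), char_poly_2x2(sections[1]))

associated = block_diagonal(sections)
at_associated = poly_at_matrix(delta, associated)
at_sections = [poly_at_matrix(delta, block) for block in sections]
print(delta, at_associated, at_sections)
```

Because the associated matrix is quasidiagonal, evaluating the polynomial at it amounts to evaluating it at each section separately, which is why both checks vanish together.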
According to the definition, the \((\lambda,\mu)\)-characteristic numbers of the matrix \(A\) are also characteristic numbers of the matrix \(\mathfrak A(\lambda,\mu)\), and if
\(J(\lambda,\mu)\) is the normal Jordan form of the matrix \(\mathfrak A(\lambda,\mu)\), then the \(p\)-dimensional matrix of order \(n\), \(I(\lambda,\mu)\), with which \(J(\lambda,\mu)\) is \((\lambda,\mu)\)-associated, will be the \((\lambda,\mu)\)-normal form of the matrix \(A\).
Kiev Technological Institute of Light Industry
Received 24 I 1965
CITED LITERATURE
1. R. Oldenburger, Ann. Math. 35, 3, 622 (1934).
2. R. Gouarné, J. Samuel, Cahiers de Phys., No. 140, 133 (1962).
3. N. P. Sokolov, Spatial Matrices and Their Applications, Moscow, 1960.