CYBERNETICS AND CONTROL THEORY
M. E. TYLKIN
ON THE HAMMING GEOMETRY OF UNIT CUBES
(Presented by Academician S. L. Sobolev on 20 V 1960)
1. We consider the problem of an isometric embedding of finite metric spaces into unit cubes with the Hamming metric ((^1)). The problem can be reformulated in terms of the algebra of logic, linear programming, the theory of self-correcting codes, and graph theory.
2. Let (A) be some finite metric space of cardinality (l). We shall specify it by its square distance matrix (\overline{A}=|\rho_{qv}|) of order (l), in which the element (\rho_{qv}) is the distance between the (q)-th and (v)-th points in our chosen (arbitrary) numbering of the points of the space (A).
Let (Z) be some set of vertices of the (n)-dimensional unit cube. We shall specify the set (Z) by a binary rectangular matrix (\widetilde{Z}=|z_{ij}|), in which the (q)-th row is the coordinate representation of the (q)-th vertex in our chosen (arbitrary) numbering of the elements of the set (Z). The number of rows in (\widetilde{Z}) is equal to the cardinality of the set (Z), and the number of columns is equal to (n). We shall call the matrix (\widetilde{Z}) the code of the set (Z). We shall call the number
[
r_{qv}=\sum_{j=1}^{n}|z_{qj}-z_{vj}|
]
the distance between the (q)-th and (v)-th rows of the matrix (\widetilde{Z}) (that is, between the corresponding elements of the set (Z))*. The set of all rows of the matrix (\widetilde{Z}) thereby becomes a finite metric space; we shall denote its distance matrix by (r(\widetilde{Z})).
We shall say that the distance matrix (\overline{A}) is realized by the code (\widetilde{Z}) if (r(\widetilde{Z})=\overline{A}). There exist nonrealizable distance matrices, for example (\overline{A}_{1}) (see the table of matrices).
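As an illustration of the definition of (r(\widetilde{Z})), the following minimal sketch computes the distance matrix of a code and tests whether it realizes a given matrix. The function names `hamming_distance_matrix` and `realizes` are hypothetical and chosen only for this example; codes are assumed to be given as lists of 0/1 rows.

```python
from typing import List, Sequence


def hamming_distance_matrix(code: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return r(Z): the matrix of pairwise Hamming distances between rows."""
    return [
        [sum(abs(a - b) for a, b in zip(row_q, row_v)) for row_v in code]
        for row_q in code
    ]


def realizes(code: Sequence[Sequence[int]], dist: Sequence[Sequence[int]]) -> bool:
    """True when the code realizes the distance matrix, i.e. r(Z) equals A."""
    return hamming_distance_matrix(code) == [list(row) for row in dist]


# First code listed for A_3 in the table of distance matrices.
A3 = [[0, 2, 2, 2], [2, 0, 2, 2], [2, 2, 0, 2], [2, 2, 2, 0]]
Z3 = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(realizes(Z3, A3))  # True
```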
In the present note the problem is posed of finding a criterion for realizability and of determining all codes that realize a given distance matrix (\overline{A}), i.e., it is required to find all solutions of the equation (r(\widetilde{Z})=\overline{A}) with respect to (\widetilde{Z}).
Some distance matrices (for example, (\overline{A}_{2})) are realized by two or more codes. We shall call two codes equivalent if one is obtained from the other by a permutation of columns and by the componentwise addition modulo 2 of some fixed binary sequence to every row. We shall call the dimension of a code the number of its columns that contain both binary digits**. Equivalent codes realize one and the same distance matrix; the distance matrix (r(\widetilde{Z})) also does not change when columns consisting only of zeros or only of ones are adjoined to the code (\widetilde{Z}) or removed from it.
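The invariance of (r(\widetilde{Z})) under the two equivalence operations can be checked on a small case. The sketch below is only an illustration: the helper names (`xor_rows`, `permute_columns`, `dimension`) are hypothetical, and the code `Z` is the second code listed for (\overline{A}_{2}) in the table.

```python
def distance_matrix(code):
    """Pairwise Hamming distances between the rows of a binary code."""
    return [[sum(a != b for a, b in zip(u, v)) for v in code] for u in code]


def xor_rows(code, mask):
    """Add the binary sequence `mask` to every row componentwise modulo 2."""
    return [[a ^ m for a, m in zip(row, mask)] for row in code]


def permute_columns(code, order):
    """Reorder columns so that new column k is old column order[k]."""
    return [[row[j] for j in order] for row in code]


def dimension(code):
    """Number of columns that contain both binary digits."""
    return sum(1 for col in zip(*code) if len(set(col)) == 2)


Z = [[1, 0, 1], [1, 1, 0]]
W = permute_columns(xor_rows(Z, [1, 0, 1]), [1, 2, 0])  # an equivalent code
print(distance_matrix(Z) == distance_matrix(W))  # True
print(dimension(Z), dimension(W))  # 2 2
```

Here the first column of `Z` is constant, so it does not count towards the dimension.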
Without loss of generality, one may consider only codes for which: a) the columns, read from bottom to top as binary representations of natural numbers, are arranged in nonincreasing order of these numbers; b) the first row consists only of zeros; c) the number of columns of the code is equal to its dimension.
* The function (r_{qv}) defines the Hamming metric on the unit cube.
** In other words, the dimension of the code (\widetilde{Z}) is the smallest dimension of a unit subcube containing the set (Z).
We shall call realizations of a given distance matrix (\overline A) only those codes realizing (\overline A) that satisfy conditions a), b), and c).
Every realizable distance matrix has only a finite number of realizations; there may be more than one of them ((\overline A_3)); in that case one may speak of the spectrum of dimensions of realizations*. The spectrum may be lacunary (for example, (\overline A_4) has realizations of dimensions 4 and 6, but has no realizations of dimension 5). Some distance matrices also have isomers—different realizations of the same dimension ((\overline A_5)).
Table of distance matrices
| Distance matrices | Realizations |
|---|---|
| (\displaystyle \overline A_1=\begin{pmatrix}0&1&1\\[2pt]1&0&1\\[2pt]1&1&0\end{pmatrix}) | none |
| (\displaystyle \overline A_2=\begin{pmatrix}0&2\\[2pt]2&0\end{pmatrix}) | (\displaystyle \begin{pmatrix}0&0\\[2pt]1&1\end{pmatrix},\quad \begin{pmatrix}1&0&1\\[2pt]1&1&0\end{pmatrix},\quad \begin{pmatrix}1&1&0&0&1\\[2pt]1&1&0&1&0\end{pmatrix},\ldots) |
| (\displaystyle \overline A_3=\begin{pmatrix}0&2&2&2\\[2pt]2&0&2&2\\[2pt]2&2&0&2\\[2pt]2&2&2&0\end{pmatrix}) | (\displaystyle \begin{pmatrix}0&0&0\\[2pt]0&1&1\\[2pt]1&0&1\\[2pt]1&1&0\end{pmatrix},\quad \begin{pmatrix}0&0&0&0\\[2pt]0&0&1&1\\[2pt]0&1&0&1\\[2pt]1&0&0&1\end{pmatrix}) |
| (\displaystyle \overline A_4=\begin{pmatrix}0&3&3&3&3\\[2pt]3&0&2&2&2\\[2pt]3&2&0&2&2\\[2pt]3&2&2&0&2\\[2pt]3&2&2&2&0\end{pmatrix}) | (\displaystyle \begin{pmatrix}0&0&0&0\\[2pt]0&1&1&1\\[2pt]1&0&1&1\\[2pt]1&1&0&1\\[2pt]1&1&1&0\end{pmatrix},\quad \begin{pmatrix}0&0&0&0&0&0\\[2pt]0&0&0&1&1&1\\[2pt]0&0&1&0&1&1\\[2pt]0&1&0&0&1&1\\[2pt]1&0&0&0&1&1\end{pmatrix}) |
| (\displaystyle \overline A_5=\begin{pmatrix}0&4&2&2&2\\[2pt]4&0&2&2&2\\[2pt]2&2&0&2&2\\[2pt]2&2&2&0&2\\[2pt]2&2&2&2&0\end{pmatrix}) | (\displaystyle \begin{pmatrix}0&0&0&0\\[2pt]1&1&1&1\\[2pt]0&0&1&1\\[2pt]0&1&0&1\\[2pt]1&0&0&1\end{pmatrix},\quad \begin{pmatrix}0&0&0&0\\[2pt]1&1&1&1\\[2pt]0&0&1&1\\[2pt]0&1&0&1\\[2pt]0&1&1&0\end{pmatrix}) |
| (\displaystyle \overline A_{10}=\begin{pmatrix}0&6&6&6&6\\[2pt]6&0&6&6&6\\[2pt]6&6&0&6&6\\[2pt]6&6&6&0&6\\[2pt]6&6&6&6&0\end{pmatrix}) | (\displaystyle \begin{pmatrix}0&0&0&0&0&0&0&0&0&0\\[2pt]0&0&0&0&1&1&1&1&1&1\\[2pt]0&1&1&1&0&0&0&1&1&1\\[2pt]1&0&1&1&0&1&1&0&0&1\\[2pt]1&1&0&1&1&0&1&0&1&0\end{pmatrix},\quad \begin{pmatrix}0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\[2pt]0&0&0&0&0&0&0&0&1&1&1&1&1&1&1\\[2pt]0&0&0&0&0&1&1&1&0&0&0&1&1&1&1\\[2pt]0&0&0&1&1&1&0&0&0&0&0&0&1&1&1\\[2pt]1&1&1&0&0&0&0&0&0&0&0&0&1&1&1\end{pmatrix}) |
| (\displaystyle \overline A_6=\begin{pmatrix}0&1&2&1&2\\[2pt]2&0&1&2&1\\[2pt]1&2&0&1&2\\[2pt]2&1&2&0&1\\[2pt]1&2&1&2&0\end{pmatrix}) | |
| (\displaystyle \overline A_7=\begin{pmatrix}0&6&2&2&2\\[2pt]6&0&2&2&2\\[2pt]2&2&0&2&2\\[2pt]2&2&2&0&2\\[2pt]2&2&2&2&0\end{pmatrix}) | |
| (\displaystyle \overline A_8=\begin{pmatrix}0&4&2&2&2&2\\[2pt]4&0&2&2&2&2\\[2pt]2&2&0&2&2&2\\[2pt]2&2&2&0&2&2\\[2pt]2&2&2&2&0&2\\[2pt]2&2&2&2&2&0\end{pmatrix}) | |
| (\displaystyle \overline A_9=\begin{pmatrix}0&2&2&2&2&3\\[2pt]2&0&2&2&2&1\\[2pt]2&2&0&2&2&1\\[2pt]2&2&2&0&2&1\\[2pt]2&2&2&2&0&1\\[2pt]3&1&1&1&1&0\end{pmatrix}) | |
* In application to the problem of optimal coding with error correction, it is important to find both bounds of the spectrum.
3. Solving the equation (r(\widetilde Z)=\overline A) consists in finding, by choosing (n), all solutions of the following system of (\frac12 l(l-1)) nonlinear equations with (nl) binary unknowns:
[
\sum_{j=1}^{n} |z_{qj}-z_{vj}|=\rho_{qv},\qquad 1\leq q<v\leq l.
\tag{1}
]
Here (l) is the cardinality of the given space (A). In principle the problem can be solved by enumerating all subsets of vertices of a unit cube of sufficiently large dimension. The enumeration can be shortened by exploiting the fact that a realization may contain identical columns. For this purpose it is convenient to pass to a new interpretation of realizations.
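The exhaustive search mentioned above can be sketched for very small cases. The function `enumerate_realizations` below is a hypothetical helper, not a procedure from the note: it assumes conditions b) and c) (zero first row, no constant columns), treats codes that differ only in column order as the same code by enumerating multisets of columns, and is practical only for small (l) and (n).

```python
from itertools import combinations_with_replacement, product


def enumerate_realizations(dist, n):
    """All codes with n columns and zero first row whose distance matrix is `dist`.

    Columns are chosen as a multiset, so column order is not distinguished.
    """
    l = len(dist)
    # A column is fixed by its entries in rows 2..l; the all-zero column is
    # excluded because it would be constant (condition c).
    columns = [c for c in product((0, 1), repeat=l - 1) if any(c)]
    found = []
    for cols in combinations_with_replacement(columns, n):
        code = [[0] * n] + [[c[q] for c in cols] for q in range(l - 1)]
        ok = all(
            sum(a != b for a, b in zip(code[q], code[v])) == dist[q][v]
            for q in range(l)
            for v in range(q + 1, l)
        )
        if ok:
            found.append(code)
    return found


A1 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
A3 = [[0, 2, 2, 2], [2, 0, 2, 2], [2, 2, 0, 2], [2, 2, 2, 0]]
print([len(enumerate_realizations(A1, n)) for n in range(1, 5)])  # [0, 0, 0, 0]
print(enumerate_realizations(A3, 3))
```

For (\overline A_3) with (n=3) the search returns the first code shown in the table.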
Let a realization (\widetilde Z) with (l) rows be given. In the sequence of (2^{l-1}-1) nonnegative integers (X(\widetilde Z)=\{x_1,\ldots,x_{2^{l-1}-1}\}), let the element (x_i) be equal to the number of columns of the realization (\widetilde Z) that are the binary notation (from bottom to top) of the number (i). We shall call (X(\widetilde Z)) the characteristic sequence of the realization (\widetilde Z). Denote by (p_j^i) the coefficient of (2^j) in the binary notation of the number (i). Denote by (m_i) the binary norm of the number (i), i.e., the sum of the digits in the binary notation of the number (i).
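A minimal sketch of the characteristic sequence follows. It assumes that row (q) of a column holds the coefficient of (2^{l-q}), which is the indexing used in the relations of Theorem 1 (so the last row is the units digit); the function names are hypothetical.

```python
def characteristic_sequence(code):
    """Return [x_1, ..., x_{2^(l-1)-1}] for a code whose first row is zero.

    x_i counts the columns whose entries, with row q weighted by 2^(l-q),
    add up to i.
    """
    l = len(code)
    if any(code[0]):
        raise ValueError("the first row must consist of zeros")
    x = [0] * (2 ** (l - 1) - 1)
    for col in zip(*code):
        # enumerate gives a 0-based row index q, so the exponent is l - 1 - q.
        i = sum(bit << (l - 1 - q) for q, bit in enumerate(col))
        if i == 0:
            raise ValueError("constant columns are not allowed")
        x[i - 1] += 1
    return x


def binary_norm(i):
    """m_i: the number of ones in the binary notation of i."""
    return bin(i).count("1")


# The two codes listed for A_3 in the table.
print(characteristic_sequence([[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]))
# [0, 0, 1, 0, 1, 1, 0]
print(characteristic_sequence([[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 0, 1]]))
# [1, 1, 0, 1, 0, 0, 1]
```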
Theorem 1. Let (\widetilde Z=|z_{ij}|) be a realization having (l) rows and (n) columns, and let (X(\widetilde Z)=\{x_1,\ldots,x_{2^{l-1}-1}\}) be its characteristic sequence.
Then the following relations hold:
[
\sum_{j=1}^{n} |z_{qj}-z_{vj}|
=
\sum_{i=1}^{2^{l-1}-1} x_i |p_{l-q}^i-p_{l-v}^i|,
\qquad
1\leq q<v\leq l,
\qquad
n=\sum_{i=1}^{2^{l-1}-1} x_i.
]
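The relations of Theorem 1 can be checked numerically on a small example. The sketch below, with hypothetical helper names, rebuilds a code from a characteristic sequence and compares the Hamming distances of its rows (left-hand side) with the weighted sums over (x_i) (right-hand side).

```python
def bit(i, j):
    """p_j^i: the coefficient of 2^j in the binary notation of i."""
    return (i >> j) & 1


def code_from_sequence(x, l):
    """Code with x_i copies of the column whose row q holds p_{l-q}^i."""
    columns = [i for i, xi in enumerate(x, start=1) for _ in range(xi)]
    return [[bit(i, l - q) for i in columns] for q in range(1, l + 1)]


def lhs(code):
    """Left-hand side: Hamming distances between the rows of the code."""
    return [[sum(abs(a - b) for a, b in zip(u, v)) for v in code] for u in code]


def rhs(x, l):
    """Right-hand side: sum over i of x_i * |p_{l-q}^i - p_{l-v}^i|."""
    return [
        [
            sum(xi * abs(bit(i, l - q) - bit(i, l - v)) for i, xi in enumerate(x, start=1))
            for v in range(1, l + 1)
        ]
        for q in range(1, l + 1)
    ]


x = [1, 1, 0, 1, 0, 0, 1]  # one column each for i = 1, 2, 4, 7
Z = code_from_sequence(x, 4)
print(Z)  # [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 0, 1]]
print(lhs(Z) == rhs(x, 4), len(Z[0]) == sum(x))  # True True
```

The code obtained here is the second code listed for (\overline A_3) in the table.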
With the help of Theorem 1 the main problem is reformulated as a particular problem of integer linear programming:
Estimating the sum of the unknowns, find all solutions of the following system of (\frac12 l(l-1)) linear equations with (2^{l-1}-1) integer nonnegative unknowns:
[
\sum_{i=1}^{2^{l-1}-1} x_i |p_{l-q}^i-p_{l-v}^i|=\rho_{qv},
\qquad
1\leq q<v\leq l.
\tag{2}
]
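For small (l), system (2) can be solved by direct enumeration. The following sketch is a brute-force illustration rather than an integer programming method. The name `solve_system_2` and the parameter `max_sum` (an assumed upper estimate for the sum of the unknowns) are hypothetical. The bound on each (x_i) uses the fact that every column with index (i\geq 1) differs from the zero first row in some row, so (x_i) cannot exceed the largest distance.

```python
from itertools import product


def solve_system_2(dist, max_sum):
    """All nonnegative integer solutions x of system (2) with sum(x) <= max_sum."""
    l = len(dist)
    m = 2 ** (l - 1) - 1
    pairs = [(q, v) for q in range(1, l + 1) for v in range(q + 1, l + 1)]
    # coeff[(q, v)][i - 1] = |p_{l-q}^i - p_{l-v}^i|
    coeff = {
        (q, v): [abs(((i >> (l - q)) & 1) - ((i >> (l - v)) & 1)) for i in range(1, m + 1)]
        for (q, v) in pairs
    }
    bound = max(max(row) for row in dist)
    solutions = []
    for x in product(range(bound + 1), repeat=m):
        if sum(x) > max_sum:
            continue
        if all(
            sum(c * xi for c, xi in zip(coeff[(q, v)], x)) == dist[q - 1][v - 1]
            for (q, v) in pairs
        ):
            solutions.append(x)
    return solutions


A1 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
A3 = [[0, 2, 2, 2], [2, 0, 2, 2], [2, 2, 0, 2], [2, 2, 2, 0]]
print(solve_system_2(A1, 10))  # []
print(solve_system_2(A3, 10))  # [(0, 0, 1, 0, 1, 1, 0), (1, 1, 0, 1, 0, 0, 1)]
```

The two solutions for (\overline A_3) have sums 3 and 4, matching the dimensions of the two codes in the table. The search grows exponentially in (2^{l-1}-1), so it is meant only for illustration.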
We shall say that the (f)-gon inequality is satisfied in a metric space (A) if, for any two disjoint subsets (\delta'=\{\delta'_1,\ldots,\delta'_{[f/2]}\}) and (\delta''=\{\delta''_1,\ldots,\delta''_{f-[f/2]}\}) of the space (A), of cardinalities ([f/2]) and (f-[f/2]), respectively, the inequality
[
K_{\delta',\delta''}
=
\sum_{\substack{1\leq q\leq [f/2]\\ 1\leq v\leq f-[f/2]}}
\rho_{\delta'_q\delta''_v}
-
\sum_{1\leq q<v\leq [f/2]}
\rho_{\delta'_q\delta'_v}
-
\sum_{1\leq q<v\leq f-[f/2]}
\rho_{\delta''_q\delta''_v}
\geq 0
]
holds.
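The definition can be turned into a direct check. The sketch below uses hypothetical function names and computes (K_{\delta',\delta''}) with each unordered pair inside (\delta') and inside (\delta'') counted once, which is the reading under which (f=3) gives the triangle inequality. Points are indexed from 0.

```python
from itertools import combinations


def k_value(dist, d1, d2):
    """K for two disjoint index sets: cross distances minus inner distances."""
    cross = sum(dist[a][b] for a in d1 for b in d2)
    inner1 = sum(dist[a][b] for a, b in combinations(d1, 2))
    inner2 = sum(dist[a][b] for a, b in combinations(d2, 2))
    return cross - inner1 - inner2


def satisfies_f_gon(dist, f):
    """True if K >= 0 for all disjoint subsets of sizes [f/2] and f - [f/2]."""
    points = range(len(dist))
    for d1 in combinations(points, f // 2):
        rest = [p for p in points if p not in d1]
        for d2 in combinations(rest, f - f // 2):
            if k_value(dist, d1, d2) < 0:
                return False
    return True


# A_7 from the table of distance matrices.
A7 = [
    [0, 6, 2, 2, 2],
    [6, 0, 2, 2, 2],
    [2, 2, 0, 2, 2],
    [2, 2, 2, 0, 2],
    [2, 2, 2, 2, 0],
]
print(satisfies_f_gon(A7, 3), satisfies_f_gon(A7, 5))  # False True
```

For (\overline A_7) the 3-gon inequality fails (for instance 6 > 2 + 2), while every 5-gon value is nonnegative.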
Theorem 2. If the given distance matrix (\overline A) is realizable, then (A) satisfies the conditions: a) all elements of (\overline A) are integers; b) the semiperimeters of all triangles in (A) are integers; c) in (A) the (f)-gon inequality is satisfied for all (f) from (f=2) to (f) equal to the cardinality of (A).
Theorem 2 is proved with the help of Theorem 1.
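Conditions a) and b) of Theorem 2 are cheap to test and already exclude some matrices. The following sketch, with hypothetical function names, checks only these two conditions and does not cover condition c).

```python
from itertools import combinations


def integer_entries(dist):
    """Condition a): every entry of the distance matrix is an integer."""
    return all(float(d).is_integer() for row in dist for d in row)


def integer_semiperimeters(dist):
    """Condition b): every triangle has an even perimeter."""
    n = len(dist)
    return all(
        (dist[a][b] + dist[b][c] + dist[a][c]) % 2 == 0
        for a, b, c in combinations(range(n), 3)
    )


A1 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
A3 = [[0, 2, 2, 2], [2, 0, 2, 2], [2, 2, 0, 2], [2, 2, 2, 0]]
print(integer_entries(A1), integer_semiperimeters(A1))  # True False
print(integer_entries(A3), integer_semiperimeters(A3))  # True True
```

The matrix (\overline A_1) fails condition b), since its only triangle has perimeter 3, which agrees with its being nonrealizable.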
Corollary. a)
[
K_{\delta',\delta''}=\frac{1}{f-2}\sum_{i=1}^{[f/2]}\left(K_{\delta'\setminus\delta'_i,\,\delta''}+K_{\delta',\,\delta''\setminus\delta''_i}\right)
]
for even (f,\ f\geq 4);
b)
[
K_{\delta',\delta''}=\frac12\left(K_{\delta'\cup\gamma,\delta''}+K_{\delta',\delta''\cup\gamma}\right)
]
for (f) even and strictly less than the cardinality of (A); here (\gamma\in A,\ \gamma\notin\delta',\ \gamma\notin\delta'');
c)
[
K_{\delta'\cup\overline{\delta'},\,\delta''\cup\overline{\delta''}}
+
K_{\delta'\cup\overline{\delta''},\,\delta''\cup\overline{\delta'}}
=
2\left(K_{\delta'',\delta'}+K_{\overline{\delta''},\overline{\delta'}}\right).
]
Equalities a) and b) show that if the (f)-gon inequality holds in a metric space for all odd (f), then it also holds for all even (f). If, however, (f_1) and (f_2) are both odd, then there exist spaces in which the (f_1)-gon inequality holds while the (f_2)-gon inequality does not, and conversely (see (\overline A_6) and (\overline A_7) for (f_1=3) and (f_2=5)). For (f=3) the (f)-gon inequality coincides with the triangle inequality, and for (f=2) it expresses the requirement that distances be nonnegative.
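Equalities a) and b) are algebraic identities in the entries of a symmetric matrix, so they can be spot-checked on random data. In the sketch below all names are hypothetical, the test matrix is random rather than taken from the note, and the second term of a) is read as (K) with one point removed from (\delta''), which is an assumption about the intended formula.

```python
import random
from itertools import combinations


def k_value(dist, d1, d2):
    """K for two disjoint index lists: cross distances minus inner distances."""
    cross = sum(dist[a][b] for a in d1 for b in d2)
    inner1 = sum(dist[a][b] for a, b in combinations(d1, 2))
    inner2 = sum(dist[a][b] for a, b in combinations(d2, 2))
    return cross - inner1 - inner2


def random_symmetric(n, seed=0):
    """Symmetric integer matrix with zero diagonal (not necessarily a metric)."""
    rng = random.Random(seed)
    d = [[0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        d[a][b] = d[b][a] = rng.randint(1, 9)
    return d


def check_a(dist, d1, d2):
    """Equality a) for even f = len(d1) + len(d2), with len(d1) == len(d2)."""
    f = len(d1) + len(d2)
    total = sum(
        k_value(dist, [p for p in d1 if p != d1[i]], d2)
        + k_value(dist, d1, [p for p in d2 if p != d2[i]])
        for i in range(f // 2)
    )
    return (f - 2) * k_value(dist, d1, d2) == total


def check_b(dist, d1, d2, gamma):
    """Equality b): 2K equals K with gamma added to d1 plus K with gamma added to d2."""
    return 2 * k_value(dist, d1, d2) == (
        k_value(dist, d1 + [gamma], d2) + k_value(dist, d1, d2 + [gamma])
    )


D = random_symmetric(7)
print(check_a(D, [0, 1, 2], [3, 4, 5]), check_b(D, [0, 1], [2, 3], 4))  # True True
```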
4. Let (\{x_1,\ldots,x_{2^{l-1}-1}\}) be some solution of system (2). We shall regard as free unknowns those (x_i) for which (m_i\geqslant 3). Denote summation over all (i\in[1,2^{l-1}-1]) such that (m_i\geqslant 3) by the symbol (\overline{\sum}). System (2) can be solved in terms of the free unknowns in the following way:
Lemma. For any solution (\{x_1,\ldots,x_{2^{l-1}-1}\}) of system (2), the following formulas are valid:
[
\text{a) }\quad
x_{2^s+2^j}
=
\tfrac12\bigl(\rho_{1(l-s)}+\rho_{1(l-j)}-\rho_{(l-s)(l-j)}\bigr)
-
\overline{\sum} x_i p_s^i p_j^i,
\quad 0\leq j