MATHEMATICS
Yu. I. ZHURAVLEV
ON OPTIMAL SELECTION ALGORITHMS
(Presented by Academician S. L. Sobolev on 24 III 1958)
Suppose we are given a table filled with tuples of symbols from some alphabet. It may be a dictionary, a table of values of a function, or something else. The following problem arises: find the place in the table where a given tuple is written. Searching for a word in a dictionary, finding the value of a function from the value of its argument, and similar problems reduce to this one. The problem has a trivial solution: inspect the entries of the table one after another, comparing each with the given tuple until a match occurs. Such an algorithm is inefficient, i.e., it takes a long time to execute. This is acceptable for a single selection but highly undesirable for repeated selections. In the latter case a more rational search method can be given, based on a preliminary study of the table. The time spent on studying the table and composing the algorithm is fully repaid over repeated selections. An analogous process occurs when we use the same dictionary frequently: we gradually accumulate information about the arrangement of the words and, as a result, search faster. Algorithms of this kind are very valuable in solving problems on high-speed machines, since they increase the machines' productivity. In the present note the problem of selection from a table is formulated precisely, its solution is given for a certain class of tables, and it is shown that this solution is close to optimal.
We shall consider words of length \(n\), i.e., tuples of \(n\) symbols, each of which is either 1 or 0, and matrices \(T_{n\varphi}\), consisting of \(n\) columns and \(\varphi\) rows.
We shall call a matrix \(T_{n\varphi}\) without repetitions if all its rows are distinct. In what follows we shall consider only matrices without repetitions.
Problem U. Let a matrix \(T_{n\varphi}\) be given. For any word \(s\) that coincides with one of the rows of the matrix \(T_{n\varphi}\), find the number of the row with which this word coincides.
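The trivial solution mentioned in the introduction can be sketched for Problem U as follows. This is only an illustration: the function name `linear_search` and the representation of the matrix as a list of 0/1 tuples are assumptions made here, not part of the note.

```python
from typing import Sequence, Tuple

Word = Tuple[int, ...]  # a word of length n over {0, 1}


def linear_search(table: Sequence[Word], s: Word) -> int:
    """Return the 1-based number of the row of `table` that coincides with `s`.

    Rows are inspected one after another, so up to phi comparisons are made.
    """
    for number, row in enumerate(table, start=1):
        if row == s:
            return number
    raise ValueError("s does not coincide with any row of the table")


# A matrix without repetitions with n = 3 columns and phi = 3 rows.
table = [(0, 0, 1), (1, 0, 1), (1, 1, 0)]
print(linear_search(table, (1, 0, 1)))  # prints 2
```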
We denote the \(i\)-th row of the matrix \(T_{n\varphi}\) by \(s_i\).
Definition. A submatrix \(T_{n\varphi'}\) of the matrix \(T_{n\varphi}\) is a matrix obtained from \(T_{n\varphi}\) by crossing out certain rows. Each row of the submatrix keeps the number that it had in \(T_{n\varphi}\).
Definition. A submatrix \(T_{n\varphi''}\) is contained in the submatrix \(T_{n\varphi'}\) (notation \(T_{n\varphi''} \subseteq T_{n\varphi'}\)) if each row in \(T_{n\varphi''}\) is simultaneously a row in \(T_{n\varphi'}\) and its numbers in \(T_{n\varphi''}\) and \(T_{n\varphi'}\) coincide.
Let \(T_{n\varphi'}\) be an arbitrary submatrix of the matrix \(T_{n\varphi}\), and let \(s\) be a word such that \(s \subseteq T_{n\varphi'}\). In solving problem U we shall use operators \(A_i(s,T_{n\varphi'})\) \((i=0,1,2,\ldots,m)\), transforming the pair \(s,T_{n\varphi'}\) (here \(s \subseteq T_{n\varphi'} \subseteq T_{n\varphi}\)) into a submatrix \(T_{n\varphi''}\) such that \(s \subseteq T_{n\varphi''} \subseteq T_{n\varphi'}\). Finite sequences
\[
C_s = A_{j_1}, A_{j_2}, \ldots, A_{j_k},
\]
where \(0 \leq j_1, j_2, \ldots, j_k \leq m\), we shall call operator chains. The chain \(C_s\) realizes the word \(s\) in the matrix \(T_{n\varphi}\) if \(A_{j_1}(s,T_{n\varphi})=T_{n\varphi_1}\), \(A_{j_2}(s,T_{n\varphi_1})=T_{n\varphi_2}\), \(\ldots\), \(A_{j_k}(s,T_{n\varphi_{k-1}})=T_{n\varphi_k}=s\).
Definition. We shall say that an algorithm for solving problem \(U\) is specified on the matrix \(T_{n\varphi}\) if to each word \(s\subseteq T_{n\varphi}\) there is assigned one and only one operator chain \(\overline{C}_s\) realizing the word \(s\).
To each execution of an operator \(A_{j_k}\) on an arbitrary submatrix \(T_{n\varphi'}\) we assign a positive number \(p\), which we shall call the weight. We shall assume that the weight depends both on the operator and on the submatrix \(T_{n\varphi'}\), i.e. \(p=p(A_{j_k},T_{n\varphi'})\) \((j_k=0,1,2,\ldots,m)\). Suppose that on the matrix \(T_{n\varphi}=T^0_{n\varphi}\) an algorithm \(K\) for solving problem \(U\) is given.
Select an arbitrary row \(s\subseteq T_{n\varphi}\) and the corresponding operator chain \(C_s=A_{j_1},A_{j_2},\ldots,A_{j_k}\). Let
\[
A_{j_1}(s,T^0_{n\varphi})=T^1_{n\varphi},\quad
A_{j_2}(s,T^1_{n\varphi})=T^2_{n\varphi},\quad \ldots,\quad
A_{j_k}(s,T^{k-1}_{n\varphi})=T^k_{n\varphi}=s.
\]
Let
\[
\sum_{m=1}^{k} p(A_{j_m},T^{m-1}_{n\varphi})=P_s(K).
\]
Definition. The weight of the algorithm \(K\) on the matrix \(T_{n\varphi}\) is
\[
P_{T_{n\varphi}}(K)=\max_{s\subseteq T_{n\varphi}} P_s(K).
\]
Describing the algorithm \(K\) requires a certain amount of information. The number of binary symbols needed to write this information down is characterized by the concept of instruction volume. To each pair \((A_j,T_{n\varphi'})\) we assign a positive number \(\beta(A_j,T_{n\varphi'})\). For each operator \(A_j\), collect from all operator chains of the algorithm \(K\) the submatrices \(T_{n\varphi_1},T_{n\varphi_2},\ldots,T_{n\varphi_r}\) to which it is applied, writing identical submatrices down only once. Let
\[
\beta(A_j)=\sum_{i=1}^{r}\beta(A_j,T_{n\varphi_i}).
\]
The instruction volume of the algorithm \(K\) on the matrix \(T_{n\varphi}\) is
\[
\beta(K)=\sum_{j=0}^{m}\beta(A_j).
\]
We define the system of operators \(A_0,A_1,\ldots,A_n\). The operator \(A_j\) \((j=1,2,\ldots,n)\) transforms a submatrix \(T_{n\varphi'}\) and a word \(s\subseteq T_{n\varphi'}\) into a submatrix \(T_{n\varphi''}\), \(s\subseteq T_{n\varphi''}\subseteq T_{n\varphi'}\), in the following way. The \(j\)-th letter \(\alpha_j\) of the word \(s\) is selected. It is compared with all elements \(a_{kj}\) of the matrix \(T_{n\varphi'}\). Those rows for which \(a_{kj}=\alpha_j\), and only those, form the matrix \(T_{n\varphi''}\).
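The action of an operator \(A_j\), \(j\geq 1\), can be sketched in a few lines. The representation below is an assumption made for illustration: a submatrix is stored as a dictionary from the original row number to the row, so that rows keep their numbers as the definition of a submatrix requires. The name `apply_A` is hypothetical.

```python
from typing import Dict, Tuple

Word = Tuple[int, ...]
Submatrix = Dict[int, Word]  # row number in T -> row (numbers are preserved)


def apply_A(j: int, s: Word, sub: Submatrix) -> Submatrix:
    """A_j for j = 1..n: keep exactly the rows whose j-th letter equals
    the j-th letter alpha_j of the word s."""
    alpha_j = s[j - 1]  # the note counts letters from 1
    return {k: row for k, row in sub.items() if row[j - 1] == alpha_j}


T = {1: (0, 0, 1), 2: (1, 0, 1), 3: (1, 1, 0)}
s = (1, 0, 1)
T1 = apply_A(1, s, T)   # rows whose 1st letter is 1: rows 2 and 3
T2 = apply_A(2, s, T1)  # of those, rows whose 2nd letter is 0: row 2
print(sorted(T1), sorted(T2))  # prints [2, 3] [2]
```

Here the chain \(A_1, A_2\) realizes the word \(s\): the final submatrix consists of the single row number 2.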
The numbers \(p(A_j,T_{n\varphi'})\) and \(\beta(A_j,T_{n\varphi'})\) are defined as follows:
\(p(A_j,T_{n\varphi'})=p_1(\varphi')\), \(j=1,2,\ldots,n\), where \(p_1(x)\) is an arbitrary function satisfying the conditions: 1) \(p_1(x)\) is defined for all real nonnegative values of \(x\); 2) \(p_1(x)\geq0\) for \(x\geq0\); 3) \(p_1(x)\) does not decrease for \(x\geq0\). The number \(\beta(A_j,T_{n\varphi'})=\log_2 n\), \(j=1,2,\ldots,n\), \(T_{n\varphi'}\subseteq T_{n\varphi}\).
The operator \(A_0\) is defined as follows: \(A_0(s,T_{n\varphi'})=s\) for any submatrix \(T_{n\varphi'}\) and any word \(s\subseteq T_{n\varphi'}\).
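For chains built only from the operators \(A_1,\ldots,A_n\), the weight \(P_s(K)\) and the weight of an algorithm can be computed directly from the definitions. The sketch below makes several assumptions that are not in the note: the choice \(p_1(x)=x\), the dictionary representation of submatrices, and the hypothetical rule `scan_chain`, which applies \(A_1, A_2, \ldots\) in order until one row remains.

```python
from typing import Callable, Dict, List, Tuple

Word = Tuple[int, ...]
Submatrix = Dict[int, Word]  # row number in T -> row


def apply_A(j: int, s: Word, sub: Submatrix) -> Submatrix:
    """A_j, j >= 1: keep the rows that agree with s in the j-th letter."""
    return {k: row for k, row in sub.items() if row[j - 1] == s[j - 1]}


def scan_chain(s: Word, T: Submatrix) -> List[int]:
    """Hypothetical rule: apply A_1, A_2, ... until a single row remains.
    Terminates for a row s of a matrix without repetitions."""
    chain, sub, j = [], dict(T), 1
    while len(sub) > 1:
        chain.append(j)
        sub = apply_A(j, s, sub)
        j += 1
    return chain


def chain_weight(chain: List[int], s: Word, T: Submatrix,
                 p1: Callable[[float], float]) -> float:
    """P_s for a chain of A_1..A_n, with p(A_j, T') = p1(phi')."""
    sub, total = dict(T), 0.0
    for j in chain:
        total += p1(len(sub))  # phi' = number of rows of the current submatrix
        sub = apply_A(j, s, sub)
    if list(sub.values()) != [s]:
        raise ValueError("the chain does not realize s")
    return total


T = {1: (0, 0, 1), 2: (1, 0, 1), 3: (1, 1, 0)}
p1 = lambda x: x  # assumed weight function
weights = {k: chain_weight(scan_chain(s, T), s, T, p1) for k, s in T.items()}
print(weights)                # prints {1: 3.0, 2: 5.0, 3: 5.0}
print(max(weights.values()))  # weight of the algorithm: prints 5.0
```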
Definition. The distance \(\rho(s_i,s_j)\) between the words \(s_i=\alpha_1,\alpha_2,\ldots,\alpha_n\) and \(s_j=\beta_1,\beta_2,\ldots,\beta_n\), \(\alpha_k\in\{0,1\}\), \(\beta_k\in\{0,1\}\), is the number
\[ \rho(s_i,s_j)=\sum_{k=1}^{n}|\alpha_k-\beta_k|. \]
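For words over \(\{0,1\}\) this distance simply counts the positions in which the two words differ. A minimal sketch, with the function name `rho` chosen here to match the notation:

```python
from typing import Sequence


def rho(u: Sequence[int], v: Sequence[int]) -> int:
    """Distance between two 0/1 words of equal length:
    the sum of |alpha - beta| over all positions."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


print(rho((1, 0, 1, 1), (0, 0, 1, 0)))  # prints 2
```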
Let \(s\) be an arbitrary word of length \(n\), and let \(s_1,s_2,\ldots,s_{\varphi'}\) be the words coinciding with the rows of the matrix \(T_{n\varphi'}\). Put
\[
\sum_{i=1}^{\varphi'}\rho(s,s_i)=\lambda_s.
\]
We shall call the number
\[
\theta=\min_s \lambda_s
\]
(the minimum is taken over all words of length \(n\)) the type of the submatrix \(T_{n\varphi'}\). We introduce the weight and the instruction volume as follows:
\(p(A_0,T_{n\varphi'})=p_2(\theta)\), where \(\theta\) is the type of the submatrix \(T_{n\varphi'}\) and \(p_2(x)\) is an arbitrary function satisfying the conditions: 1) \(p_2(x)\) is defined for all real nonnegative values of \(x\); 2) \(p_2(x)\geqslant 0\) for \(x\geqslant 0\); 3) \(p_2(x)\) does not decrease for \(x\geqslant 0\). The number \(\beta(A_0,T_{n\varphi'})=\theta\log_2 n+n\), where \(\theta\) is the type of the submatrix \(T_{n\varphi'}\). The type \(\theta\) characterizes, in a certain sense, the deviation of the rows of the submatrix from some word \(s\). To describe the algorithm it is enough to specify this word \(s\), which takes \(n\) symbols, and the places where the words of the submatrix \(T_{n\varphi'}\) differ from the word \(s\); for this, generally speaking, \(\theta\log_2 n\) symbols are necessary.
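The type \(\theta\) and the instruction volume of \(A_0\) can be computed on a small example. The sketch below evaluates \(\theta\) in two ways: by brute force over all \(2^n\) words, exactly as in the definition, and column by column. The column-wise shortcut is not stated in the note; it is used here because \(\lambda_s\) is a sum over columns, so each letter of \(s\) can be chosen independently as the majority letter of its column. All function names are hypothetical.

```python
from itertools import product
from math import log2
from typing import Sequence, Tuple

Word = Tuple[int, ...]


def rho(u: Word, v: Word) -> int:
    """Number of positions in which two 0/1 words differ."""
    return sum(abs(a - b) for a, b in zip(u, v))


def type_bruteforce(rows: Sequence[Word]) -> int:
    """theta = min over all words s of length n of sum_i rho(s, s_i)."""
    n = len(rows[0])
    return min(sum(rho(s, r) for r in rows)
               for s in product((0, 1), repeat=n))


def type_columnwise(rows: Sequence[Word]) -> int:
    """Same value: in each column, count the letters in the minority."""
    return sum(min(sum(col), len(col) - sum(col)) for col in zip(*rows))


def beta_A0(rows: Sequence[Word]) -> float:
    """Instruction volume of A_0 on the submatrix: theta * log2(n) + n."""
    n = len(rows[0])
    return type_columnwise(rows) * log2(n) + n


rows = [(0, 0, 1, 1), (1, 0, 1, 1), (1, 1, 0, 1)]
print(type_bruteforce(rows), type_columnwise(rows))  # prints 3 3
print(beta_A0(rows))                                 # prints 10.0
```

In this example the minimizing word is \(s=1011\), at distances 1, 0, 2 from the three rows.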
Let there be functions \(\varphi(x)\), \(\theta(x)\), \(\beta(x)\), \(P(x)\). In what follows we shall assume that \(\varphi=\varphi(n)\), \(\theta=\theta(n)\), \(\beta(K)=\beta(n)\), \(P_{T_{n\varphi}}(K)=P(n)\) satisfy the following conditions: 1) \(\varphi(x)>0\), \(\theta(x)>0\), \(\beta(x)>0\), \(P(x)>0\) for \(x>0\); 2) \(\varphi(x)\) increases monotonically for \(x\geqslant 0\); 3) \(\lim_{x\to\infty}\varphi(x)=\infty\).
Definition. We shall call an algorithm \(K\) over a matrix regular if its operator chains are composed of the operators \(A_0,A_1,\ldots,A_n\) and the instruction volume \(\beta(K)\) satisfies the condition:
\[
\lim_{n\to\infty}\frac{\beta(K)}{n\varphi(n)}=0.
\]
Theorem 1. For every infinite matrix \(T_{n\varphi}\) there exists a regular algorithm.
Let \(\{K_R(T_{n\varphi})\}\) be the set of regular algorithms over the matrix \(T_{n\varphi}\), and \(\{T_{n\varphi(n)}\}\) the set of infinite matrices \(T_{n\varphi(n)}\). Introduce the function
\[
\mathscr{T}(n)=
\max_{T_{n\varphi}\in\{T_{n\varphi}\}}
\left\{\min_{K_R\in\{K_R(T_{n\varphi})\}} P_{T_{n\varphi}}(K_R)\right\}.
\]
We shall study the behavior of the function \(\mathscr{T}(n)\) as \(n\to\infty\).
Consider arbitrary functions \(\chi(x)\) and \(\psi(x)\), defined for all real \(x\geqslant 0\) and satisfying the following conditions:
1) \(\lim_{x\to\infty}\chi(x)=\infty\);
2) \(\lim_{x\to\infty}\dfrac{\chi(x)\log_2 x}{\psi(x)}=0\);
3) \(\lim_{x\to\infty}\dfrac{\psi(x)}{\varphi(x)}=0\).
Theorem 2. For arbitrary functions \(\chi(x)\) and \(\psi(x)\) satisfying conditions 1)–3), the following inequality holds:
\[
\mathscr{T}(n)<
\frac{1}{\chi(n)}
\int_{0}^{\varphi(n)+\chi(n)} p_1(x)\,dx
+
\max\left\{
\int_{0}^{\psi(n)+1} p_1(x)\,dx,\,
p_2(n\chi(n))
\right\}.
\]
Suppose also that the following conditions are fulfilled:
4) for every function \(\chi(x)\) such that \(\lim_{x\to\infty}\dfrac{\chi(x)}{\varphi(x)}=0\), the relation
\[
\lim_{x\to\infty}
\frac{\displaystyle\int_{\varphi(x)}^{\varphi(x)+\chi(x)} p_1(t)\,dt}
{\displaystyle\int_{0}^{\varphi(x)} p_1(t)\,dt}
=0
\]
holds;
5) \(p_2(\theta)=p_3(n)\dfrac{\theta}{n}\), where \(p_3(x)\) is defined and positive for all real \(x\geqslant 0\).
Consider the function
\[
\chi_1(x)=
\left(
\frac{\displaystyle\int_{0}^{\varphi(x)} p_1(t)\,dt}{p_3(x)}
\right)^{1/2}.
\]
We shall assume that there exists a function \(\psi(x)\) such that \(\chi_1(x)\) and \(\psi(x)\) satisfy conditions 1)–3) and
\[
6)\quad
\int_{0}^{\psi(n)+1} p_1(x)\,dx
\leqslant
\left[
\int_{0}^{\varphi(n)} p_1(x)\,dx\cdot p_3(n)
\right]^{1/2}.
\]
Theorem 3. If conditions 4)—6) are fulfilled, then
\[
\mathscr{T}(n)<
2\left[
\int_{0}^{\varphi(n)} p_1(x)\,dx\cdot p_3(n)
\right]^{1/2}
(1+\varepsilon),
\qquad
\varepsilon\to 0
\quad\text{as } n\to\infty.
\]
Corollary. If \(p_1(\varphi')=\varphi'\), \(p_2(\theta)=\theta\), and \(n^{1/2-\eta}\leqslant \psi \leqslant n^{1/2+\eta}\) \((0<\eta<1/2)\), then
\[ \mathscr{T}(n)<\sqrt{2n}\,\varphi(n)\,(1+\varepsilon'),\qquad \varepsilon'\to 0 \quad \text{as } n\to\infty . \]
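The leading term can be checked by substituting the hypotheses of the Corollary into the bound of Theorem 3 (a sketch that takes conditions 4)–6) as satisfied). With \(p_1(x)=x\) we have \(\int_{0}^{\varphi(n)}x\,dx=\varphi(n)^2/2\), and comparing \(p_2(\theta)=\theta\) with condition 5) gives \(p_3(n)=n\), so that the right-hand side of Theorem 3 has the leading term
\[
2\left[\frac{\varphi(n)^2}{2}\cdot n\right]^{1/2}=\sqrt{2n}\,\varphi(n).
\]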
If to conditions 4)—6) one adds the condition:
7) \(\varphi(x)\leqslant Cx\),
then the following holds:
Theorem 4.
\[ \mathscr{T}(n)> \left[ \int_{0}^{\varphi(n)} p_1(x)\,dx\cdot p_3(n) \right]^{1/2} (1-\varepsilon''),\qquad \varepsilon''\to 0 \quad \text{as } n\to\infty . \]
Introduce the function
\[
\mathscr{T}_{T_{n\varphi}}(n)=
\min_{K_R\in \{K_R(T_{n\varphi})\}}
P_{T_{n\varphi}}(K_R).
\]
Let \(f_1(n,\varphi(n))\) be the number of all matrices \(T_{n\varphi(n)}\) without repetitions, and \(f_2(n,\varphi(n))\) the number of all matrices \(T_{n\varphi}\) without repetitions for which
\[
\mathscr{T}_{T_{n\varphi}}(n)>2p_1(\varphi)\log_2\varphi\,(1+\varepsilon_1),
\]
where \(\varepsilon_1\to 0\) as \(n\to\infty\) and \(\varepsilon_1\cdot 2p_1(\varphi)\log_2\varphi\to\infty\) as \(n\to\infty\).
Theorem 5. If
\[
\lim_{n\to\infty}\frac{\varphi(n)}{n/2}=0,
\]
then
\[
\lim_{n\to\infty}\frac{f_2(n,\varphi(n))}{f_1(n,\varphi(n))}=0.
\]
Moscow State University
named after M. V. Lomonosov
Received
17 III 1958