UDC 519.217
MATHEMATICS
A. L. TOOM
ON A FAMILY OF HOMOGENEOUS NETWORKS OF FORMAL NEURONS
(Presented by Academician A. N. Kolmogorov on 11 III 1968)
I. Introduction. This article studies Markov chains describing the behavior of certain networks of formal neurons possessing the property of spontaneous activity. The problem of studying these networks was proposed in (¹). We first give the necessary definitions. We describe a family of Markov chains, denoted by \(M_n\), where \(n\) is a natural number or \(n=\infty\). The state of such a Markov chain is given by a set of values of variables \(a_i\), each of which may be equal to 0 or 1. The index \(i\) ranges over the values:
\[ i=1,2,\ldots,n \quad \text{for } n \text{ natural,} \]
\[ i=0,\pm1,\pm2,\ldots, \quad \text{if } n=\infty. \]
Each variable \(a_i\) depends on time \(t=0,1,\ldots\); the state of \(a_i\) at time \(t\) will be denoted by \(a_i^t\). The transition from \(t\) to \(t+1\) is defined in the same way for all chains. For \(t>0\) variables \(b_i^t\) are introduced, interpreted in (¹) as spontaneous self-excitation. Each of them is equal to one with probability \(\theta\), independently of the other \(b_i^t\). We set, by definition, for \(t>0\):
\[ a_i^t = \begin{cases} 1, & \text{if at least one of the following two events has occurred:}\\ & a_{i-1}^{t-1}=a_i^{t-1}=1 \text{ or } b_i^t=1,\\ & \text{where, for natural } n,\ a_0 \equiv a_n;\\ 0, & \text{otherwise.} \end{cases} \tag{1} \]
We shall consider the following initial conditions:
\[ \text{at } t=0 \text{ all } a_i^0=0. \tag{2} \]
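The transition rule (1) with the initial condition (2) can be sketched for finite \(n\) as follows. This is only an illustrative simulation, not part of the original paper: the function names `step` and `simulate` and the `seed` parameter are assumptions introduced here, and the list index runs over 0, ..., n-1 instead of 1, ..., n.

```python
import random


def step(a, b):
    """One transition of M_n by rule (1).

    a[i] becomes 1 if a[i-1] = a[i] = 1 or b[i] = 1, and 0 otherwise.
    Python's a[-1] is the last element, which matches the convention a_0 = a_n.
    """
    n = len(a)
    return [1 if (a[i - 1] == 1 and a[i] == 1) or b[i] == 1 else 0
            for i in range(n)]


def simulate(n, theta, steps, seed=0):
    """Run M_n for `steps` transitions from the all-zero state (2)."""
    rng = random.Random(seed)  # hypothetical seeding, for reproducibility only
    a = [0] * n
    for _ in range(steps):
        # Each b_i^t equals 1 with probability theta, independently.
        b = [1 if rng.random() < theta else 0 for _ in range(n)]
        a = step(a, b)
    return a
```

With `theta = 0` the state never leaves all zeros, and with `theta = 1` it reaches all ones after a single step.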
Each Markov chain \(M_n\) (\(n\) finite) is, obviously, finite and has \(2^n\) states, moreover the state in which \(a_1=a_2=\cdots=a_n=1\) is absorbing. The set of states of the chain \(M_\infty\) has the cardinality of the continuum; by the transition rules (1) and the initial conditions a measure is defined on this set for each \(t>0\), i.e., for any finite set \(m\) of values of the index \(i\) and any set of constants \(c_i\) (\(i\in m\)), equal to zero or one, the probability is determined of the event consisting in the fact that \(a_i^t=c_i\) for all \(i\in m\).
Let us denote the probability that in the chain \(M_n\)
\[ a_i^t=a_{i+1}^t=\cdots=a_{i+r-1}^t=1 \]
by \(P_{r,n}^t(\theta)\). It is easy to show that \(P_{r,n}^t(\theta)\) increases monotonically with \(\theta\). For the initial conditions under consideration it was shown in (¹) that there exists a limit, denoted by
\[ \lim_{t\to\infty} P_{r,\infty}^t(\theta)=P_{r,\infty}^{\infty}(\theta). \]
In paper (¹) lower estimates were obtained for the probabilities \(P_{r,n}^t(\theta)\), from which it follows that for sufficiently large \(\theta\) (for example, \(\theta \geq 0.37\)):
\[ \text{a) if } n=\infty \text{ and for any } r,\quad P_{r,\infty}^{\infty}=1; \tag{3} \]
b) for natural \(n\), the mathematical expectation \(T_n(\theta)\) of the moment of time \(t\) when, for the first time,
\(a_1^t=a_2^t=\cdots=a_n^t=1\), does not exceed
\[ \operatorname{const}\cdot \log n. \tag{4} \]
On the basis of computer simulation of these networks, the authors of \((^1)\) conjecture that assertions (3), (4) hold only for values \(\theta>\theta_0>0\) \((\theta_0\approx 0.3)\), while for \(\theta<\theta_0\) the following holds:
a) for \(n=\infty\),
\[ P_{r,\infty}^{\infty}(\theta)<1; \tag{5} \]
b) for natural \(n\),
\[ T_n(\theta)\ge C^n\bigl(C=C(\theta)>1\bigr). \tag{6} \]
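The quantity \(T_n(\theta)\) in (4) and (6) can be explored numerically with a rough Monte Carlo sketch like the one below. It is an illustration under assumptions, not the simulation used in (¹): the names `hitting_time` and `mean_hitting_time`, the cutoff `max_steps`, and the seeding scheme are all introduced here. Runs that do not finish within the cutoff are dropped, so the reported mean underestimates \(T_n(\theta)\) whenever some runs are cut off.

```python
import random


def hitting_time(n, theta, max_steps, seed=0):
    """First t with a_1 = ... = a_n = 1, or None if not reached in max_steps."""
    rng = random.Random(seed)
    a = [0] * n  # initial condition (2)
    for t in range(1, max_steps + 1):
        b = [rng.random() < theta for _ in range(n)]
        # Rule (1); a[i - 1] wraps around for i = 0, i.e. a_0 = a_n.
        a = [int((a[i - 1] and a[i]) or b[i]) for i in range(n)]
        if all(a):
            return t
    return None


def mean_hitting_time(n, theta, runs, max_steps):
    """Mean over the runs that finished, and the number of such runs."""
    times = [hitting_time(n, theta, max_steps, seed=s) for s in range(runs)]
    finished = [t for t in times if t is not None]
    mean = sum(finished) / len(finished) if finished else None
    return mean, len(finished)
```

Comparing the output for several \(n\) at a fixed \(\theta\) gives only a qualitative picture of the contrast between (4) and (6).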
This question for \(n=\infty\) was investigated in \((^2)\), where assertion (5) was proved for sufficiently small values \(\theta>0\) (an estimate for \(\theta\) is not given in that paper). In the present article upper estimates will be obtained for the probabilities \(P_{r,n}^t(\theta)\), from which assertions (5), (6) follow for sufficiently small \(\theta\) (for example, \(\theta\le 0.07\)).
Fig. 1
More precisely, assertion (5) is proved for \(\theta<1/C\), where the constant \(C\) has a simple geometric meaning:
\[ C=\lim_{k\to\infty}\sqrt[k]{N^k}; \tag{7} \]
\(N^k\) is the number of ways in which, in Fig. 1, one can go from \(T_1T_1\) to \(T_2T_2\), moving only in the direction of the arrows, not passing through any point more than once and making exactly \(k\) steps to the right, with the steps to the left upward and to the left downward not allowed to follow one another consecutively. The existence of the limit (7) follows from the inequality \(N^{k+l}\le N^kN^l\).
The decisive role in these proofs is played by the interpretation of the probabilities \(P_{r,n}^t(\theta)\) as probabilities that certain contact circuits of unreliable elements conduct, similar to those considered in article \((^3)\). In this interpretation, the fact that the function \(P_{r,\infty}^{\infty}(\theta)\) is less than 1 for \(\theta<\theta_0\) (for small \(\theta\) it behaves as \(\theta^r\)) and is equal to one for \(\theta>\theta_0\) (see the graph of \(P_{1,\infty}^{\infty}\) in \((^1)\)) appears as a consequence of these circuits possessing properties similar to the reliability property described in \((^3)\): their width tends to \(\infty\) as \(t\to\infty\), while their length remains equal to \(r\).
II. As in \((^3)\), by a contact circuit is meant a graph whose edges can conduct current. We shall, however, regard the edges as one-sided, i.e., able to conduct in one direction and not conduct in the other.
A circuit conducts from pole \(M\) to pole \(N\) if there exists a conducting path from \(M\) to \(N\), i.e., a sequence of edges \(MA_1, A_1A_2,\ldots,A_nN\), with \(MA_1\) conducting from \(M\) to \(A_1\), etc.
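For a fixed realization of the edge states, the definition of conduction above is ordinary reachability in a directed graph. A minimal sketch, assuming the circuit is given as a list of ordered pairs `(u, v)` for every edge that currently conducts from `u` to `v` (this representation and the name `conducts` are choices made here, not taken from the paper):

```python
from collections import deque


def conducts(edges, source, target):
    """Return True if a conducting path leads from source to target.

    edges: iterable of (u, v) pairs, each meaning "conducts from u to v".
    An edge conducting both ways is listed as (u, v) and (v, u).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)

    # Breadth-first search over vertices reachable from the source.
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
```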
A circuit is called planar if its graph is planar. For a planar circuit \(S\) with one-sided edges there is naturally defined the dual circuit \(\overline S\). Its graph is dual to the graph of the circuit \(S\); moreover, the edge \(KL\) of the circuit \(\overline S\), corresponding to the edge \(AB\) of the circuit \(S\), where \(K\) is the left and \(L\) the right region adjacent to the edge \(AB\) of the planar graph of the circuit \(S\), conducts from \(K\) to \(L\) if \(AB\) does not conduct from \(A\) to \(B\), and conducts from \(L\) to \(K\) if \(AB\) does not conduct from \(B\) to \(A\).
The circuit \(S\) does not conduct from \(M\) to \(N\) if and only if in the circuit \(\overline S\) there is a closed conducting path \(K_1K_2\ldots K_n\) (i.e., \(K_1K_2\) conducts from \(K_1\) to \(K_2\), etc.) separating \(M\) from \(N\) and encircling \(M\) clockwise.
III. Estimate of \(P_{1,\infty}^{t}(\theta)\). We now return to our Markov chains. In Fig. 2 the points \(A_j^u\) are denoted by circles. At each point \(A_j^u\) let us place the variable \(a_j^0\) for \(u=0\) and \(b_j^u\) for \(u>0\). Then \(a_i^t=0\) if and only if from the point \(A_i^t\) one can reach the row of points \(A_j^0\), going downward only through points \(A_j^u\) occupied by zeros. This visual representation was noticed immediately after the writing of the paper (¹). For example, L. G. Mityushin used it to show that the probability \(P_{1,\infty}^t(\theta)\) is exactly equal to the probability that in the chain \(M_\infty\) at time \(t\) all \(a_i^t\) are equal to 1, if at time \(t=0\) one of them was equal to zero and the rest to ones.
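The equivalence between rule (1) and the existence of a downward path of zeros can be checked on small random instances. The sketch below is illustrative only. It assumes that "going downward" from \(A_j^u\) means moving to \(A_{j-1}^{u-1}\) or \(A_j^{u-1}\), which is how rule (1) reads. The dictionary layout `val[(u, j)]` and all function names are introduced here.

```python
import random


def a_by_rule(val, u, j):
    """a_j^u from rule (1); val[(0, j)] = a_j^0, val[(u, j)] = b_j^u for u > 0."""
    if u == 0:
        return val[(0, j)]
    both = a_by_rule(val, u - 1, j - 1) == 1 and a_by_rule(val, u - 1, j) == 1
    return 1 if both or val[(u, j)] == 1 else 0


def zero_path_exists(val, u, j):
    """True if row 0 is reachable from A_j^u going down through zeros only."""
    if val[(u, j)] != 0:
        return False
    if u == 0:
        return True
    return zero_path_exists(val, u - 1, j - 1) or zero_path_exists(val, u - 1, j)


def random_values(i, t, theta, p0, seed):
    """Random values on the points A_j^u with i - t + u <= j <= i.

    Row 0 holds 1 with probability p0 (p0 = 0 reproduces condition (2));
    rows u > 0 hold 1 with probability theta.
    """
    rng = random.Random(seed)
    val = {}
    for u in range(t + 1):
        p = p0 if u == 0 else theta
        for j in range(i - t + u, i + 1):
            val[(u, j)] = 1 if rng.random() < p else 0
    return val
```

The recursion visits up to \(2^t\) branches, so it is meant for small \(t\) only.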
Now let us construct the planar circuit \(S_{1,\infty}^t\), which conducts from \(M\) to \(N\) if and only if \(a_i^t=0\). The circuit \(S_{1,\infty}^t\) for \(t=3\) is shown in Fig. 2. Its vertices are shown by black dots. It is constructed as follows. Through each point \(A_j^u\), \(i-t+u\leq j\leq i\), a vertical edge is drawn, denoted below also by \(A_j^u\), which conducts upward if and only if the corresponding variable on which \(a_i^t\) depends, i.e., \(a_j^0\) for \(u=0\) and \(b_j^u\) for \(u>0\), is equal to zero. The lower end of such an edge \(A_j^u\) for \(u>0\) is connected by inclined edges with the upper ends of the edges \(A_{j-1}^{u-1}\) and \(A_j^{u-1}\). These inclined edges always conduct upward, i.e., in the direction of the arrows, and never in the reverse direction. The upper end of the edge \(A_i^t\) is taken as the pole \(N\). The pole \(M\) is connected with the lower ends of the edges \(A_j^0\) by edges that always conduct upward (i.e., from \(M\) to these ends). The conductance of these edges downward is immaterial, as is that of the edges \(A_j^u\). It will be convenient for us to assume that these edges always conduct downward.
Fig. 2
Now let us construct the circuit \(\bar S_{1,\infty}^t\), dual to \(S_{1,\infty}^t\). It is shown in Fig. 2. The edges forming the lines \(T_1T_1\) and \(T_2T_2\) always conduct in both directions and are connected with one another by an edge that always conducts and is not drawn. The remaining edges of the circuit \(\bar S_{1,\infty}^t\) are shown by dashed lines. Of these, the inclined edges (drawn with arrows) always conduct in the direction of the arrows and never in the reverse direction. The horizontal edges \(B_j^u\), corresponding to \(A_j^u\), conduct to the right with probability \(\theta\) for \(u>0\) and do not conduct for \(u=0\), while to the left they never conduct. We shall assume that the undrawn edge connecting \(T_1\) and \(T_2\) is absent, and denote \(\bar S_{1,\infty}^t\) without this edge by \(S_{1,\infty}^{\prime t}\).
The probability \(h_{T_1T_2}(\theta)\) that \(S_{1,\infty}^{\prime t}\) conducts from \(T_1\) to \(T_2\) is equal to \(P_{1,\infty}^t(\theta)\). Let us estimate it from above. This probability is less than the sum of the probabilities of all events of the form: “on the given path from \(T_1\) to \(T_2\), all edges conduct.” The probability of such an event is equal to \(\theta^k\), where \(k\) is the number of horizontal edges in this path. It is enough to consider only paths that do not pass through any vertex more than once, and in which inclined edges of different directions do not occur consecutively one after another. The number of such paths containing \(k\) horizontal edges coincides exactly with \(N^k\) in formula (7). Thus:
\[ h_{T_1T_2}(\theta)<\sum_{k=1}^{\infty}N^k\theta^k . \tag{8} \]
It is easy to estimate \(N^k\) from above by the number \(3^{3k}\), if each path is encoded by a sequence of the digits 1, 2, 3, where the digit 1 denotes an edge going to the left downward, 2 an edge going to the right, and 3 an edge going to the left upward. Then
\[ h_{T_1T_2}(\theta)<\sum_{k=1}^{\infty}N^k\theta^k \leq \sum_{k=1}^{\infty}3^{3k}\theta^k =\sum_{k=1}^{\infty}(27\theta)^k =\frac{27\theta}{1-27\theta}. \]
For \(\theta<1/54\) this sum is less than one. Thus, we have proved assertion (5) for \(\theta<1/54\).
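The threshold \(1/54\) is simply the point where the closed form \(27\theta/(1-27\theta)\) reaches one. A small exact-arithmetic check, with the helper name `crude_bound` introduced here for illustration:

```python
from fractions import Fraction


def crude_bound(theta):
    """Closed form of sum_{k>=1} (27*theta)**k, valid only for 0 <= 27*theta < 1."""
    x = 27 * Fraction(theta)
    if not 0 <= x < 1:
        raise ValueError("the geometric series needs 0 <= 27*theta < 1")
    return x / (1 - x)


# At theta = 1/54 the bound equals exactly 1; below it the bound is below 1.
at_threshold = crude_bound(Fraction(1, 54))
below_threshold = crude_bound(Fraction(1, 55))
```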
The estimate of \(N^k\) can be somewhat improved if one observes that, in a sequence coding a path, the digits 1 and 3 cannot stand next to one another. Then we obtain \(N^k \leq \operatorname{const}\cdot(14.1)^k\).
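One way to see where a constant close to 14.1 can come from is sketched below. It assumes, as in the \(3^{3k}\) bound, that a path with \(k\) steps to the right is coded by a word of length at most \(3k\); this is a reading of the text, not a statement from the paper. Words over the digits 1, 2, 3 in which 1 and 3 are never adjacent can be counted by a three-term recurrence; their number grows like \((1+\sqrt2)^m\), and \((1+\sqrt2)^3 = 7+5\sqrt2 \approx 14.07\). The function name `count_codes` is introduced here.

```python
def count_codes(m):
    """Number of length-m words over {1, 2, 3} with no 1 next to a 3."""
    if m == 0:
        return 1
    # end1, end2, end3: counts of admissible words ending in 1, 2, 3.
    end1, end2, end3 = 1, 1, 1
    for _ in range(m - 1):
        end1, end2, end3 = (
            end1 + end2,         # 1 may follow 1 or 2
            end1 + end2 + end3,  # 2 may follow anything
            end2 + end3,         # 3 may follow 2 or 3
        )
    return end1 + end2 + end3


growth = 1 + 2 ** 0.5      # per-digit growth rate of count_codes
per_right_step = growth ** 3  # growth per three digits
```

The further prohibitions mentioned in the text (such as 321 and 123) would lower this rate; they are not modeled here.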
Similarly, one can take into account still other prohibitions, for example the fact that sequences coding paths cannot contain the combinations 321 and 123, etc.
IV. Other estimates. First let us estimate \(P^t_{r,\infty}(\theta)\). To this end we construct a circuit \(S^t_{r,\infty}\), which conducts from \(M\) to \(N\) if and only if at least one of the variables \(a_i^t, a_{i+1}^t,\ldots,a_{i+r-1}^t\) is equal to zero. This circuit can be obtained from the circuit \(S^{t+r-1}_{1,\infty}\) if from that circuit one cuts off all edges \(A_j^u\) for \(u>t\) together with the arrowed edges connecting them, and joins the upper ends of the edges \(A^t_j\) to the vertex \(N\). For this circuit we construct the dual circuit \(\overline S{}^t_{r,\infty}\) with the edge connecting \(T_1\) and \(T_2\) removed and, analogously to the preceding, estimate from above the probability \(h_{T_1T_2}(\theta)\) that the circuit \(\overline S{}^t_{r,\infty}\) conducts:
\[ h_{T_1T_2}(\theta)\leq \sum_{k=r}^{\infty}N^k\theta^k \leq \operatorname{const}\sum_{k=r}^{\infty}(C\theta)^k =\operatorname{const}\frac{(C\theta)^r}{1-C\theta}, \]
where \(C\) has the meaning defined in (7).
For any \(\theta<1/C\) there is an \(r\) such that the expression is less than one. Hence it can be inferred that for any \(\theta<1/C\) the probability \(P^t_{1,\infty}(\theta)\) also does not tend to one, if one uses the inequality
\[ 1-P^t_{r,\infty}(\theta)\leq r[1-P^t_{1,\infty}(\theta)]. \]
For the case of natural \(n\) it is easy to construct circuits \(S^t_{r,n}\), analogous to \(S^t_{r,\infty}\), and their duals \(\overline S{}^t_{r,n}\). We restrict ourselves to the case \(r=n\). It is easy to obtain the estimate, for \(\theta<1/8\):
\[ P^t_{n,n}(\theta)\leq t\sum_{k=0}^{\infty}(2\sqrt{2\theta})^{n+2k} = t\,\frac{(2\sqrt{2\theta})^n}{1-8\theta}. \]
Hence the mathematical expectation \(T_n(\theta)\) of the moment when for the first time all \(a_i^t=1\) is not less than \(\dfrac{1-8\theta}{2(2\sqrt{2\theta})^n}\).
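To get a feel for the size of this lower bound, it can be evaluated numerically. The sketch below reads the expression as \(1-8\theta\) divided by \(2(2\sqrt{2\theta})^n\), which is the grouping that makes the bound grow exponentially in \(n\) for \(\theta<1/8\), in agreement with (6). The function name is introduced here for illustration.

```python
import math


def hitting_time_lower_bound(n, theta):
    """Evaluate (1 - 8*theta) / (2 * (2*sqrt(2*theta))**n) for 0 < theta < 1/8."""
    if not 0 < theta < 1 / 8:
        raise ValueError("the bound is stated for 0 < theta < 1/8")
    base = 2 * math.sqrt(2 * theta)  # below 1 exactly when theta < 1/8
    return (1 - 8 * theta) / (2 * base ** n)
```

For example, at \(\theta = 1/32\) the base \(2\sqrt{2\theta}\) equals \(1/2\), so the bound is \(\tfrac{3}{8}\cdot 2^n\) and doubles with each additional neuron.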
Moscow State University
named after M. V. Lomonosov
Received
27 XII 1967
REFERENCES
¹ O. N. Stavskaya, I. I. Pyatetskii-Shapiro, Problems of Cybernetics, 20, 1968.
² M. A. Shnirman, Problems of Cybernetics, 20, 1968.
³ E. F. Moore, C. E. Shannon, Cybernetics Collection, vol. 1, IL, 1960.