UDC 061.3(47):62-506.1 CYBERNETICS AND CONTROL THEORY
V. A. YAKUBOVICH
ON THE THEORY OF ADAPTIVE SYSTEMS
(Presented by Academician V. I. Smirnov on 25 X 1967)
In accordance with accepted terminology, we shall call a system adaptive if the law of its functioning changes depending on the experience acquired. The system is supplied, in some form, with information about the “failure” or “success” of its behavior with respect to some target condition. Certain characteristics of the environment and of the system, and possibly also of the target condition, are unknown to the designer; they may be arbitrary within some class $\mathfrak M$. We shall call a system intelligent in the class $\mathfrak M$ if, for any target condition and any characteristics from this class, there comes a moment after which the target condition is always fulfilled. Below we give an exact, formalized statement of the simplest variant of the problem of constructing, for a given class $\mathfrak M$, a system that is intelligent in this class (the “simplest robot”), and also, under a number of assumptions, a solution of this precisely posed problem. The results are illustrated by two mathematically stylized examples of the simplest systems that are intelligent in the indicated, highly conditional sense. (For other formalizations and solutions of problems of constructing adaptive systems, see ($^1$).)
$1^\circ$. We shall assume that the time $t$ takes the values $t=0,1,2,\ldots$. Quantities that (generally speaking) change in time will be called variables, and quantities whose values are fixed for the given system (and, consequently, do not change in time) will be called parameters. A given set of elements $z$ will be denoted by $\{z\}$, and the value of the variable $z$ at the moment $t$ by $z_t$. We shall regard the sets $\{x\}$, $\{s\}$, $\{\sigma\}$, $\{u\}$ as given, and the set $\{\tau\}$ as to be determined (in accordance with the conditions formulated below). Their elements are named as follows: $x$ is the external coordinates of the robot, $s$ the environment, $\sigma$ the sensor, $u$ the control, $\tau$ the tactic. Let there be given a function $\mu(x,s)$ with values 0 or 1, called the activation signal of the target condition, and also a real-valued function $F(x,s)$. By the target condition (t.c.) we shall mean the condition: if $\mu_t=\mu(x_t,s_t)=1$, then $F(x_{t+1},s_{t+1})>0$. We shall say that the t.c. is fulfilled at the moment $t+1$ if either $\mu_t=1$ and $F(x_{t+1},s_{t+1})>0$, or $\mu_t=0$.
We shall regard as given: 1) the sensory equation $\sigma_t=\sigma(x_t,s_t)$ (which determines what the robot “sees”); 2) the motor equation $x_{t+1}=X(x_t,u_t)$ (which determines the motion of the robot); 3) the equation of change of the environment $s_{t+1}=S(x_t,s_t)$. The following “brain equations” of the robot are to be determined: 4) $u_t=u(\sigma_t,\tau_t)$; 5) $\tau_{t+1}=T(\sigma_t,\sigma_{t+1},\tau_t)$. For given $x_0$, $s_0$, $\tau_0$, equations 1)–5) make it possible to find successively the values of all the indicated variables at all moments of time. For each $t=1,2,\ldots$ the t.c. will then either be fulfilled or not. We shall assume that $s_0$, $x_0$, and also the functions $\mu$, $F$, $\sigma$, $X$, $S$ (but not $u$ and $T$) depend, generally speaking, on certain parameters $\xi=\|\xi_i\|$, called variable parameters, whose variation within certain prescribed limits ($\xi\in\mathfrak M$) creates a class of problems on fulfilling the t.c. If everything indicated above has been defined, we shall say that the simplest robot is given. We shall call the simplest robot intelligent in the class of problems $\mathfrak M$ if, for any values of the variable parameters $\xi\in\mathfrak M$, there is a moment $t_0$ such that for all $t\ge t_0$ the t.c. is fulfilled and $\tau_t=\mathrm{const}$. The brain equations must be chosen so that the robot becomes intelligent in the class of problems $\mathfrak M$.
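The way equations 1) to 5) determine all variables step by step can be sketched as a short simulation loop. This is only an illustrative sketch: the function name `run_simplest_robot`, its argument names, and the one-dimensional toy instance are assumptions introduced here and do not appear in the paper.

```python
def run_simplest_robot(x0, s0, tau0, sensor, motor, env, mu, F,
                       control, tactic_update, steps):
    """Iterate equations 1)-5) and record whether the t.c. holds at each t+1."""
    x, s, tau = x0, s0, tau0
    sigma = sensor(x, s)                              # 1) sigma_t = sigma(x_t, s_t)
    fulfilled = []
    for _ in range(steps):
        u = control(sigma, tau)                       # 4) u_t = u(sigma_t, tau_t)
        x_next = motor(x, u)                          # 2) x_{t+1} = X(x_t, u_t)
        s_next = env(x, s)                            # 3) s_{t+1} = S(x_t, s_t)
        sigma_next = sensor(x_next, s_next)
        tau = tactic_update(sigma, sigma_next, tau)   # 5) tau_{t+1} = T(...)
        # t.c. at t+1: either mu_t = 0, or mu_t = 1 and F(x_{t+1}, s_{t+1}) > 0
        fulfilled.append(mu(x, s) == 0 or F(x_next, s_next) > 0)
        x, s, sigma = x_next, s_next, sigma_next
    return fulfilled, tau


# Hypothetical toy instance on the real line: a fixed target at s, the robot
# sees the offset s - x and moves by tau times that offset.
toy = dict(
    sensor=lambda x, s: s - x,
    motor=lambda x, u: x + u,
    env=lambda x, s: s,
    mu=lambda x, s: 1 if abs(x - s) >= 0.5 else 0,
    F=lambda x, s: 0.5 - abs(x - s),
    control=lambda sigma, tau: tau * sigma,
    tactic_update=lambda sigma, sigma_next, tau: tau,  # tactic kept fixed here
)
history, final_tau = run_simplest_robot(0.0, 3.0, 1.0, steps=3, **toy)
```

In this toy case the tactic `tau = 1.0` already solves the problem, so the t.c. is fulfilled at every step; the interesting case, treated in the paper, is when the tactic must be found by the robot itself.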
In the two simple but typical examples of the simplest robots given below, secondary details are omitted (so as not to encumber the exposition), and the brain equations also remain unspecified. Below, in Sec. \(4^\circ\), it will be shown how to construct the brain equations of these two simplest robots so that they become intelligent in the indicated classes of problems.
\(2^\circ\). The “grasshopper” robot (K). The external coordinates of K are \(x=\|z,\varphi\|\), where \(z\) is a complex number (\(|z|\le L\)) determining the Cartesian coordinates of K, and the “course” angle \(\varphi\), \(0\le \varphi\le 2\pi\), determines the orientation of K. The environment \(s\) is identified with the complex number \(s\) (the coordinates of the target), \(|s|\le L\). The number \(L\) is a variable parameter. We shall call the coordinate system of K the system with center at the point \(z\), rotated through the angle \(\varphi\). K sees the target and a landmark at the origin of the fixed coordinate system. More precisely, the sensors \(\sigma=\|\xi_0,\psi_0,\zeta,\psi\|\) are the following quantities, connected with the coordinates of the landmark and of the target in the coordinate system of K: \(\xi_0=\delta\cdot(|z|+\nu)^{-1}\), \(\psi_0=\arg z-\varphi\), \(\zeta=\delta\cdot(|z-s|+\nu)^{-1}\), \(\psi=\arg(s-z)-\varphi\). Here \(\delta>0\), \(\nu>0\) are parameters. The motion of K is carried out as follows: K turns through the angle \(f_t\), then jumps a distance \(r_t\). Therefore the controls are \(u_t=\|f_t,r_t\|\), and the equations of motion have the form \(\varphi_{t+1}=\varphi_t+f_t,\ z_{t+1}=z'_{t+1}\), where \(z'_{t+1}=z_t+r_t\exp i(\varphi_t+f_t)\), provided only that \(|z'_{t+1}|\le L\). If \(|z'_{t+1}|>L\) (K “wants” to jump out of the circle \(|z|\le L\)), then \(z_{t+1}\) is determined from the conditions of “sticking” to the wall \(|z|=L\) or of “reflection” (according to some law) from this wall. The t.c. activation signal: \(\mu_t=1\) if \(|z_t-s_t|\ge \varepsilon\), and \(\mu_t=0\) if \(|z_t-s_t|<\varepsilon\). The t.c. consists in the requirement to catch the target at the next moment if it is not caught now: if \(\mu_t=1\), then \(|z_{t+1}-s_{t+1}|<\varepsilon\) (the number \(\varepsilon\) is a parameter, \(\varepsilon<L\)).
Thus, K must jump into the \(\varepsilon\)-neighborhood of the point where the target will be at the next moment. The target sees the landmark and K, and its displacement depends on where it sees them. Suppose that the target does not pay attention to the orientation of K: \(s_{t+1}=S(s_t,s_t-z_t,\xi)\). Here \(\xi\in\mathfrak M\) is a variable vector parameter. Since the target sees only the landmark, and not the coordinate system associated with it, the function \(S(s,w,\xi)\) must satisfy the condition \(S(e^{i\chi}s,e^{i\chi}w,\xi)=e^{i\chi}S(s,w,\xi)\), where \(\chi\) is any real number. In addition, we shall assume that the function \(S\) and its derivatives with respect to \(\operatorname{Re}s,\operatorname{Im}s,\operatorname{Re}w,\operatorname{Im}w\) are bounded for \(|s|\le L,\ \varepsilon\le |w|\le 2L\) uniformly in \(\xi\in\mathfrak M\).
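The sensory and motor equations of K translate almost directly into complex arithmetic. The following is a minimal sketch, not the author's implementation: the function names, the default values of \(\delta\) and \(\nu\), and in particular the wall law (radial projection onto \(|z|=L\) as one possible form of “sticking”) are assumptions, since the paper leaves the wall behavior unspecified. Angles are not reduced modulo \(2\pi\) here.

```python
import cmath
import math


def k_sensors(z, phi, s, delta=1.0, nu=0.1):
    """Sensor vector (xi_0, psi_0, zeta, psi) of K; delta, nu defaults are made up."""
    xi0 = delta / (abs(z) + nu)          # landmark "brightness"
    psi0 = cmath.phase(z) - phi          # psi_0 = arg z - phi, as in the text
    zeta = delta / (abs(z - s) + nu)     # target "brightness"
    psi = cmath.phase(s - z) - phi       # bearing of the target in K's frame
    return xi0, psi0, zeta, psi


def k_step(z, phi, f, r, L):
    """Motor equations of K: turn through f, then jump a distance r."""
    phi_next = phi + f
    z_trial = z + r * cmath.exp(1j * phi_next)
    if abs(z_trial) > L:
        # Assumed wall law ("sticking"): project radially back onto |z| = L.
        z_trial = L * z_trial / abs(z_trial)
    return z_trial, phi_next


def k_target_caught(z, s, eps):
    """True when the target lies in the eps-neighborhood of K."""
    return abs(z - s) < eps


# Usage: a quarter turn and a unit jump from the origin, then a jump into the wall.
z1, phi1 = k_step(0j, 0.0, math.pi / 2, 1.0, L=5.0)
z2, phi2 = k_step(4 + 0j, 0.0, 0.0, 3.0, L=5.0)
```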
\(3^\circ\). The “eye-hand” robot (EH). The external coordinates of EH are a pair of complex numbers \(z,z'\), connected by the relations \(|z|=l,\ |z'-z|=l'\), where \(l>0,\ l'>0\) are variable parameters. (The vector \(z\) is the “shoulder,” the vector \((z'-z)\) is the “forearm,” the point \(z'\) is the “end of the hand.”) The environment is identified with a pair of complex numbers \(s',s''\), where \(|s'-s''|=\delta,\ |s'|\le l+l'-\delta_0\). Here \(\delta_0\) is a parameter, \(\delta\) is a variable parameter. The numbers \(s',s''\) determine the ends of the segment that EH “sees.” Let the “eye” be located at the point \(a\); the complex number \(a\) is a parameter. EH “sees” the points \(z',s',s''\). More precisely, let the sensors be \(\sigma=\|\zeta,\eta,\zeta',\eta',\zeta'',\eta''\|\), where \(\zeta=\delta_1(|z'-a|+\nu)^{-1}\), \(\eta=\arg(z'-a)\); \(\zeta',\eta',\zeta'',\eta''\) are determined analogously from \(s',s''\), and \(\delta_1>0,\ \nu>0\) are parameters. Let \(\varphi\) be the angle formed by the shoulder with a fixed direction, for example, \(\varphi=\arg z\), and let \(\psi\) be the angle between the continuation of the shoulder and the forearm. The motion of EH is carried out by setting prescribed values of \(\varphi\) and \(\psi\). Consequently \(\varphi,\psi\) are the controls, and \(z_{t+1}=le^{i\varphi_t},\ z'_{t+1}=le^{i\varphi_t}+l'e^{i(\varphi_t+\psi_t)}\) are the motor equations. The t.c. activation signal: \(\mu_t=1\) if \(|z'_t-s'_t|\ge\varepsilon\), and \(\mu_t=0\) if \(|z'_t-s'_t|<\varepsilon\). The t.c.: if \(\mu_t=1\), then \(|z'_{t+1}-s'_{t+1}|<\varepsilon\). Thus, the robot EH must make the end \(z'\) of the “hand” follow the point \(s'\), anticipating its position at the next moment. We shall assume that the target \(s_t=\|s_t',s_t''\|\) “sees” only the end of the hand \(z_t'\), i.e., the equation of change of the environment has the form \(s_{t+1}=S(s_t,z_t',\xi)\), where \(S\) has derivatives with respect to \(\operatorname{Re}s_t'\), \(\operatorname{Im}s_t'\), \(\ldots\), \(\operatorname{Im}z_t'\), and, together with these derivatives, is uniformly bounded in \(\xi\in\mathfrak{M}\) when \(s_t'\), \(s_t''\), \(z_t'\) vary within the limits indicated above.
\(4^\circ\). Returning to the general case, let us formulate assumptions under which the posed problem will be solved. We shall denote by \(R_n\) the Euclidean space of dimension \(n\). Suppose that \(\{\sigma\}\) is a compact set in some \(R_n\), \(\sigma=\|\sigma_j\|_{j=1}^n\). We shall regard the following four conditions as fulfilled:
(I). It is possible to introduce a new control \(v\), where \(\{v\}\) is a bounded subset of some \(R_q\), \(v=\|v_j\|_{j=1}^q\), such that \(u=u(v)\) is a single-valued function and the t.c. at the moment \(t+1\) is certainly fulfilled if the \(k\) inequalities
\[ |(c_j,v_t)-\varphi_t^j|<\varepsilon_j,\quad j=1,\ldots,k, \tag{1} \]
where \(\varepsilon_j\) are parameters, \(c_1,\ldots,c_k\) are linearly independent known vectors from \(R_q\), and \(\varphi_t^j=\varphi_j(x_t,x_{t+1},s_t,s_{t+1},\xi)\) are certain functions of the indicated arguments.
(II). There exists a function \(v=V^i(\sigma,\xi)\), called the ideal control, such that for any \(x_t,s_t\) and \(\xi\in\mathfrak{M}\), when \(v_t=V^i(\sigma_t,\xi)\), inequalities (1) are fulfilled with \(\varepsilon_j\) replaced by some \(\varepsilon_j^*<\varepsilon_j\). Here \(\sigma_t,x_{t+1},s_{t+1}\) in (1) are determined by the natural chain of relations
\[
\sigma_t=\sigma(x_t,s_t,\xi),\quad u_t=u(v_t),\quad x_{t+1}=X(x_t,u_t,\xi),\quad s_{t+1}=S(x_t,s_t,\xi).
\]
(III). Whatever the control \(v_t\), the value of \(\varphi_t^j\) can be expressed through \(v_t,\sigma_t,\sigma_{t+1}\), i.e. \(\varphi_t^j=\Phi_j(v_t,\sigma_t,\sigma_{t+1})\), where \(\Phi_j\) are certain functions.
(IV). For all \(\xi\in\mathfrak{M}\), \(\sigma\in\{\sigma\}\), there exist \(\partial V^i/\partial\sigma_j\) and \(|V^i(\sigma,\xi)|\leq\mathrm{const}\), \(|\partial V^i/\partial\sigma_j|\leq\mathrm{const}\).*
It is easy to show that robot K satisfies conditions (I)–(IV). The new controls \(X_t,Y_t\) are introduced by the formula \(X_t+iY_t=r_t\exp(i f_t)\), and inequalities (1) have the form
\[
|X_t-\operatorname{Re}\tilde{s}_{t+1}|<\varepsilon/\sqrt{2},\quad
|Y_t-\operatorname{Im}\tilde{s}_{t+1}|<\varepsilon/\sqrt{2},
\]
where \(\tilde{s}_t=(s_t-z_t)\exp(-i\varphi_t)\). Robot EH satisfies conditions (I)–(IV) if the parameters \(a,\delta,\nu,l,l'\) are not variable. The new controls \(v_t,w_t\) are introduced by the formula \(v_t+iw_t=z_{t+1}'\). Inequalities (1) take the form
\[
|v_t-\operatorname{Re}s_{t+1}'|<\varepsilon/\sqrt{2},\quad
|w_t-\operatorname{Im}s_{t+1}'|<\varepsilon/\sqrt{2}.
\]
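Condition (I) asks for a single-valued map \(u=u(v)\) from the new controls back to the original ones. For the two examples this map can be written out explicitly; the sketch below is one possible choice, with hypothetical function names. For the eye-hand robot it selects the branch \(0\le\psi\le\pi\) of the two-link geometry, which is an assumption made here to keep the map single-valued, and it rejects points that the hand cannot reach.

```python
import cmath
import math


def k_controls(X, Y):
    """u(v) for the grasshopper: recover (f, r) from X + iY = r * exp(i f)."""
    w = complex(X, Y)
    return cmath.phase(w), abs(w)


def eh_controls(v, w, l, lp):
    """u(v) for the eye-hand robot: angles (phi, psi) such that
    l*exp(i phi) + lp*exp(i (phi + psi)) = v + i w, with 0 <= psi <= pi."""
    p = complex(v, w)
    d = abs(p)
    if not abs(l - lp) <= d <= l + lp:
        raise ValueError("requested hand position is not reachable")
    # |z'|^2 = l^2 + lp^2 + 2 l lp cos(psi)
    cos_psi = (d * d - l * l - lp * lp) / (2 * l * lp)
    psi = math.acos(max(-1.0, min(1.0, cos_psi)))
    # arg z' = phi + arg(l + lp * exp(i psi))
    phi = cmath.phase(p) - math.atan2(lp * math.sin(psi), l + lp * math.cos(psi))
    return phi, psi


def eh_hand(phi, psi, l, lp):
    """Motor equation of the eye-hand robot: position z' of the end of the hand."""
    return l * cmath.exp(1j * phi) + lp * cmath.exp(1j * (phi + psi))


# Usage: ask the hand to go to 1 + i with unit shoulder and forearm.
phi, psi = eh_controls(1.0, 1.0, l=1.0, lp=1.0)
hand = eh_hand(phi, psi, 1.0, 1.0)
```

The bounds \(\varepsilon/\sqrt{2}\) on the real and imaginary parts guarantee a distance below \(\varepsilon\), because the two componentwise errors combine as \(\sqrt{\varepsilon^2/2+\varepsilon^2/2}=\varepsilon\).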
In the considerably more interesting case when \(a,\delta,\nu,l,l'\) are variable parameters, the brain equations of the intelligent EH can also be constructed according to the scheme set forth below, but with a number of complications. Let us return to the general case.
Theorem 1. If conditions (I)–(IV) are fulfilled, the brain equations can be constructed so that the resulting simplest robot is intelligent in the class of problems \(\mathfrak{M}\).
Proof. Let \(k=1\) and \(c_1=\|\gamma^h\|_{h=1}^q\). Choose \(\rho>0\) so that
\[
\varepsilon_1^{**}=\varepsilon_1^*+\rho\left(|\gamma^1|+\cdots+|\gamma^q|\right)<\varepsilon_1.
\]
Using (IV), we obtain that there exists a certain number \(N\) of real continuous functions \(v_j(\sigma)\) such that, for any \(\sigma\in\{\sigma\}\), \(\xi\in\mathfrak{M}\), \(h=1,\ldots,q\), the following is fulfilled:
\[
\left|V_h^i(\sigma,\xi)-\sum_{j=1}^N \tau^{jh}(\xi)v_j(\sigma)\right|<\rho,
\]
where \(\tau^{jh}(\xi)\) are certain numbers. Take
\[
\{\tau\}=R_{qN+1},\quad \tau=\|\tau^{jh},\chi\|,\quad j=1,\ldots,N;\ h=1,\ldots,q,
\]
and define the first brain equation by the relations \(u_t=u(v_t)\), \(v_t^h=\tau_t^{1h}v_1(\sigma_t)+\ldots+\tau_t^{Nh}v_N(\sigma_t)\). Inequality (1) for \(\tau^{jh}=\tau_t^{jh}\) is rewritten in the form
\[ \left|\sum_{j=1}^{N}\sum_{h=1}^{q}\tau^{jh}\gamma^h v_j(\sigma_t)-\varphi_t^1\right|<\varepsilon_1 . \tag{2} \]
* Let us explain these assumptions. Condition (I) means that, first, the t.c. requires that “something differ from something by sufficiently little,” and, second, that this “something” depend linearly on the new controls. Condition (II), roughly speaking, is equivalent to the fundamental possibility of solving the problem. Condition (III) requires that the error at the moment \(t\) be measurable from the data at the moments \(t\) and \(t+1\). Condition (IV) is practically nonrestrictive.
We have arrived at the situation considered in (\(^2\)). For any \(x_0,s_0,\tau_0\), \(\xi\in\mathfrak M\), and any equation \(\tau_{t+1}=T(\sigma_t,\sigma_{t+1},\tau_t)\), the values of all variables are determined successively for all \(t\). This yields an infinite sequence of inequalities (2) for \(\tau^{jh}\). However, for a certain choice of the function \(T\) (a “finitely convergent algorithm” (\(^2\))) there is a \(t_0\) such that, for \(\tau^{jh}=\tau_t^{jh}\), all inequalities (2) with \(t\ge t_0\) are satisfied and, moreover, \(\tau_t^{jh}=\mathrm{const}\) for \(t\ge t_0\). Then, according to (1), for \(t\ge t_0\) the t.c. will be fulfilled, i.e., the robot will be intelligent in the class of problems \(\mathfrak M\).
Let us show that, in order to determine the required function \(T\), one may use Theorem 5 of (\(^2\)). The coefficients of \(\tau^{jh}\) in inequalities (2) are bounded uniformly with respect to \(\sigma_t\in\{\sigma\}\). Using (II), and taking into account the relation for \(\rho\), we obtain that for any \(\xi\in\mathfrak M\), \(s_t,x_t,\sigma_t=\sigma(x_t,s_t)\), upon substituting \(\tau^{jh}=\tau^{jh}(\xi)\), the left-hand side of inequality (2) does not exceed the value \(\varepsilon_1^{**}<\varepsilon_1\). Thus, the assumption of (\(^2\)) on the existence of a solution of the “strengthened” system of inequalities (2) is fulfilled, and Theorem 5 of (\(^2\)) may be applied. Take as \(x_t\) and \(k(t)\) from (\(^2\)) \(x_t=\|\tau_t^{jh}\|\), \(k(t)=x_t\). Theorem 5 of (\(^2\)) gives the value \(\tau_{t+1}\). The expression for \(\tau_{t+1}\) supplied by Theorem 5 of (\(^2\)) depends on \(\sigma_t\), and also on the value \(\eta_t\), equal to the expression under the modulus sign in (2). We also have \(\eta_t=(c_1,v_t)-\varphi_t^1\).
According to (III), \(\eta_t\), and consequently also \(\tau_{t+1}\), is expressed through \(\sigma_t,\sigma_{t+1},\tau_t\), as was to be proved. The case \(k>1\) is considered analogously. Instead of (2), there are (for each \(t\)) \(k\) inequalities, and Theorem 5 of (\(^2\)) should be applied successively to these \(k\) inequalities.*
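The bound \(\varepsilon_1^{**}\) used in the proof can be traced in one line. The following is a sketch for \(k=1\), under the added assumption (not spelled out in the text) that \(\varphi_t^1\) takes the same value for the ideal control and for its approximation. Writing \(\hat v^h=\sum_{j=1}^N\tau^{jh}(\xi)v_j(\sigma_t)\) and \(\hat v=\|\hat v^h\|\), condition (II) and the approximation inequality for \(V_h^i\) give
\[
\left|(c_1,\hat v)-\varphi_t^1\right|
\le \left|(c_1,V^i(\sigma_t,\xi))-\varphi_t^1\right|
+\sum_{h=1}^{q}|\gamma^h|\,\left|\hat v^h-V_h^i(\sigma_t,\xi)\right|
<\varepsilon_1^*+\rho\sum_{h=1}^{q}|\gamma^h|=\varepsilon_1^{**}.
\]
So the fixed coefficients \(\tau^{jh}(\xi)\) satisfy every inequality (2) with the smaller right-hand side \(\varepsilon_1^{**}<\varepsilon_1\), which is the “strengthened” system referred to above.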
\(5^\circ\). In the simpler situation when \(k=1\), \(\varepsilon_1^*<\varepsilon_1/2\), one may apply Theorem 3 of (\(^2\)) (the number \(\rho\) must be chosen so that \(\varepsilon_1^{**}<\varepsilon_1/2\)). In this case the brain equations have a sufficiently simple form.
Theorem 2. Suppose that conditions (I)–(IV) are satisfied, \(k=1\), and in (II) \(\varepsilon_1^*<\varepsilon_1/2\). Suppose that the number \(N\) and the \(N\) functions \(v_j(\sigma)\) are determined as indicated above. Put \(\{\tau\}=R_{qN}\), \(\tau=\|\tau^{jh}\|\). Define the first brain equation by the relations \(u_t=u(v_t)\), \(v_t=\|v_t^h\|\), \(v_t^h=\tau_t^{1h}v_1(\sigma_t)+\ldots+\tau_t^{Nh}v_N(\sigma_t)\), and the second by the relations \(\tau_{t+1}^{jh}=\tau_t^{jh}\) if \(|\eta_t|<\varepsilon_1\), and \(\tau_{t+1}^{jh}=\tau_t^{jh}-\eta_t \xi_t^{-1}\gamma^h v_j(\sigma_t)\) if \(|\eta_t|\ge \varepsilon_1\), where \(\eta_t=(c_1,v_t)-\Phi_1(v_t,\sigma_t,\sigma_{t+1})\), \(\xi_t=(c_1,c_1)[v_1(\sigma_t)^2+\ldots+v_N(\sigma_t)^2]\). Then the simplest robot will be intelligent in the class of problems \(\mathfrak M\).
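The two brain equations of Theorem 2 can be tried out on a small synthetic problem. The sketch below is illustrative only: the function names, the choice \(q=1\), \(N=2\) with basis functions \(1\) and \(\sigma\), and the stand-in \(2\sigma+0.5\) for \(\Phi_1\) are assumptions made here. The guard for \(\xi_t=0\) is likewise an addition, since the theorem does not discuss that case.

```python
def control_from_tactic(tau, basis_vals):
    """First brain equation: v^h = sum_j tau[j][h] * v_j(sigma)."""
    n, q = len(tau), len(tau[0])
    return [sum(tau[j][h] * basis_vals[j] for j in range(n)) for h in range(q)]


def update_tactic(tau, basis_vals, c1, eta, eps1):
    """Second brain equation: keep tau if |eta| < eps1, otherwise correct it."""
    if abs(eta) < eps1:
        return tau
    xi = sum(g * g for g in c1) * sum(b * b for b in basis_vals)
    if xi == 0.0:
        return tau  # assumed guard: no correction direction is available
    n, q = len(tau), len(tau[0])
    return [[tau[j][h] - eta / xi * c1[h] * basis_vals[j] for h in range(q)]
            for j in range(n)]


def run_demo(steps=5000):
    """Toy run: the sensor value cycles through a few points, and the quantity
    to be matched is 2*sigma + 0.5 (a hypothetical stand-in for Phi_1)."""
    c1, eps1 = [1.0], 0.1
    sigmas = [-1.0, -0.5, 0.0, 0.5, 1.0]
    tau = [[0.0], [0.0]]                    # tau[j][h], N = 2, q = 1
    corrections = 0
    for t in range(steps):
        sigma = sigmas[t % len(sigmas)]
        basis_vals = [1.0, sigma]           # v_1(sigma) = 1, v_2(sigma) = sigma
        v = control_from_tactic(tau, basis_vals)
        phi = 2.0 * sigma + 0.5
        eta = sum(c * vh for c, vh in zip(c1, v)) - phi
        new_tau = update_tactic(tau, basis_vals, c1, eta, eps1)
        if new_tau is not tau:
            corrections += 1
        tau = new_tau
    return tau, corrections, sigmas


final_tau, n_corrections, demo_sigmas = run_demo()
```

Each correction is an orthogonal projection of \(\tau\) onto the hyperplane where \(\eta_t=0\). In this toy problem an exact solution \(\tau^*=(0.5,\,2)\) exists, so every correction reduces \(|\tau-\tau^*|^2\) by \(\eta_t^2/\xi_t\ge 0.1^2/2\), and at most \(4.25/0.005=850\) corrections can occur before the tactic stops changing.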
Leningrad State University named after A. A. Zhdanov
Received 18 X 1967
CITED LITERATURE
1. Ya. Z. Tsypkin, Avtomatika i telemekhanika, 27, No. 1 (1966).
2. V. A. Yakubovich, DAN, 166, No. 6 (1966).**
3. V. A. Yakubovich, in: Self-adjusting Systems, “Nauka,” 1967.
* The proof of Theorem 5 of (\(^2\)) is given in (\(^3\)). (In (\(^3\)) it is assumed that the infinite system of inequalities (2) is given in advance, but the proof in (\(^3\)) remains valid also in the case under consideration.)
** Correction. The following corrections must be made to paper (\(^2\)). On p. 1308, line 2, \(V(x)\) is printed; it should read \(V(x)\ge 0\). On p. 1311, line 15, \(B_i\) is printed; it should read \(R_i\). On p. 1311, line 17, \(|(x,c_j)+\gamma_j|>\varepsilon\) is printed; it should read \(|(x,c_j)+\gamma_j|\le\varepsilon\). On p. 1311, line 20, \(\rho_{k(j)}\) is printed; it should read \(\rho_{k(j)}\operatorname{sign}\eta_j\). On p. 1311, line 31, “...zition ... will be for” is printed; it should read “...zition of the indicated algorithms \(x_{j+1}=x_j\), if \(\varphi(x_j,a_j)>0\); \(x_{j+1}=f_j[x_j,c_j(x_j,a_j),\gamma_j(x_j,a_j)]\), if \(\varphi(x_j,a_j)\le 0\), will be for.”