MATHEMATICS
M. ARATÓ
Submitted 1962-01-01 | RussiaRxiv: ru-196201.71148 | Translated from Russian

ESTIMATION OF THE PARAMETERS OF A STATIONARY GAUSSIAN MARKOV PROCESS

(Presented by Academician A. N. Kolmogorov on 23 I 1962)

§ 1. A stationary Gaussian Markov process with continuous time is determined by three parameters \(\mu, \sigma^2, \lambda\):

\[ \mu = M\xi(t), \qquad \sigma^2 = M|\xi(t)-\mu|^2, \]

\[ M(\xi(t+\tau)-\mu)(\xi(t)-\mu)=\sigma^2 \exp(-\lambda|\tau|). \]

We shall consider the problem of estimating these parameters from a realization of the process on the interval \(0 \leqslant t \leqslant T\). Naturally, the selection of the three parameters can be made in different ways. It is essential, however, that the parameter \(a=2\lambda\sigma^2\), which has the meaning of a “diffusion coefficient,” \(M(d\xi)^2=a\,dt\), is determined exactly from a single realization (see, for example, \((^1)\)), so that in essence our problem consists in estimating two parameters. The case of known \(\mu\) was considered in \((^2)\).

Introduce the statistics

\[ m_1=\frac12[\xi(0)+\xi(T)], \qquad m_2=\frac1T\int_0^T \xi(t)\,dt, \qquad m=\frac{m_1+\varkappa m_2}{1+\varkappa}, \qquad \varkappa=\lambda T, \]

\[ s_{01}^2=\frac12\{[\xi(0)-\mu]^2+[\xi(T)-\mu]^2\}, \]

\[ s_1^2=\frac12\{[\xi(0)-m_1]^2+[\xi(T)-m_1]^2\} =\frac14[\xi(T)-\xi(0)]^2, \]

\[ s_{02}^2=\frac1T\int_0^T[\xi(t)-\mu]^2\,dt, \]

\[ s_2^2=\frac1T\int_0^T[\xi(t)-m_2]^2\,dt. \]
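For a path that is recorded only at finitely many time points, the statistics above can be approximated numerically. The following is a minimal sketch, assuming a sampled path and replacing each integral by the trapezoidal rule; the function names `time_average` and `path_statistics` are hypothetical and are not part of the original text.

```python
def time_average(t, values):
    """Trapezoidal approximation of (1/T) * integral of values over [t[0], t[-1]]."""
    total = 0.0
    for k in range(len(t) - 1):
        total += 0.5 * (values[k] + values[k + 1]) * (t[k + 1] - t[k])
    return total / (t[-1] - t[0])


def path_statistics(t, xi, mu=None):
    """Approximate m1, m2, s1^2, s2^2 (and s01^2, s02^2 when mu is given).

    t  : increasing sample times covering the observation interval
    xi : values of the path at those times
    mu : known mean, needed only for s01^2 and s02^2
    """
    m1 = 0.5 * (xi[0] + xi[-1])
    m2 = time_average(t, xi)
    stats = {
        "m1": m1,
        "m2": m2,
        # s1^2 reduces to a quarter of the squared end-point difference
        "s1_sq": 0.25 * (xi[-1] - xi[0]) ** 2,
        "s2_sq": time_average(t, [(x - m2) ** 2 for x in xi]),
    }
    if mu is not None:
        stats["s01_sq"] = 0.5 * ((xi[0] - mu) ** 2 + (xi[-1] - mu) ** 2)
        stats["s02_sq"] = time_average(t, [(x - mu) ** 2 for x in xi])
    return stats
```

The discretization error of the trapezoidal rule is an artifact of the sketch; the statistics in the text are defined for the continuous realization.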

We shall see that the four statistics \(m_1, m_2, s_1^2, s_2^2\) form a natural sufficient set of statistics for our problem. In the case of a known value of the parameter \(\mu\), such a natural set is the pair of statistics \(s_{01}^2, s_{02}^2\). Finally, in the case of known \(\sigma^2\), a sufficient statistic is the weighted mean \(m\). This case is elementary and well known: the ratio

\[ \frac{m-\mu}{\sigma_1}, \qquad \text{where } \sigma_1^2=\frac{\sigma^2}{2(1+\varkappa)^2}(4\varkappa+1+e^{-\varkappa}), \tag{1} \]

has the \((0,1)\)-normal distribution.
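Formula (1) can be checked numerically by assembling the variance of the weighted mean from the variances and the covariance of the two means, each of which follows by integrating the covariance function of the process. The sketch below is only an illustration; the function names are hypothetical, and the second function assumes that the parameter `kappa` is strictly positive.

```python
import math


def sigma1_sq(sigma_sq, kappa):
    """Variance of the weighted mean m according to formula (1)."""
    return sigma_sq * (4.0 * kappa + 1.0 + math.exp(-kappa)) / (2.0 * (1.0 + kappa) ** 2)


def var_m_from_covariances(sigma_sq, kappa):
    """Variance of m = (m1 + kappa*m2) / (1 + kappa), built from the covariance
    sigma^2 * exp(-lambda*|tau|) with kappa = lambda*T > 0."""
    e = math.exp(-kappa)
    var_m1 = 0.5 * sigma_sq * (1.0 + e)                        # Var of the end-point mean
    cov_m1_m2 = sigma_sq * (1.0 - e) / kappa                   # Cov of the two means
    var_m2 = 2.0 * sigma_sq * (kappa - 1.0 + e) / kappa ** 2   # Var of the time average
    total = var_m1 + 2.0 * kappa * cov_m1_m2 + kappa ** 2 * var_m2
    return total / (1.0 + kappa) ** 2


if __name__ == "__main__":
    for kappa in (0.1, 1.0, 10.0):
        print(kappa, sigma1_sq(1.0, kappa), var_m_from_covariances(1.0, kappa))
```

The two functions should agree up to rounding for every positive value of `kappa`.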

The transformation

\[ t=t'T, \qquad \xi=\xi'\sqrt{Ta} \]

reduces the general problem to the case

\[ T=1,\quad a=1. \tag{2} \]

Here \(\lambda'=\lambda T=\varkappa\), i.e., up to the choice of scales, the distribution of the realizations of the process that interest us is characterized, for a known ... by the known \(\mu\) and the single parameter \(\varkappa\). In §§ 2 and 3 we assume that the reduction to case (2) has already been made and write \(\varkappa\) instead of \(\lambda\). In this case \(\sigma^2 = 1/(2\varkappa)\).
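The reduction to case (2) can be illustrated on a sampled path. The sketch below is a hedged example: the function names are hypothetical, and it only restates the change of scales given above together with the resulting value of the normalized variance.

```python
import math


def normalize_path(t, xi, a):
    """Apply t = t'*T, xi = xi'*sqrt(T*a) and return (t', xi') on [0, 1]."""
    T = t[-1] - t[0]
    scale = math.sqrt(T * a)
    t_norm = [(s - t[0]) / T for s in t]
    xi_norm = [x / scale for x in xi]
    return t_norm, xi_norm


def normalized_parameters(lam, sigma_sq, T):
    """Return the normalized decay parameter lambda*T and the normalized variance."""
    a = 2.0 * lam * sigma_sq            # "diffusion coefficient" of section 1
    kappa = lam * T
    sigma_sq_norm = sigma_sq / (T * a)  # variance of xi' = xi / sqrt(T*a)
    return kappa, sigma_sq_norm
```

For example, `normalized_parameters(0.5, 3.0, 4.0)` gives the pair `(2.0, 0.25)`, in agreement with the relation between the normalized variance and the normalized decay parameter stated above.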

§ 2. The sample space \(R_\xi\) of realizations \(\xi(t)\), \(0 \leqslant t \leqslant 1\), may be regarded as the product of the real line of values \(\xi(0)\) and the sample space of the process \(\eta(t)=\xi(t)-\xi(0)\).

Introduce in the space \(R_\xi\) the measure \(V=L\times W\), where \(L\) is ordinary Lebesgue measure on the line, and \(W\) is the well-known conditional Wiener measure \(\left({}^{1}\right)\). Then \(\left({}^{2}\right)\) the distribution \(P\) of the process \(\xi\) in the space \(R_\xi\) is absolutely continuous with respect to \(V\) and is given by the density

\[ \frac{dP}{dV} = \sqrt{\frac{\varkappa}{\pi}} \exp\left\{ -\varkappa\left[ s_{01}^{2}-\frac{1}{2}+\frac{1}{2}\varkappa s_{02}^{2} \right] \right\}. \tag{3} \]

Formula (3) also shows that, in the case of known \(\mu\), the statistics \(s_{01}^{2}\) and \(s_{02}^{2}\) form a sufficient set of statistics. Since

\[ \log \frac{dP}{dV} = C+\frac{1}{2}\log \varkappa - \left(s_{01}^{2}-\frac{1}{2}\right)\varkappa - \frac{1}{2}\varkappa^{2}s_{02}^{2}, \]

the maximum-likelihood equation has the form

\[ \frac{1}{2\varkappa}-\left(s_{01}^{2}-\frac{1}{2}\right)-\varkappa s_{02}^{2}=0, \]

i.e.

\[ \sigma^{4}-2d\sigma^{2}-\frac{1}{2}s_{02}^{2}=0, \qquad d=\frac{1}{2}s_{01}^{2}-\frac{1}{4}. \tag{4} \]

It is easy to verify that equation (4) always has the unique positive solution

\[ \hat{\sigma}^{2}=d+\sqrt{d^{2}+s_{02}^{2}\cdot\frac{1}{2}}. \tag{5} \]
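As an illustration, formula (5) can be evaluated directly and substituted back into the maximum-likelihood equation. The sketch below works in the normalized units of case (2); the function names are hypothetical, the normalized parameter of formula (3) is called `kappa` in the code, and the second statistic is assumed to be strictly positive.

```python
import math


def sigma_sq_mle(s01_sq, s02_sq):
    """Positive root of equation (4), i.e. formula (5), in normalized units."""
    d = 0.5 * s01_sq - 0.25
    return d + math.sqrt(d * d + 0.5 * s02_sq)


def likelihood_equation(kappa, s01_sq, s02_sq):
    """Left-hand side of the maximum-likelihood equation preceding (4)."""
    return 1.0 / (2.0 * kappa) - (s01_sq - 0.5) - kappa * s02_sq


if __name__ == "__main__":
    s01_sq, s02_sq = 0.3, 0.2                 # illustrative values, not data from the paper
    sigma_hat_sq = sigma_sq_mle(s01_sq, s02_sq)
    kappa_hat = 1.0 / (2.0 * sigma_hat_sq)    # normalized relation between the two parameters
    print(sigma_hat_sq, likelihood_equation(kappa_hat, s01_sq, s02_sq))
```

The residual printed in the second position should vanish up to rounding, which confirms that (5) solves the likelihood equation for the chosen inputs.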

It is not hard to show that, as \(\varkappa\to\infty\), the estimate \(\hat{\sigma}^{2}\) is asymptotically normal and equivalent to the estimate \(s_{02}^{2}\).

Theorem 1. For known \(\mu\), as \(\varkappa\to\infty\), the estimate

\[ \sigma^{2}\sim s_{02}^{2} \]

is asymptotically efficient, and the distribution of the ratio

\[ \frac{s_{02}^{2}-\sigma^{2}}{s_{02}^{2}/\sqrt{\varkappa/2}} \tag{6} \]

tends to the \((0,1)\)-normal distribution.

As \(\varkappa\to 0\), the statistics \(s_{01}^{2}\) and \(s_{02}^{2}\) are asymptotically equivalent, so that one may confine oneself to either of them. The distribution of the ratio

\[ \frac{s_{02}^{2}}{\sigma^{2}} \]

as \(\varkappa\to 0\) tends to the \(\chi^{2}\) distribution with one degree of freedom:

\[ \mathbf{P}\left\{ \frac{s_{02}^{2}}{\sigma^{2}}<t^{2} \right\} \to \sqrt{\frac{2}{\pi}} \int_{0}^{t} e^{-u^{2}/2}\,du. \tag{7} \]
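The right-hand side of (7) is the distribution function of the absolute value of a standard normal variable, so it can be expressed through the error function. The sketch below is only a numerical illustration with hypothetical function names; it compares the closed form with a direct midpoint-rule evaluation of the integral.

```python
import math


def limit_cdf(t):
    """Right-hand side of (7), written through the error function."""
    return math.erf(t / math.sqrt(2.0))


def limit_cdf_by_quadrature(t, n=2000):
    """Midpoint-rule evaluation of sqrt(2/pi) * integral_0^t exp(-u^2/2) du."""
    h = t / n
    total = sum(math.exp(-0.5 * ((k + 0.5) * h) ** 2) for k in range(n))
    return math.sqrt(2.0 / math.pi) * h * total
```

With this closed form, quantiles of the limiting distribution can be read off from standard normal tables.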

For intermediate values of \(\varkappa\), the question of how to use the statistics \(s_{01}^{2}\) and \(s_{02}^{2}\) is more complicated; it is not considered in detail in \(\left({}^{2}\right)\). It seems plausible that the single statistic \(s_{02}^{2}\) can be used without a very great loss of information. The following hypothesis seems probable to us: for any \(\alpha\), \(0<\alpha<1\), and any \(\sigma^{2}>0\), the equation \(\mathbf{P}(\hat{\sigma}^{2}>y\mid\sigma^{2})=\alpha\) has a unique solution \(y=\varphi_{\alpha}(\sigma^{2})\). If so, then the inverse function \(\sigma^{2}=\varphi_{\alpha}^{-1}(y)\) is also uniquely determined and provides the confidence bound \(\sigma_\alpha^2=\varphi_\alpha^{-1}(\hat\sigma^{2})\), satisfying the condition

$$ \mathbf P\{\sigma^2 \leqslant \sigma_\alpha^2 \mid \sigma^2\}=\alpha \qquad \text{for any } \sigma^2 . \tag{8} $$

As $\varkappa\to\infty$ and as $\varkappa\to 0$, this confidence bound passes into the confidence bound provided by the asymptotic approach.

§ 3. Formula (3) is easily transformed as follows:

$$ \frac{dP}{dV} = \sqrt{\frac{\varkappa}{\pi}}\, \exp\{-\varkappa[s_1^2-\tfrac12+\tfrac12 \varkappa s_2^2+(\mu-m_1)^2+\tfrac12 \varkappa(\mu-m_2)^2]\}. \tag{9} $$

Formula (9) shows that the set of four statistics $m_1$, $m_2$, $s_1^2$, $s_2^2$ is a sufficient set. The solution of the maximum likelihood equations is more complicated here. We note only that the maximum likelihood estimates are connected by the relation

$$ \hat\mu=\frac{2m_1+\hat\varkappa m_2}{2+\hat\varkappa}. \tag{10} $$
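Relation (10) follows from minimizing, for a fixed value of the decay parameter, the part of the exponent in (9) that depends on the mean. The sketch below checks this numerically; the function names are hypothetical and the decay parameter is called `kappa` in the code.

```python
def mu_hat(m1, m2, kappa):
    """Relation (10): the estimate of the mean for a given value of kappa."""
    return (2.0 * m1 + kappa * m2) / (2.0 + kappa)


def mu_part_of_exponent(mu, m1, m2, kappa):
    """Terms inside the square brackets of formula (9) that depend on mu."""
    return (mu - m1) ** 2 + 0.5 * kappa * (mu - m2) ** 2


if __name__ == "__main__":
    m1, m2, kappa = 1.0, 3.0, 2.0          # illustrative values only
    best = mu_hat(m1, m2, kappa)
    for mu in (best - 0.01, best, best + 0.01):
        print(mu, mu_part_of_exponent(mu, m1, m2, kappa))
```

The printed value is smallest at the middle point, since the expression is a convex quadratic in the mean with its minimum at the value given by (10).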

In estimating $\sigma^2$ (or $\varkappa$), it is natural to restrict oneself to statistics that do not depend on the choice of the origin on the $\xi$ axis. A sufficient set of such statistics can be composed, for example, of the three statistics $s_1^2$, $s_2^2$, $(m_1-m_2)^2$. The case of large $\varkappa$ presents no difficulty, since the following holds:

Theorem 2. As $\varkappa\to\infty$, the estimates

$$ \mu \sim m_2,\qquad \sigma^2 \sim s_2^2 $$

are jointly asymptotically efficient, and the distribution of the ratios

$$ \frac{m_2-\mu}{2s_2^2},\qquad \frac{s_2^2-\sigma^2}{s_2^2\sqrt{2/\varkappa}} \tag{11} $$

tends to the $\left(0,0;\begin{pmatrix}1&0\\0&1\end{pmatrix}\right)$-normal distribution.

As $\varkappa\to 0$, the statistics $m_1$ and $m_2$ are asymptotically equivalent, while the pair of statistics $s_1^2$ and $s_2^2$ is asymptotically independent of the parameters and of the statistics $m_1$ and $m_2$. This leads to almost complete indeterminacy of the parameter $\mu$ and to the impossibility of estimating the parameter $\varkappa$ from below (i.e., the parameter $\sigma^2$ from above).

Theorem 3. Let $\alpha>1/2$ and let $\underline{\mu}(\xi)$, $\overline{\mu}(\xi)$ be functionals on the space $R_\xi$, real-valued or equal to $-\infty$ or $+\infty$, continuous in the metric $C_{[0,1]}$*, and satisfying for all $\mu$ and $\varkappa$ the conditions

$$ \mathbf P\{\mu\geqslant \underline{\mu}(\xi)\}\geqslant \alpha,\qquad \mathbf P\{\mu\leqslant \overline{\mu}(\xi)\}\geqslant \alpha . $$

Then

$$ \mathbf P\{\underline{\mu}(\xi)=-\infty\}\geqslant f(\varkappa,\alpha),\qquad \mathbf P\{\overline{\mu}(\xi)=+\infty\}\geqslant f(\varkappa,\alpha), $$

where the positive function $f$ does not depend on the choice of the functionals, and

$$ f(\varkappa,\alpha)\to \tfrac12 \qquad \text{as } \varkappa\to 0 . $$

Theorem 4. Let $\alpha>0$ and let $\underline{\varkappa}(\xi)$ be a nonnegative functional continuous in the metric $C_{[0,1]}$ on the space $R_\xi$, satisfying for all $\mu$ and $\varkappa$ the condition

$$ \mathbf P\{\varkappa\geqslant \underline{\varkappa}(\xi)\}\geqslant \alpha . $$

Then

$$ \mathbf P\{\underline{\varkappa}(\xi)=0\}\geqslant g(\varkappa,\alpha), $$

where the positive function $g$ does not depend on the choice of the functional, and

$$ g(\varkappa,\alpha)\to 1 \qquad \text{as } \varkappa\to 0 . $$

* Continuity for functionals taking infinite values is understood as continuity induced by the topology on the extended line supplemented by the points $\pm\infty$. $\underline{\mu}(\infty)>-\infty$, $\overline{\mu}(-\infty)<\infty$.

Naturally, as \(\varkappa \to \infty\), the functions \(f\) and \(g\), for any fixed \(\alpha<1\), tend to zero. This follows from the possibility of obtaining asymptotically correct confidence bounds for the parameters in accordance with Theorem 2.

We expect to return later to methods of constructing confidence bounds for \(\mu\) and \(\varkappa\) \((\sigma^2)\) that are valid without any restrictions on the value of the parameter \(\varkappa\). The method for obtaining them is analogous to the method set forth in § 2 for constructing confidence bounds for \(\sigma^2\) \((\varkappa)\) with known \(\mu\). In the case of two unknown parameters, however, this construction naturally leads to the peculiarities noted in Theorems 3 and 4.

§ 4. For applications it is convenient to return to the case of arbitrary \(T\) and \(a\). We give an example of such a formulation, referring to the question of estimating the parameters \(\mu\) and \(\lambda\). This is an obvious consequence of Theorem 2 (the passage from the parameter \(\sigma^2\) to \(\lambda\) presents no difficulty).

Theorem 5. If \(\varkappa=\lambda T \to \infty\), the estimates

\[ \mu \sim m_2, \qquad \lambda \sim b=\frac{a}{2s_2^2} \]

are jointly asymptotically efficient, and the distribution of the ratios

\[ \frac{m_2-\mu}{\sqrt{2\sigma^2/\lambda T}}, \qquad \frac{b-\lambda}{\sqrt{2\lambda/T}} \]

tends to the \(\left(0,0;\begin{pmatrix}1&0\\0&1\end{pmatrix}\right)\)-normal distribution.
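In practice Theorem 5 suggests point estimates together with approximate standard errors when the product of the decay rate and the observation length is large. The sketch below is an illustration under stated assumptions: the diffusion coefficient `a` is taken as known, the unknown variance and decay rate in the normalizing factors are replaced by their estimates (a plug-in step that is not part of the theorem), and the function name is hypothetical.

```python
import math


def large_kappa_estimates(m2, s2_sq, a, T):
    """Estimates of mu and lambda suggested by Theorem 5, with plug-in standard errors.

    m2, s2_sq : time average and time-averaged squared deviation of the path
    a         : diffusion coefficient, assumed known
    T         : length of the observation interval
    """
    b = a / (2.0 * s2_sq)                        # estimate of lambda
    se_mu = math.sqrt(2.0 * s2_sq / (b * T))     # plug-in for sqrt(2*sigma^2/(lambda*T))
    se_lambda = math.sqrt(2.0 * b / T)           # plug-in for sqrt(2*lambda/T)
    return {"mu": m2, "lambda": b, "se_mu": se_mu, "se_lambda": se_lambda, "kappa": b * T}
```

The returned value of `kappa` gives a rough indication of whether the large-parameter regime of the theorem is plausible for the data at hand; the approximation should not be relied on when it is small.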

§ 5. In conclusion we note that our problem is closely connected with the problem of estimating the parameters of a Gaussian Markov process with discrete time. Let, for such a process,

\[ \mathbf M \xi_n=\mu, \qquad \mathbf M(\xi_n-\mu)^2=\sigma^2, \qquad \mathbf M(\xi_{n+1}-\mu)(\xi_n-\mu)=\sigma^2\rho . \]

There are now three parameters, \(\mu\), \(\sigma^2\), and \(\rho\), and five sufficient statistics (see \((^3)\)):

\[ m_1=\frac12(\xi_1+\xi_N), \qquad m_2=\frac1N\sum_1^N \xi_n, \]

\[ s_1^2=\frac14(\xi_N-\xi_1)^2, \qquad s_2^2=\frac1N\sum_{n=1}^N(\xi_n-m_2)^2, \]

\[ R=\sum_1^{N-1}(\xi_{n+1}-m_2)(\xi_n-m_2). \]
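For a finite sequence of observations the five statistics can be computed directly from their definitions. The following is a minimal sketch with a hypothetical function name; it assumes a plain list of at least two observations.

```python
def discrete_statistics(xi):
    """Five statistics of the discrete-time problem for observations xi_1, ..., xi_N."""
    n = len(xi)
    m1 = 0.5 * (xi[0] + xi[-1])
    m2 = sum(xi) / n
    s1_sq = 0.25 * (xi[-1] - xi[0]) ** 2
    s2_sq = sum((x - m2) ** 2 for x in xi) / n
    # sum of products of neighbouring centred observations
    r = sum((xi[k + 1] - m2) * (xi[k] - m2) for k in range(n - 1))
    return {"m1": m1, "m2": m2, "s1_sq": s1_sq, "s2_sq": s2_sq, "R": r}
```

For the sequence 1, 2, 3, 4 the function returns 2.5 for both means, 2.25 and 1.25 for the two quadratic statistics, and 1.25 for the last statistic.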

The limiting transition to our problem is carried out when \(N\to\infty\), \(\rho\to1\). In this case \(\varkappa=N(1-\rho^2)\) passes into the parameter \(\varkappa\) of our problem. Along this path one may hope to obtain a completion of the investigations of L. Le Cam \((^4)\).

I express my gratitude to my adviser, Academician A. N. Kolmogorov, and to Ya. G. Sinai for their attention and help.

Moscow State University
named after M. V. Lomonosov

Received 20 II 1962

REFERENCES

  1. J. L. Doob, Stochastic Processes, Moscow, 1956.
  2. Ch. T. Striebel, Ann. Math. Statistics, 30, No. 2, 559 (1959).
  3. Yu. V. Linnik, Izv. Acad. Sci. USSR, Ser. Math., 14, No. 6, 501 (1950).
  4. L. Le Cam, Dokl. Akad. Nauk SSSR, 98, No. 5, 723 (1954).
