UDC 519.21
MATHEMATICS
D. S. Apokorin
ASYMPTOTIC BEHAVIOR OF THE PROBABILITIES OF ERRORS OF THE FIRST AND SECOND KIND IN TESTING HYPOTHESES ON THE SPECTRUM OF A GAUSSIAN STATIONARY PROCESS
(Presented by Academician A. N. Kolmogorov on 20 VI 1965)
Suppose that on the interval \([0,T]\) we are given a real-valued function that is a realization of a Gaussian stationary process \(x(t)\) with zero mean. Suppose further that there are two competing hypotheses concerning this process: hypothesis \(H_1\), according to which the process \(x(t)=x_1(t)\) has correlation function \(b_1(t)\), and hypothesis \(H_2\), according to which \(x(t)=x_2(t)\) has correlation function \(b_2(t)\). It is known \((^{1,2})\) that the Gaussian measures \(P_1\) and \(P_2\) corresponding to the hypotheses \(H_1\) and \(H_2\) are either equivalent or orthogonal. In the first case, the probability of error in distinguishing \(H_1\) from \(H_2\) always satisfies \(0<p<1\), while in the second the hypotheses can be distinguished without error. In other words, consider the random vector \(x^{(n)}=\{x(kT/n),\ k=0,1,\ldots,n\}\) and compare the hypotheses \(H_1^{(n)}\) and \(H_2^{(n)}\), according to the first of which the correlation matrix of this vector is \(\|b_1[T(i-j)/n]\|\), and according to the second \(\|b_2[T(i-j)/n]\|\). If the measures \(P_1\) and \(P_2\) are equivalent, the corresponding probabilities of errors of the first and second kind, \(\alpha_n\) and \(\beta_n\), necessarily tend to limits different from zero and one as \(n\to\infty\); if the measures are orthogonal, they may tend to zero simultaneously.
In \((^{1,2})\) necessary and sufficient conditions for the equivalence and orthogonality of the measures \(P_1\) and \(P_2\) are indicated; however, these conditions are ineffective, i.e. they cannot be checked directly from the known functions \(b_1(t)\) and \(b_2(t)\). At present there are also known numerous quite effective sufficient conditions for equivalence and orthogonality, formulated in terms of the correlation functions \(b_1(t)\) and \(b_2(t)\), or of their Fourier transforms (assumed to exist)—the spectral densities \(f_1(\lambda)\) and \(f_2(\lambda)\) (see, for example, \((^{2-5})\)). In particular, if both spectral densities \(f_1(\lambda)\) and \(f_2(\lambda)\) have a power asymptotic behavior at infinity (more precisely, this condition will be formulated below), then, according to the results of E. G. Gladyshev \((^5)\) and V. G. Alekseev \((^4)\), the measures \(P_1\) and \(P_2\) will be orthogonal if
\[ \lim_{|\lambda|\to\infty} f_1(\lambda)/f_2(\lambda)=\infty, \]
or
\[ \lim_{\lambda\to\infty} f_1(\lambda)/f_2(\lambda)=a\ne 1, \]
or, finally,
\[ \lim_{\lambda\to\infty} f_1(\lambda)/f_2(\lambda)=1, \quad\text{but}\quad \lim_{\lambda\to\infty}\left|1-f_1(\lambda)/f_2(\lambda)\right|\lambda^\delta>0, \quad \text{where } \delta\le \tfrac12. \]
We note now that in concrete applications, even when the measures \(P_1\) and \(P_2\) are orthogonal, error-free distinction between the hypotheses \(H_1\) and \(H_2\) still cannot be realized. Indeed, such distinction is usually based on testing certain local properties of realizations of the processes \(x_1(t)\) and \(x_2(t)\), which requires absolutely precise information about all details of the behavior of the given function on arbitrarily small time intervals. In reality, however, the values of the function are always observed with some error, and therefore the notion that exact values \(x(t)\) can be assigned for all \(t\) is only a mathematical idealization (cf. in this connection \((^{5,7})\)). Therefore, in the case of orthogonal measures \(P_1\) and \(P_2\), it seems more justified to restrict oneself to comparing the hypotheses \(H_1^{(n)}\) and \(H_2^{(n)}\) for finite (and even not very large) values of \(n\), for which neglecting the inaccuracy in the prescribed values \(x(t)\) no longer leads to a substantial distortion of the results. It is then of considerable interest to calculate, for such \(n\), the probabilities of errors of the first and second kind, \(\alpha_n\) and \(\beta_n\), corresponding to optimal methods of comparing \(H_1^{(n)}\) and \(H_2^{(n)}\) (orthogonality of the measures \(P_1\) and \(P_2\), of course, means that \(\alpha_n \to 0\), \(\beta_n \to 0\) as \(n \to \infty\)). Exact analytic expressions for \(\alpha_n\) and \(\beta_n\) for all values of \(n\) can hardly be expected; however, even the determination of the asymptotic behavior of \(\alpha_n\) and \(\beta_n\) as \(n \to \infty\) may be of practical interest, since asymptotic formulas of this kind usually turn out to be quite accurate even when applied to comparatively small values of \(n\). The present paper is devoted precisely to this problem.
It is known that, in the case of orthogonality of the measures \(P_1\) and \(P_2\), when comparing the hypotheses \(H_1^{(n)}\) and \(H_2^{(n)}\), corresponding to the choice of the critical set \(X_n \subseteq R_n\), the probabilities of errors of the first and second kind \(\alpha_n = P_2\{x^{(n)} \in X_n\}\) and \(\beta_n = P_1\{x^{(n)} \notin X_n\}\) will tend to zero for some sequence of sets \(X_n\), constructed by the Neyman–Pearson method, i.e. specified by the conditions
\[ X_n=\left\{x^{(n)}:\frac{\rho_2(x_0,\ldots,x_n)}{\rho_1(x_0,\ldots,x_n)}<\widetilde B_n\right\} =\left\{x^{(n)}:(B_2^{-1}x,x)-(B_1^{-1}x,x)>B_n\right\}, \tag{1} \]
where \(\rho_i(x_0,\ldots,x_n)=(2\pi)^{-(n+1)/2}(\det B_i)^{-1/2}\exp[-\tfrac12(B_i^{-1}x,x)]\) is the probability density of the vector \(x^{(n)}\) under the condition that the hypothesis \(H_i^{(n)}\) is true, \(B_i\) is the correlation matrix of \(x^{(n)}\) under \(H_i^{(n)}\), and \(B_n\), \(n=1,2,\ldots\), is some numerical sequence (see, for example, \((^8)\)). In what follows we shall confine ourselves to such critical sets.
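As a minimal sketch of how the statistic \((B_2^{-1}x,x)-(B_1^{-1}x,x)\) defining the critical set can be evaluated, the following Python fragment builds the two correlation matrices and computes the quadratic forms by a Cholesky factorization. The correlation functions \(e^{-|t|}\) and \((1+|t|)e^{-|t|}\) (whose spectral densities decay like \(\lambda^{-2}\) and \(\lambda^{-4}\)), the helper names, and the sample vector are illustrative assumptions and are not taken from the paper.

```python
import math

def corr_matrix(b, T, n):
    """Correlation matrix ||b(T(i-j)/n)|| of the samples x(kT/n), k = 0..n."""
    return [[b(T * (i - j) / n) for j in range(n + 1)] for i in range(n + 1)]

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    m = len(a)
    low = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def quad_form_inv(a, x):
    """Return (a^{-1} x, x) = |L^{-1} x|^2 via forward substitution."""
    low = cholesky(a)
    z = []
    for i in range(len(x)):
        z.append((x[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    return sum(v * v for v in z)

def np_statistic(b1, b2, T, n, x):
    """Statistic (B_2^{-1} x, x) - (B_1^{-1} x, x) compared with B_n in (1)."""
    return (quad_form_inv(corr_matrix(b2, T, n), x)
            - quad_form_inv(corr_matrix(b1, T, n), x))

# Assumed example pair: b1 is rougher (density ~ lambda^-2), b2 smoother (~ lambda^-4).
def b1(t):
    return math.exp(-abs(t))

def b2(t):
    return (1.0 + abs(t)) * math.exp(-abs(t))

T, n = 1.0, 4
rough = [(-1.0) ** k for k in range(n + 1)]   # rapidly oscillating sample vector
stat_rough = np_statistic(b1, b2, T, n, rough)
print(stat_rough)
```

For the oscillating vector the statistic is large and positive, so the vector falls into \(X_n\) for moderate thresholds, which is the decision in favour of the rougher process \(x_1(t)\). Only small \(n\) should be used with this sketch, since the correlation matrices of smooth processes become ill-conditioned quickly.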
Suppose that the processes \(x_1(t)\) and \(x_2(t)\) have bounded spectral densities \(f_1(\lambda)\) and \(f_2(\lambda)\), satisfying the following conditions: 1) for some \(\alpha>0\) and \(\delta \ge 0\) there exist
\[ \lim_{|\lambda|\to\infty} f_1(\lambda)\lambda^{1+\alpha}=c_1>0 \quad\text{and}\quad \lim_{|\lambda|\to\infty} f_2(\lambda)\lambda^{1+\alpha+\delta}=c_2>0; \]
2) if \(\delta=0\), then, in addition, either
\[ \lim_{|\lambda|\to\infty} f_1(\lambda)/f_2(\lambda)=a>1, \]
or
\[ f_1(\lambda)=f_2(\lambda)+\xi(\lambda)f_2(\lambda)\quad\text{for }|\lambda|>\lambda_0>0 \]
for some \(\lambda_0\), and there exists a \(\gamma\) belonging to the interval \(0<\gamma\le \tfrac12\) such that
\[ \lim_{|\lambda|\to\infty}\xi(\lambda)\lambda^\gamma=c_3>0. \]
Under these conditions the following theorems are valid.
Theorem 1. If \(\delta>0\) and \(B_n=n^{1+\varkappa_n}\), where \(0<\varkappa_n<\delta\),
\[ \lim_{n\to\infty} n^{\delta-\varkappa_n}=\infty, \]
then, asymptotically as \(n\to\infty\),
\[ -\ln\alpha_n=\tfrac12 n^{1+\varkappa_n}(1+o(1)),\qquad -\ln\beta_n=\tfrac12(\delta-\varkappa_n+o(1))\,n\ln n. \tag{2} \]
If, however, \(B_n=cn^{1+\delta}\) with
\[ c>\frac12\,\frac{c_1}{c_2}(1-\theta)\theta \]
for any \(\theta\) from the interval \(0<\theta<1\), then
\[ -\ln\alpha_n=\tfrac12(c+o(1))n^{1+\delta},\qquad -\ln\beta_n=\tfrac12(A+o(1))n, \tag{3} \]
where the constant \(A\) satisfies the inequalities
\[ B_1-1-\ln B_1<A<(B_2-1-\ln B_2)\theta,\qquad B_1=cc_2/c_1, \]
\[ B_2=B_1(1-\theta)^{-\delta}\theta^{-1}. \]
Theorem 2. If \(\delta=0\), but
\[ \lim_{|\lambda|\to\infty} f_1(\lambda)/f_2(\lambda)=a>1, \]
then, for
\[ B_n=[p(a-1)+q(1-1/a)]n, \]
where \(p>0\) and \(q>0\) are arbitrary numbers such that \(p+q=1\), the asymptotic equalities
\[ -\ln\alpha_n=\tfrac12(h_1-1-\ln h_1+o(1))n, \]
\[ -\ln\beta_n=\tfrac12(h_2-1-\ln h_2+o(1))n, \tag{4} \]
hold, where \(h_1=pa+q\), \(h_2=h_1/a\).
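The rate function \(h-1-\ln h\) appearing in Theorem 2 is the large-deviation rate of a chi-square variable, and this can be checked numerically in an idealized setting. The sketch below assumes, purely for illustration, that all eigenvalues \(\lambda_k^{(n)}\) equal their limiting value \(1/a\) exactly, that there are \(n\) of them, and that \(n\) is even; the error events then reduce to \(\{\chi^2_n>h_1n\}\) and \(\{\chi^2_n<h_2n\}\), whose probabilities are Poisson sums. All function names are hypothetical.

```python
import math

def f(x):
    """Rate function x - 1 - ln x from Theorem 2."""
    return x - 1.0 - math.log(x)

def log_poisson_sum(lam, j_lo, j_hi):
    """log of sum_{j=j_lo}^{j_hi} exp(-lam) lam^j / j!, computed by log-sum-exp."""
    logs = [j * math.log(lam) - math.lgamma(j + 1) - lam
            for j in range(j_lo, j_hi + 1)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def log_chi2_upper(n, x):
    """log P{chi^2_n > x} for even n (finite Poisson sum)."""
    return log_poisson_sum(x / 2.0, 0, n // 2 - 1)

def log_chi2_lower(n, x, extra=4000):
    """log P{chi^2_n < x} for even n; the series is truncated after `extra` terms."""
    m = n // 2
    return log_poisson_sum(x / 2.0, m, m + extra)

def theorem2_rates(a, p):
    """Return h1, h2 and the predicted per-sample exponents of alpha_n and beta_n."""
    q = 1.0 - p
    h1 = p * a + q
    h2 = h1 / a
    return h1, h2, 0.5 * f(h1), 0.5 * f(h2)

a, p, n = 2.0, 0.5, 2000                      # assumed illustrative values
h1, h2, rate_alpha, rate_beta = theorem2_rates(a, p)
emp_alpha = -log_chi2_upper(n, h1 * n) / n    # idealized -ln(alpha_n) / n
emp_beta = -log_chi2_lower(n, h2 * n) / n     # idealized -ln(beta_n) / n
print(rate_alpha, emp_alpha)
print(rate_beta, emp_beta)
```

The empirical exponents differ from the predicted ones only by terms of order \((\ln n)/n\), which is the \(o(1)\) correction in the theorem.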
Theorem 3. If \(\delta=0\) and \(f_1(\lambda)=f_2(\lambda)+\xi(\lambda)f_2(\lambda)\) for \(|\lambda|>\lambda_0\), then in the case \(\gamma<1/2\), for
\[ B_n=\frac{c}{1-\gamma}\,n^{1-\gamma}+\frac{\tilde c}{1-2\gamma}\,n^{1-2\gamma}, \]
where \(c\) is a constant, whose precise definition will be given below, and \(\tilde c\) satisfies the inequalities \(0<\tilde c<c^2\), the following formulas hold for \(\alpha_n\) and \(\beta_n\):
\[ -\ln \alpha_n=n^{1-2\gamma}\left(\frac{\tilde c}{1-2\gamma}+o(1)\right),\qquad -\ln \beta_n=n^{1-2\gamma}\left(\frac{c^2-\tilde c}{1-2\gamma}+o(1)\right). \tag{5a} \]
If \(\gamma=1/2\), then for \(B_n=2cn^{1/2}-\tilde c\ln n\), where \(c\) and \(\tilde c\) have the same meaning as above, \(\alpha_n\) and \(\beta_n\) satisfy the relations
\[ -\ln \alpha_n=(\tilde c+o(1))\ln n,\qquad -\ln \beta_n=(c^2-\tilde c+o(1))\ln n. \tag{5b} \]
Under the conditions of Theorems 1–3, the order of decrease of the probabilities \(\alpha_n\) and \(\beta_n\) as \(n\to\infty\) cannot be improved by changing the choice of the sequence \(B_n\). The assertions of these theorems can obviously also be reformulated in terms of the quantity \(n(\alpha,\beta)\)—the number of division points of the interval \([0,T]\) into equal parts for which the error probabilities in the Neyman–Pearson criterion for distinguishing the hypotheses \(H_1^{(n)}\) and \(H_2^{(n)}\) are equal to \(\alpha\) and \(\beta\). Let \(\alpha\to0\) and \(\beta\to0\) in such a way that
\[ 0<\sigma_1<\ln\alpha/\ln\beta=\sigma<\sigma_2<\infty . \]
Then, under the conditions of Theorem 2,
\[ n(\alpha,\beta)=(c_0+o(1))\ln(1/\alpha), \tag{6} \]
where \(c_0=1/f(p_0a+1-p_0)\), \(f(x)=x-1-\ln x\), \(0<p_0<1\), and the constant \(p_0\) is the root of the equation
\[ \sigma=\frac{f(pa+1-p)}{f(p+(1-p)/a)}. \]
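The root \(p_0\) of this equation has no closed form, but it is easy to find numerically. The sketch below uses plain bisection, relying on the observation (an addition here, not a statement of the paper) that for \(a>1\) the right-hand side increases monotonically from \(0\) to \(\infty\) as \(p\) runs over \((0,1)\). The function names are hypothetical, and \(c_0\) is computed exactly as it is defined in the text.

```python
import math

def f(x):
    """f(x) = x - 1 - ln x."""
    return x - 1.0 - math.log(x)

def sigma_of_p(p, a):
    """Right-hand side of the equation for p0: f(pa + 1 - p) / f(p + (1 - p)/a)."""
    return f(p * a + 1.0 - p) / f(p + (1.0 - p) / a)

def solve_p0(sigma, a, tol=1e-12):
    """Bisection for the root p0 in (0, 1); assumes a > 1 and moderate sigma."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sigma_of_p(mid, a) < sigma:
            lo = mid          # the ratio is increasing in p, so the root lies above
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c0_constant(sigma, a):
    """c0 = 1 / f(p0 a + 1 - p0), following the definition given after (6)."""
    p0 = solve_p0(sigma, a)
    return 1.0 / f(p0 * a + 1.0 - p0)

a = 3.0                               # assumed illustrative value
sigma = sigma_of_p(0.3, a)            # sigma generated from a known p
p0 = solve_p0(sigma, a)
print(p0, c0_constant(sigma, a))
```

Very large or very small values of \(\sigma\) push \(p_0\) towards the endpoints, where \(f\) is evaluated near its zero at \(x=1\) and loses accuracy; the sketch is meant for \(\sigma\) bounded away from \(0\) and \(\infty\), as in the condition \(\sigma_1<\sigma<\sigma_2\).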
Under the conditions of Theorem 3, for \(\gamma<1/2\) the following formula is obtained for \(n(\alpha,\beta)\):
\[ n(\alpha,\beta)= \left[\frac{\ln(1/\alpha)(\sigma+1)(1-2\gamma)}{c^2\sigma}\right]^{1/(1-2\gamma)} (1+o(1)), \tag{7} \]
and for \(\gamma=1/2\), the formula is
\[ \ln n(\alpha,\beta)= \frac{\ln(1/\alpha)(\sigma+1)}{c^2\sigma}(1+o(1)). \tag{8} \]
Finally, under the conditions of Theorem 1, for \(n(\alpha,\beta)\) we obtain:
\[ n(\alpha,\beta)= \frac{2}{\sigma\delta}\, \frac{\ln(1/\alpha)}{\ln[\ln(1/\alpha)]}(1+o(1)). \tag{9} \]
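For orientation, the leading terms of (7), (8) and (9) can be evaluated directly. The following sketch drops the \(1+o(1)\) factors, so it gives only the first-order size of \(n(\alpha,\beta)\); the function names and the sample arguments are assumptions made for illustration.

```python
import math

def n_theorem3(alpha, sigma, c, gamma):
    """Leading term of (7); requires 0 < gamma < 1/2."""
    if not 0.0 < gamma < 0.5:
        raise ValueError("formula (7) requires 0 < gamma < 1/2")
    big_l = math.log(1.0 / alpha)
    base = big_l * (sigma + 1.0) * (1.0 - 2.0 * gamma) / (c * c * sigma)
    return base ** (1.0 / (1.0 - 2.0 * gamma))

def log_n_theorem3_half(alpha, sigma, c):
    """Leading term of (8): the value of ln n(alpha, beta) when gamma = 1/2."""
    return math.log(1.0 / alpha) * (sigma + 1.0) / (c * c * sigma)

def n_theorem1(alpha, sigma, delta):
    """Leading term of (9); requires ln(1/alpha) > 1 so that the inner log is positive."""
    big_l = math.log(1.0 / alpha)
    return 2.0 / (sigma * delta) * big_l / math.log(big_l)

alpha = math.exp(-10.0)                     # assumed target error probability
print(n_theorem3(alpha, 1.0, 1.0, 0.25))    # power-law growth in ln(1/alpha)
print(log_n_theorem3_half(alpha, 1.0, 1.0)) # here only ln n is determined
print(n_theorem1(alpha, 1.0, 2.0))          # slower than linear in ln(1/alpha)
```

In the case \(\gamma=1/2\) the function returns \(\ln n\) rather than \(n\), because the \(o(1)\) correction in (8) sits in the exponent and \(n\) itself is determined only on a logarithmic scale.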
Let us also note that the order of decrease of the quantities \(\alpha_n,\beta_n\) as \(n\to\infty\), and consequently the order of growth of \(n(\alpha,\beta)\) as \(\alpha\to0\) and \(\beta\to0\), are preserved if conditions 1) and 2) are replaced by the following more general conditions: 1a) \(f_1(\lambda)\) has order \(\lambda^{-(1+\alpha)}\), \(\alpha>0\), while \(f_2(\lambda)\) has order \(\lambda^{-(1+\alpha+\delta)}\), \(\delta\ge0\) (where we regard a function \(g(\lambda)\) as having order \(\lambda^{-\beta}\) if
\[ \lim_{|\lambda|\to\infty} g(\lambda)\lambda^{\beta+\varepsilon}=\infty,\qquad \lim_{|\lambda|\to\infty} g(\lambda)\lambda^{\beta-\varepsilon}=0 \]
for arbitrary \(\varepsilon>0\)); 2a) if \(\delta=0\), then, in addition, either
\[ \lim_{|\lambda|\to\infty} f_1(\lambda)/f_2(\lambda)=a>1, \]
or \(f_1(\lambda)=f_2(\lambda)+\xi(\lambda)f_2(\lambda)\) for \(|\lambda|>\lambda_0\), where \(\xi(\lambda)\) has order \(\lambda^{-\gamma}\), \(0<\gamma<1/2\), or, finally, there exists
\[ \lim_{|\lambda|\to\infty}\xi(\lambda)\lambda^{1/2}=c_3>0. \]
Under these conditions, however, it is not possible to find exact values for the constants entering the asymptotic formulas (2)–(9).
The proof of Theorems 1–3 is based on the equalities
\[ \alpha_n=P_2\{(B_{f_2}^{-1}x,x)-(B_{f_1}^{-1}x,x)>B_n\} = P\{(y,y)-(B_{f_2}^{1/2}B_{f_1}^{-1}B_{f_2}^{1/2}y,y)>B_n\} \]
\[ = P\left\{\sum_{k=0}^{n}(1-\lambda_k^{(n)})y_k^2>B_n\right\}, \]
where \(B_{f_k}\) is the covariance matrix of the random variables \(x(Ti/n)\), \(i=0,1,\ldots,n\), when \(x(t)\) is a stationary Gaussian process with spectral density \(f_k(\lambda)\); \(y=\{y_i\}\) is a Gaussian \((n+1)\)-dimensional vector with parameters \((0,E)\); and the constants \(\lambda_k^{(n)}\) are the eigenvalues of the matrix \(B_{f_2}B_{f_1}^{-1}\). Further, with the aid of Courant’s theorem (see \((^9)\), p. 256), one finds the asymptotic behavior of the numbers \(\mu_k^{(n)} = 1 - \lambda_k^{(n)}\) as \(k \to \infty\). By Courant’s theorem, for \(\lambda_k^{(n)}\) one can write the inequalities
\[ \lambda_k^{(n)} \geq \frac12 \min(\lambda_{n-k}^{-}, \lambda_k^{+});\qquad \lambda_{k+l-1}^{(n)} \leq 2 \max(\lambda_k^{-}, \lambda_l^{+}), \]
where \(\lambda_k^{-}\) and \(\lambda_k^{+}\) are the eigenvalues of the matrices \(B_{f_2}^{-}(B_{f_1}^{-})^{-1}\), \(B_{f_2}^{+}(B_{f_1}^{+})^{-1}\), respectively, and
\[ B_{f_k}^{-}=\sum_{l=-n}^{n} T^l b_l^{(k)},\qquad B_{f_k}^{+}=\sum_{l=-n}^{n} K^l b_l^{(k)}, \tag{10} \]
where \(b_l^{(k)} = b_k(Tl/n)\); \(b_k(t)\) is the correlation function of the process \(x_k(t)\); \(K=\|k_{ij}\|\), where \(k_{i,i+1}=1\), \(k_{n,1}=1\), \(k_{i,j}=0\) for the remaining \(i,j\). All elements of the matrix \(T\) coincide with the elements of the matrix \(K\), except \(t_{n,1}=-k_{n,1}\). The eigenvalues of \(K\) and \(T\) are easily written in explicit form (for the matrix \(K\) they coincide with the \(n\)-th roots of unity), after which the eigenvalues of the matrices \(B_{f_2}^{-}(B_{f_1}^{-})^{-1}\) and \(B_{f_2}^{+}(B_{f_1}^{+})^{-1}\) are found by virtue of the fact that they are rational functions of the matrices \(T\) and \(K\), respectively. Thus it is finally obtained that if \(f_1(\lambda)=\xi(\lambda) f_2(\lambda)\) for \(|\lambda|>\lambda_0\), where \(\xi(\lambda)=c_3 \lambda^{-\gamma}+o(\lambda^{-\gamma})\), then the eigenvalues \(\lambda_k^{(n)}\) satisfy, as \(k\to\infty\), the asymptotic relation
\[ \lambda_k^{(n)}=(c+o(1))k^{-\gamma}, \tag{11} \]
where for \(c\) one can write the inequalities \(\dfrac{1}{2^{\gamma+1}}c_3<c<2^{\gamma+1}c_3\). This estimate of the asymptotic behavior of the eigenvalues of a pair of quadratic forms corresponding to integral operators with kernels having an asymptotically power-type Fourier transform may be of interest also independently of the problem we solve (cf., for example, \({}^{12}\)).
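The explicit eigenvalues of the matrices \(K\) and \(T\) used in this argument can be verified directly. In the sketch below the trial eigenvector for a candidate eigenvalue \(w\) is \(v_i=w^i\); for \(K\) the candidates are the \(n\)-th roots of \(1\), and for \(T\) the \(n\)-th roots of \(-1\) (the latter is a standard fact about such matrices, supplied here rather than quoted from the paper). The function names are hypothetical.

```python
import cmath

def shift_matrix(n, corner):
    """n x n matrix with ones on the superdiagonal and `corner` at position (n, 1)."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        m[i][i + 1] = 1.0
    m[n - 1][0] = corner
    return m

def residual(m, w):
    """max_i |(m v)_i - w v_i| for the trial eigenvector v_i = w^i."""
    n = len(m)
    v = [w ** i for i in range(n)]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(mv[i] - w * v[i]) for i in range(n))

n = 8
K = shift_matrix(n, 1.0)        # k_{n,1} = 1
T_mat = shift_matrix(n, -1.0)   # t_{n,1} = -1 (named T_mat to avoid the interval length T)

roots_K = [cmath.exp(2j * cmath.pi * r / n) for r in range(n)]        # w^n = 1
roots_T = [cmath.exp(1j * cmath.pi * (2 * r + 1) / n) for r in range(n)]  # w^n = -1

worst_K = max(residual(K, w) for w in roots_K)
worst_T = max(residual(T_mat, w) for w in roots_T)
print(worst_K, worst_T)
```

Both residuals are at the level of rounding error. Since every power of \(K\) (or of \(T\)) has the same eigenvectors, a polynomial such as \(\sum_l K^l b_l\) has the eigenvalues \(\sum_l w^l b_l\), which is how the explicit formulas for the matrices in (10) arise.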
We choose the sequence \(B_n\) from the conditions
\[ M_2\bigl[(B_{f_2}^{-1}x,x)-(B_{f_1}^{-1}x,x)\bigr]<B_n< M_1\bigl[(B_{f_2}^{-1}x,x)-(B_{f_1}^{-1}x,x)\bigr] \]
or, in other words,
\[ \operatorname{Sp}(E-B_{f_2}B_{f_1}^{-1})<B_n< \operatorname{Sp}(B_{f_1}B_{f_2}^{-1}-E), \]
where
\[ M_i(Qx,x)=\int_{R_n}(Qx,x)\rho_i(x_0,\ldots,x_n)\,dx_0\cdots dx_n . \]
Once the asymptotic behavior of the eigenvalues \(\mu_k^{(n)}\) has been found, the problem of finding the asymptotic behavior of the probabilities of errors of the first and second kind reduces to calculating the asymptotics of the probabilities of large deviations for the sum of the random variables \(z_k=\mu_k y_k^2\), \(k=0,1,\ldots\). In the proof of Theorems 1 and 2 these asymptotics are found by Laplace’s method (see \({}^{10}\)).
In the proof of Theorem 3, the asymptotic behavior of \(\alpha_n\) and \(\beta_n\) is found by Cramér’s method \({}^{11}\). Here it is only necessary to introduce into Cramér’s argument small changes following from the fact that the random variables \(z_k\) are not identically distributed and the sum of their variances has order \(O(n^{1-2\gamma})\), if \(\gamma<\frac12\), or \(O(\ln n)\), if \(\gamma=\frac12\).
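The stated orders of the sum of variances can be illustrated numerically. The sketch below assumes an exact power law \(\mu_k=ck^{-\gamma}\) for \(k=1,\ldots,n\) (the paper asserts only the asymptotic relation) and uses the fact that \(y_k^2\) has variance \(2\) for a standard Gaussian \(y_k\), so that the variance of \(z_k\) is \(2\mu_k^2\). The function names are hypothetical.

```python
import math

def variance_sum(n, gamma, c=1.0):
    """Sum over k = 1..n of Var(z_k) = 2 mu_k^2 for the model mu_k = c k^(-gamma)."""
    return sum(2.0 * (c * k ** (-gamma)) ** 2 for k in range(1, n + 1))

def predicted_order(n, gamma, c=1.0):
    """Leading term: 2 c^2 n^(1-2 gamma) / (1-2 gamma), or 2 c^2 ln n when gamma = 1/2."""
    if gamma < 0.5:
        return 2.0 * c * c * n ** (1.0 - 2.0 * gamma) / (1.0 - 2.0 * gamma)
    if gamma == 0.5:
        return 2.0 * c * c * math.log(n)
    raise ValueError("expected 0 < gamma <= 1/2")

n = 100000
ratio_quarter = variance_sum(n, 0.25) / predicted_order(n, 0.25)
ratio_half = variance_sum(n, 0.5) / predicted_order(n, 0.5)
print(ratio_quarter, ratio_half)
```

Both ratios tend to \(1\) as \(n\) grows, but the convergence is only logarithmic in the boundary case \(\gamma=\frac12\).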
The author expresses gratitude to A. M. Yaglom for posing the problem and for constant assistance in the work.
Received 20 VI 1965
REFERENCES
\({}^{1}\) J. Hajek, Czechoslov. Math. J., 8, 610 (1958).
\({}^{2}\) J. Feldman, Pacific J. Math., 8, 699 (1958).
\({}^{3}\) Yu. A. Rozanov, Probability Theory and Its Applications, 9, 448 (1964).
\({}^{4}\) V. G. Alekseev, Izv. AN SSSR, ser. matem., 28, 1083 (1964).
\({}^{5}\) E. G. Gladyshev, Probability Theory and Its Applications, 6, 57 (1961).
\({}^{6}\) D. Slepian, IRE Trans. Inform. Theory, 4, 65 (1958).
\({}^{7}\) V. F. Pisarenko, Radio Engineering and Electronics, 4, 514 (1961).
\({}^{8}\) U. Grenander, Random Processes and Statistical Inference, IL, 1961.
\({}^{9}\) F. Riesz, B. Sz.-Nagy, Lectures on Functional Analysis, Moscow, 1954.
\({}^{10}\) M. A. Evgrafov, Asymptotic Estimates and Entire Functions, Moscow, 1962.
\({}^{11}\) H. Cramér, UMN, vol. 10, 166 (1944).
\({}^{12}\) M. Rosenblatt, J. Math. and Mech., 12, 619 (1963).
\({}^{13}\) D. S. Apokorin, Information Transmission Theory, 16, 35 (1964).