Mathematics
Corresponding Member of the Academy of Sciences of the USSR Yu. V. Linnik
Submitted 1964-01-01 | RussiaRxiv: ru-196401.60243 | Translated from Russian

STATISTICAL PROBLEMS WITH NUISANCE PARAMETERS

Many parametric problems of modern mathematical statistics pertain to exponential families of distributions. We shall consider families of distributions in an \(n\)-dimensional Euclidean space \(E_n\), specified by a probability element in the space of sufficient statistics \(T_1,\ldots,T_s\):

\[ dP_\theta=C(\theta_1,\ldots,\theta_s)\exp\{-(\theta_1T_1+\cdots+\theta_sT_s)\}\,p(T_1,\ldots,T_s)\,dT_1\cdots dT_s. \tag{1} \]

Here \(\theta_1,\ldots,\theta_s\) are parameters, and \(p(T_1,\ldots,T_s)\) is a density with respect to Lebesgue measure, which we shall assume to be Riemann integrable. The set of parameter values \(\Omega=\{(\theta_1,\ldots,\theta_s)\}\) lies in the Euclidean space \(E_s\). We shall take it to be the natural parameter set, i.e. the set of all values for which \(\int_{E_n} dP_\theta\) converges absolutely. It is known (see \((^1)\)) that this set is convex.
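
As an illustration (not taken from the original text), let the sample consist of \(s\) independent groups, the \(j\)-th containing \(n_j\) independent exponential observations with rate \(\theta_j>0\), and let \(T_j\) be the sum of the observations in the \(j\)-th group. Each \(T_j\) then has a gamma distribution, and the family takes the form (1) with

\[ C(\theta_1,\ldots,\theta_s)=\prod_{j=1}^{s}\theta_j^{n_j},\qquad p(T_1,\ldots,T_s)=\prod_{j=1}^{s}\frac{T_j^{n_j-1}}{(n_j-1)!}\quad (T_j>0\ \text{for all } j), \]

and \(p=0\) otherwise. In this example the natural parameter set is the open orthant \(\theta_1>0,\ldots,\theta_s>0\), which is indeed convex.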

The usual statistical problems with nuisance parameters for the family (1) reduce, under a suitable choice of sufficient statistics and parametrization, to testing whether the parameters \(\theta_1,\ldots,\theta_s\) satisfy a finite number of polynomial relations. For example, the well-known Behrens–Fisher problem reduces to testing the hypothesis \(H_0:\theta_1\theta_4-\theta_2\theta_3=0\).
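
A sketch of how the Behrens–Fisher relation can arise; the labeling of the parameters below is an assumption, chosen to agree with the stated \(H_0\), and is not spelled out in the text. For two independent normal samples with means \(\mu_1,\mu_2\) and variances \(\sigma_1^2,\sigma_2^2\), the joint density of the observations equals a factor depending only on the parameters times \(\exp\{-(\theta_1T_1+\theta_2T_2+\theta_3T_3+\theta_4T_4)\}\), where \(T_1,T_2\) are the sum and the sum of squares of the first sample, \(T_3,T_4\) those of the second, and

\[ \theta_1=-\frac{\mu_1}{\sigma_1^2},\quad \theta_2=\frac{1}{2\sigma_1^2},\quad \theta_3=-\frac{\mu_2}{\sigma_2^2},\quad \theta_4=\frac{1}{2\sigma_2^2}. \]

Since \(\theta_1/\theta_2=-2\mu_1\) and \(\theta_3/\theta_4=-2\mu_2\), equality of the means is equivalent to

\[ \mu_1=\mu_2\iff \theta_1\theta_4-\theta_2\theta_3=0, \]

while the variances remain as nuisance parameters.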

Let us dwell on the hypotheses that we shall test. Let \(\Omega_0\subset\Omega\) be a bounded parallelepiped. The hypothesis \(H_0\) will consist in the assertion that among the parameters \(\theta_1,\ldots,\theta_s\) there exist \(r<s\) relations:

\[ P_1(\theta_1,\ldots,\theta_s)=0;\quad P_2(\theta_1,\ldots,\theta_s)=0;\ \ldots;\quad P_r(\theta_1,\ldots,\theta_s)=0, \tag{2} \]

where \(P_i(\theta_1,\ldots,\theta_s)\) are polynomials with real coefficients, homogeneous and absolutely irreducible (i.e. irreducible over the field \(C\) of all complex numbers).

Tests of the hypothesis \(H_0\) against any simple or composite alternatives \(H_1\) (of course, also concerning the behavior of the parameters \(\theta_1,\ldots,\theta_s\)) will be conducted on a single sample corresponding to the family (1), with parameters \((\theta_1,\ldots,\theta_s)\in\Omega_0\). In doing so we shall use test functions \(\phi\) (tests) subject to the conditions:

A. The test \(\phi\) may be randomized.

B. The test \(\phi\) is defined in the space of sufficient statistics:

\[ \phi=\phi(T_1,\ldots,T_s). \]

C. The test \(\phi\) excludes nuisance parameters, i.e. is a similar test under the hypothesis \(H_0\):

\[ E(\phi\mid H_0)=\alpha, \tag{3} \]

where \(\alpha\in(0,1)\) is the level of the test, independent of the parameters.
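
Written out with the probability element (1), condition (3) reads as follows (a direct expansion, with \(\theta=(\theta_1,\ldots,\theta_s)\) and \(T=(T_1,\ldots,T_s)\) used as shorthand):

\[ C(\theta)\int\cdots\int \phi(T)\,\exp\{-(\theta_1T_1+\cdots+\theta_sT_s)\}\,p(T)\,dT_1\cdots dT_s=\alpha \]

for every \(\theta\in\Omega_0\) satisfying the relations (2). The same number \(\alpha\) must thus be obtained at every parameter point of \(H_0\), whatever the values of the nuisance parameters.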

We shall consider here the problem of describing such tests as completely as possible and of selecting from among them those that are optimal in one sense or another.

One can give a number of weighty statistical arguments to justify restrictions A, B, C, but we shall not dwell on them here.

Apparently, in the most general formulation one should consider not only tests $\phi$ without these restrictions, but also “generalized tests,” defined not as functions of a point of the sample space, but as functionals on families of sufficiently smooth “basic functions” of the L. Schwartz type, specified on the sample space. Such a point of view has its advantages, especially for exponential families. However, we shall consider this question in another paper.

The question of the optimal choice of the test $\phi$, as is usual in statistics, will depend on what kind of optimization is assumed. In particular, when the alternative $H_1$ is a simple hypothesis, or, more generally, when we have a Bayesian situation with alternatives subject to a prior distribution with probability element $dB(\theta)$, vanishing on $\Omega_0$ in some neighborhood of $H_0$, it is natural to require the condition:

\[ E(\phi \mid H_1)=\int_{E_n\times\Omega_0}\cdots\int \phi\,dP_\theta\,dB(\theta)=\max, \tag{4} \]

where the maximum is taken over all tests $\phi$ satisfying conditions A, B, C (henceforth we shall call such tests admissible). Of course, such a maximum may also fail to exist; therefore, putting

\[ \beta=\sup_{\phi} E(\phi\mid H_1) \tag{5} \]

under the same conditions on the tests $\phi$, we may consider the problem of constructing, for any $\varepsilon>0$, an admissible test $\phi_\varepsilon$ for which $E(\phi_\varepsilon\mid H_1)\geqslant \beta-\varepsilon$. Such tests will be called $\varepsilon$-optimal, and henceforth we shall deal with the problem of constructing $\varepsilon$-optimal admissible tests.

Let us proceed to describe the solution of this problem for the class of exponential families satisfying certain additional requirements (exponential families of type $\mathscr E_0$).

Let the density $p(T_1,\ldots,T_s)$ satisfy the condition:

\[ p(T_1,\ldots,T_s)<\exp\{A_0(|T_1|+\cdots+|T_s|)\} \tag{6} \]

for $|T_i|\geqslant 1$ ($A_0>0$ is a constant).

Such a condition ensures that the set $\Omega$ of parameter values is nondegenerate in the space $E_s$. Further, suppose that the set where $p(T_1,\ldots,T_s)$ vanishes consists of those and only those points $(T_1,\ldots,T_s)$ at which at least one of the $k$ inequalities

\[ T_i\leqslant c_i, \tag{7} \]

where $i\leqslant k$ and $c_i$ are finite numbers, is satisfied. We shall also require that the inequality

\[ p(T_1,\ldots,T_s)>\left|(T_1-c_1)\cdots(T_k-c_k)\right|^{K_0} \tag{8} \]

be satisfied for sufficiently small values of $T_i-c_i$; $K_0>0$ is a constant.

Such families will be called families of type $\mathscr E_0$. They include most of the known problems of testing hypotheses with nuisance parameters in exponential families. In what follows, we shall assume that we are dealing with the indicated family $\mathscr E_0$; the theorem stated below will also apply to it.
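
As a check of these conditions on a concrete case (an illustration, not an example from the original text), consider the density of $s$ independent gamma-distributed statistics with integer shape parameters $n_j\geqslant 1$:

\[ p(T_1,\ldots,T_s)=\prod_{j=1}^{s}\frac{T_j^{n_j-1}}{(n_j-1)!}\quad\text{for } T_1>0,\ldots,T_s>0,\qquad p=0\ \text{otherwise}. \]

It vanishes exactly where some $T_j\leqslant 0$, which is (7) with $k=s$ and $c_i=0$. Since $T^{n-1}/(n-1)!<e^{T}$ for $T>0$, condition (6) holds with $A_0=1$. Finally, for $K_0=\max_j n_j$ each factor satisfies $T_j^{n_j-1}/(n_j-1)!>T_j^{K_0}$ as soon as $0<T_j<1/(n_j-1)!$, which gives (8).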

Along with admissible tests $\phi$ it is convenient to consider cotests. A cotest for a given test $\phi$ will be a collection of functions of the form

\[ \psi=K(\phi-\alpha), \tag{9} \]

where $\alpha$ is the level of the test (the right-hand side of relation (3)), and $K$ is any constant. For every cotest we have:

\[ E(\psi\mid H_0)=0. \tag{10} \]
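
Relation (10) follows directly from (3) and (9) by linearity of the expectation:

\[ E(\psi\mid H_0)=K\bigl(E(\phi\mid H_0)-\alpha\bigr)=K(\alpha-\alpha)=0. \]

Conversely, given a cotest $\psi$ with $K\neq 0$, the test is recovered as $\phi=\alpha+\psi/K$.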

To describe the \(\varepsilon\)-optimal tests it is sufficient to describe the corresponding \(\varepsilon\)-optimal cotests. Consider the absolutely irreducible polynomials (2) and multiply them together. We obtain a homogeneous polynomial \(\Pi(\theta_1,\ldots,\theta_s)\) of degree \(h\). Divide it by \((\theta_1\cdots\theta_s)^{h+2}\). The result is a polynomial \(F\left(\dfrac{1}{\theta_1},\ldots,\dfrac{1}{\theta_s}\right)\) in the reciprocals of the parameters.

In it, replace each power \(\theta_j^{-s_j}\) by \((-1)^{s_j-1}T_j^{s_j-1}\). We obtain a new polynomial \(F_0(T_1,\ldots,T_s)\).
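
A worked instance of this construction, following the recipe as stated and reading the divisor as the product of the parameters \(\theta_1\cdots\theta_s\) raised to the power \(h+2\) (so that the quotient is a polynomial in the \(1/\theta_j\)); the example itself is not from the original text. For the single Behrens–Fisher relation \(\Pi=\theta_1\theta_4-\theta_2\theta_3\) we have \(s=4\), \(h=2\), and

\[ F=\frac{\theta_1\theta_4-\theta_2\theta_3}{(\theta_1\theta_2\theta_3\theta_4)^{4}}=\theta_1^{-3}\theta_2^{-4}\theta_3^{-4}\theta_4^{-3}-\theta_1^{-4}\theta_2^{-3}\theta_3^{-3}\theta_4^{-4}. \]

The substitution \(\theta_j^{-s_j}\to(-1)^{s_j-1}T_j^{s_j-1}\) turns \(\theta_j^{-3}\) into \(T_j^{2}\) and \(\theta_j^{-4}\) into \(-T_j^{3}\), so that

\[ F_0=T_1^{2}T_2^{3}T_3^{3}T_4^{2}-T_1^{3}T_2^{2}T_3^{2}T_4^{3}=(T_1T_2T_3T_4)^{2}\,(T_2T_3-T_1T_4). \]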

Theorem. For every \(\varepsilon>0\) one can effectively specify a bounded parallelepiped \(\mathfrak{S}_{\varepsilon}\) in the space of sufficient statistics \((T_1,\ldots,T_s)\) such that \(\varepsilon\)-optimal cotests are found among the family of cotests defined by the formula:

\[ \Psi=F_0*H=\int_{\mathfrak{S}_{\varepsilon}}\cdots\int F_0(T_1-x_1,\ldots,T_s-x_s)\,H(x_1,\ldots,x_s)\,dx_1\cdots dx_s, \tag{11} \]

where \(H\) is any function in \(L_1\) on \(\mathfrak{S}_{\varepsilon}\).

By virtue of this theorem, the search for \(\varepsilon\)-optimal tests is reduced to a variational problem for the function \(H\): it is sought under the conditions

\[ \int_{E_n\times\Omega_0}\cdots\int (F_0*H)\,dB(\theta)=\max; \]

\[ -\alpha\,p(T_1,\ldots,T_s)\leq F_0*H\leq (1-\alpha)\,p(T_1,\ldots,T_s). \tag{12} \]
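
The two-sided bound in (12) can be understood as follows; this is one reading consistent with the formula, under the assumption that the cotest is taken here in the weighted form \(\Psi=(\phi-\alpha)\,p\). A test function satisfies \(0\le\phi\le 1\), hence

\[ -\alpha\le\phi-\alpha\le 1-\alpha, \]

and multiplying by \(p(T_1,\ldots,T_s)\ge 0\) gives the inequalities in (12). Wherever \(p>0\), the test is then recovered from a solution \(H\) of the variational problem as \(\phi=\alpha+(F_0*H)/p\).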

The method of proof was suggested to the author by the very interesting work of R. A. Wijsman \((^2)\), where special cases of the construction of such tests are given. The proof of the theorem is based on the construction and study of ideals formed by cotests.

Received
13 IV 1964

REFERENCES

\(^1\) E. Lehmann, Testing Statistical Hypotheses, N. Y., 1959.
\(^2\) R. A. Wijsman, Ann. Math. Statistics, 29, No. 4 (1958).
