UDC 519.281
MATHEMATICS
A. M. KAGAN
THE SAMPLE MEAN AS AN ESTIMATE OF A SHIFT PARAMETER
(Presented by Academician Yu. V. Linnik on 24 XI 1965)
1°. Consider a family \(\{F(x-\theta)\}\) of one-dimensional distribution functions (d.f.'s) depending on a shift parameter \(\theta \in R_1\); we shall assume that
\[ \int x\,dF(x)=0, \tag{1} \]
\[ \int x^2\,dF(x)<\infty . \tag{2} \]
From condition (1) it follows that
\[ \int x\,dF(x-\theta)=\theta, \]
so that the parameter \(\theta\) is equal to the mean value of the population with d.f. \(F(x-\theta)\).
Let \((x_1,\ldots,x_n)\) be a random sample (independent observations) from the population with d.f. \(F(x-\theta)\). It is well known that the sample mean \(\bar{x}=(x_1+\cdots+x_n)/n\) is then an unbiased estimate of \(\theta\) whose variance is minimal in the class of unbiased estimates of the form \(\sum_{1}^{n} c_i x_i\). This result goes back to Gauss and Markov.
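The minimality claim can be checked in one line. The following is a sketch under conditions (1) and (2), writing \(\sigma^2=\int x^2\,dF(x)\) for the variance of \(F\) (notation introduced here, not in the original at this point). Unbiasedness of \(\sum_1^n c_i x_i\) for all \(\theta\) forces \(\sum_1^n c_i=1\), and then
\[ \operatorname{Var}_\theta\Bigl(\sum_{1}^{n} c_i x_i\Bigr)=\sigma^2\sum_{1}^{n} c_i^2=\sigma^2\Bigl[\frac{1}{n}+\sum_{1}^{n}\Bigl(c_i-\frac{1}{n}\Bigr)^2\Bigr]\geqslant\frac{\sigma^2}{n}, \]
with equality only for \(c_1=\cdots=c_n=1/n\), i.e., for \(\bar{x}\).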
On the other hand, for the family of normal distributions \(\{\Phi(x-\theta)\}\), where
\[ \Phi(x)=\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{x} e^{-u^2/2\sigma^2}\,du, \]
\(\bar{x}\) is an absolutely best unbiased estimate of \(\theta\), i.e., it has minimal variance in the class of all unbiased estimates of \(\theta\). This fact, also well known, follows from the Rao-Blackwell-Kolmogorov theorem and the completeness of the sufficient statistic \(\bar{x}\) for the family \(\{\Phi(x_1-\theta)\cdots\Phi(x_n-\theta)\}\) \((^1)\). Recently Linnik, Rao, and the author established that this property of \(\bar{x}\) characterizes the normal family. Namely, in \((^2)\) it is proved that if for some \(n \geqslant 3\) the mean \(\bar{x}\) is an admissible estimate of the shift parameter in the class of unbiased estimates, then the family is normal. In \((^2)\), as in the present work, the quality of an estimate of the parameter \(\theta\) is measured by its variance, and the notions of admissibility and optimality of an estimate refer to this measure of quality.
2°. Yu. V. Linnik posed the problem of constructing classes \(P_1 \subset P_2 \subset \cdots \subset P_k \subset \cdots\) of unbiased estimates of a shift parameter such that the optimality of \(\bar{x}\) in the class \(P_k\) would be equivalent to the “closeness of order \(k\)” of the family to some family of normal distributions. The class \(P_1\) must coincide with the class of linear estimates; naturally it is also desirable that the optimality of \(\bar{x}\) in the class \(P_\infty=\bigcup_{1}^{\infty}P_k\) be equivalent to normality of the family.
Let us note first of all the following:
1) Every linear unbiased estimator of the parameter \(\theta\) can be written in the form
\[ \sum_{1}^{n} c_i x_i=\bar{x}+\sum_{2}^{n} a_j(x_j-x_1). \]
2) If, for \(n \geqslant 3\), \(\bar{x}\) turns out to be an admissible estimator of \(\theta\) in the class of unbiased estimators of the form \(\bar{x}+\psi(x_2-x_1,\ldots,x_n-x_1)\), then the family is normal. In essence, precisely this result was proved in \((^2)\).
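Remark 1) can be verified directly; the identification of the coefficients below is a reconstruction and is not stated in the original. Unbiasedness gives \(\sum_1^n c_i=1\). Putting \(a_j=c_j-1/n\) for \(j=2,\ldots,n\), we have \(\sum_2^n a_j=1/n-c_1\), so that
\[ \bar{x}+\sum_{2}^{n} a_j(x_j-x_1)=\frac{x_1}{n}+\sum_{2}^{n} c_j x_j-x_1\Bigl(\frac{1}{n}-c_1\Bigr)=\sum_{1}^{n} c_i x_i. \]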
These two remarks motivate choosing, as the class \(P_k\), the set of all estimators of the form
\[ \begin{aligned} \bar{x}+\psi(x_2-x_1,\ldots,x_n-x_1) &=\bar{x}+c+\sum_j a_j(x_j-x_1) \\ &\quad+\sum_{j_1,j_2} a_{j_1j_2}(x_{j_1}-x_1)(x_{j_2}-x_1)+\cdots \\ &\quad+\sum_{j_1,\ldots,j_k} a_{j_1\ldots j_k}(x_{j_1}-x_1)\cdots(x_{j_k}-x_1), \end{aligned} \tag{3} \]
where the summation in \(\sum_{j_1,\ldots,j_k}\) is carried out independently over \(j_1,\ldots,j_k\) from \(2\) to \(n\), and the coefficients \(c,\{a_j\},\{a_{j_1j_2}\},\ldots,\{a_{j_1\ldots j_k}\}\) satisfy the single condition ensuring unbiasedness of the estimator \(\bar{x}+\psi(x_2-x_1,\ldots,x_n-x_1)\):
\[ E_0\psi(x_2-x_1,\ldots,x_n-x_1)=0. \tag{4} \]
By the symbol \(E_0\) we denote the mathematical expectation corresponding to \(\theta=0\), i.e., to the d.f. \(F(x)\).
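Why the single condition (4) suffices can be sketched as follows (a reconstruction of a step the text leaves implicit). The differences \(x_j-x_1\) do not change when \(\theta\) is subtracted from every observation, so the distribution of \(\psi(x_2-x_1,\ldots,x_n-x_1)\) is the same for all \(\theta\), while \(E_\theta\bar{x}=\theta\). Hence
\[ E_\theta\bigl[\bar{x}+\psi(x_2-x_1,\ldots,x_n-x_1)\bigr]=\theta+E_0\psi(x_2-x_1,\ldots,x_n-x_1), \]
and the estimator is unbiased for every \(\theta\in R_1\) if and only if (4) holds.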
If the family has a finite moment of order \(2k\),
\[ \mu_{2k}=\int x^{2k}\,dF(x)<\infty, \tag{5} \]
then the set of all functions \(\psi(x_2-x_1,\ldots,x_n-x_1)\) of the form (3) satisfying condition (4) forms a Hilbert space with the ordinary scalar product \((\psi_1,\psi_2)=E_0(\psi_1\psi_2)\). We shall denote this space by \(\Lambda_k\).
3°. Theorem. If the family \(\{F(x-\theta)\}\) satisfies condition (5), then, for \(n \geqslant 3\), \(\bar{x}\) will be an admissible estimator of the parameter \(\theta\) in the class \(P_k\) of unbiased estimators of the form (3) if and only if the first \((k+1)\) moments of the d.f. \(F(x)\) coincide with the corresponding moments of some normal distribution.
The proof of this theorem proceeds according to the following scheme. Consider the estimator
\[ \varphi(x_1,\ldots,x_n)\equiv\varphi=\bar{x}-\hat{E}_0(\bar{x}\mid\Lambda_k), \tag{6} \]
where \(\hat{E}_0(\cdot\mid\Lambda_k)\) is the projection operator onto \(\Lambda_k\) (conditional mathematical expectation in the broad sense). Estimator (6) is an analogue of the Pitman estimator of the shift parameter \((^3)\).
It is easy to establish:
Lemma 1.
\[ E_\theta\varphi=\theta,\qquad E_\theta(\varphi-\theta)^2\leqslant E_\theta(\bar{x}-\theta)^2, \]
where equality or strict inequality holds simultaneously for all \(\theta\in R_1\); equality holds if and only if \(\hat{E}_0(\bar{x}\mid\Lambda_k)=0\).
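A sketch of why Lemma 1 holds, writing \(\hat{\psi}=\hat{E}_0(\bar{x}\mid\Lambda_k)\) (notation introduced here for brevity). Since \(\hat{\psi}\) depends only on the differences \(x_j-x_1\), the pair \((\bar{x}-\theta,\hat{\psi})\) has the same joint distribution under \(F(x-\theta)\) as \((\bar{x},\hat{\psi})\) has under \(F(x)\); in particular \(E_\theta\varphi=\theta+E_0\bar{x}-E_0\hat{\psi}=\theta\) by (1) and (4). Because \(\bar{x}-\hat{\psi}\) is orthogonal to \(\Lambda_k\) and \(\hat{\psi}\in\Lambda_k\), we have \(E_0[(\bar{x}-\hat{\psi})\hat{\psi}]=0\), whence
\[ E_\theta(\varphi-\theta)^2=E_0(\bar{x}-\hat{\psi})^2=E_0\bar{x}^2-E_0\hat{\psi}^2=E_\theta(\bar{x}-\theta)^2-E_0\hat{\psi}^2. \]
The gap \(E_0\hat{\psi}^2\) does not depend on \(\theta\) and vanishes if and only if \(\hat{\psi}=0\) almost surely.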
By direct calculation one proves
Lemma 2. If \(n\geqslant 3\), then \(\hat{E}_0(\bar{x}\mid\Lambda_k)=0\) if and only if the first \((k+1)\) moments of the d.f. \(F(x)\) coincide with the corresponding moments of some normal law.
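The direct calculation is not reproduced in the paper; for the lowest orders it can be sketched as follows, writing \(\sigma^2=\mu_2\) and \(\mu_3=\int x^3\,dF(x)\) (notation introduced here) and using only independence and condition (1). Since \(E_0\bar{x}=0\), the projection vanishes exactly when \(E_0[\bar{x}\,m]=0\) for every monomial \(m\) of degree at most \(k\) in the differences \(x_j-x_1\). For distinct \(j,l\geqslant 2\),
\[ \begin{aligned} E_0[\bar{x}(x_j-x_1)] &=\frac{\sigma^2-\sigma^2}{n}=0, \\ E_0[\bar{x}(x_j-x_1)^2] &=\frac{2\mu_3}{n},\qquad E_0[\bar{x}(x_j-x_1)(x_l-x_1)]=\frac{\mu_3}{n}, \\ E_0[\bar{x}(x_j-x_1)^3] &=0,\qquad E_0[\bar{x}(x_j-x_1)^2(x_l-x_1)]=\frac{3\sigma^4-\mu_4}{n}. \end{aligned} \]
Thus degree 1 imposes no condition, degree 2 requires \(\mu_3=0\), and degree 3 requires \(\mu_4=3\sigma^4\), the fourth moment of the normal law with mean \(0\) and variance \(\sigma^2\). The last condition arises only from monomials involving two distinct indices \(j\neq l\), which exist only for \(n\geqslant 3\).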
The theorem follows immediately from Lemmas 1 and 2.
Remark 1. The first two moments of any d.f. satisfying condition (2) coincide with the corresponding moments of some normal law. This corresponds to the fact that in the class \(P_1\) of estimates of the form \(\bar{x}+\sum_j a_j(x_j-x_1)\), the mean \(\bar{x}\) is always admissible.
If it is assumed that for all \(k=1,2,\ldots\)
\[
\int x^k\,dF(x)<\infty
\]
and \(\bar{x}\) is admissible in each of the classes \(P_k\), i.e., admissible in the class \(P_\infty=\bigcup_1^\infty P_k\), then it follows from the theorem that all moments of the d.f. \(F(x)\) coincide with the corresponding moments of some normal distribution. Since a normal law is uniquely determined by its moments, \(F(x)\) is then the d.f. of a normal law.
Remark 2. It is easy to see that if \(\bar{x}\) is admissible in the class \(P_k\), then it is also the best estimate in this class.
The case \(n=2\), as in \((^2)\), turns out to be special. Namely, it can be shown that for \(n=2\) the admissibility of \(\bar{x}\) in the class \(P_k\) is equivalent to the vanishing of the odd moments of the d.f. \(F(x)\) up to order \(k+1\); no restrictions on the even moments are imposed by the admissibility of \(\bar{x}\) in the class \(P_k\) when \(n=2\).
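The contrast between \(n=2\) and \(n\geqslant 3\) can be checked exactly on a small example. The following Python sketch is an illustration only and not part of the paper: the two-point law and the helper names `expect` and `inner_with_mean` are assumptions made here. It enumerates all samples from a law taking the values \(\pm 1\) with probability \(1/2\) each (mean \(0\), variance \(1\), third moment \(0\), fourth moment \(1\) instead of the normal value \(3\)) and computes \(E_0[\bar{x}\,m]\) for cubic monomials \(m\) in the differences.

```python
from fractions import Fraction
from itertools import product


def expect(dist, n, g):
    """Exact E_0[g(x_1, ..., x_n)] for n i.i.d. draws from a finite law.

    dist is a list of (value, probability) pairs; Fractions keep it exact.
    """
    total = Fraction(0)
    for combo in product(dist, repeat=n):
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        total += prob * g([v for v, _ in combo])
    return total


def inner_with_mean(dist, n, monomial):
    """E_0[x_bar * monomial(x)], the scalar product of x_bar with a monomial."""
    return expect(dist, n, lambda xs: sum(xs) / n * monomial(xs))


# Hypothetical example law: values -1 and +1, each with probability 1/2.
two_point = [(Fraction(-1), Fraction(1, 2)), (Fraction(1), Fraction(1, 2))]

# n = 2: the only cubic monomial in the differences is (x_2 - x_1)^3.
cubic_n2 = inner_with_mean(two_point, 2, lambda xs: (xs[1] - xs[0]) ** 3)

# n = 3: the mixed cubic monomial (x_2 - x_1)^2 (x_3 - x_1) is available.
cubic_n3 = inner_with_mean(
    two_point, 3, lambda xs: (xs[1] - xs[0]) ** 2 * (xs[2] - xs[0])
)

print(cubic_n2, cubic_n3)  # prints: 0 2/3
```

For \(n=2\) the scalar product is \(0\), so the non-normal fourth moment goes undetected; for \(n=3\) it equals \(2/3\neq 0\), so \(\bar{x}\) is not orthogonal to \(\Lambda_3\) for this law.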
The estimate (6) is also of interest in itself; problems can be posed for it in the spirit of those considered in papers \((^2,^4)\).
Leningrad Branch of the V. A. Steklov Mathematical Institute of the Academy of Sciences of the USSR
Received 22 XI 1965
REFERENCES
\(^1\) E. Lehmann, Testing Statistical Hypotheses, 1964.
\(^2\) A. M. Kagan, Yu. V. Linnik, C. R. Rao, Sankhya, 1965.
\(^3\) E. Pitman, Biometrika, 30, III–IV (1939).
\(^4\) C. Stein, Ann. Math. Stat., 30, 4 (1959).