MATHEMATICS
Corresponding Member of the Academy of Sciences of the USSR Yu. V. LINNIK
REMARKS ON THE FISHER–WELCH–WALD TEST
The well-known Fisher–Welch–Wald test (see the literature in (¹) and the paper of A. Wald (²)) tests the hypothesis \(H_0\) that the means of two independent normal random samples \(x_1,\ldots,x_n \in N(a_1,\sigma_1)\); \(y_1,\ldots,y_n \in N(a_2,\sigma_2)\) are equal. It is nonrandomized, and its critical region has the form
\[ \frac{|\bar{x}-\bar{y}|}{\sqrt{s_1^2+s_2^2}}\geq \Phi\left(\frac{s_1}{s_2}\right), \tag{1} \]
where \(\bar{x}, \bar{y}, s_1, s_2\) denote, as usual, the sufficient statistics for the four parameters \(a_1, a_2, \sigma_1, \sigma_2\). Here all parameters are unknown and in general \(\sigma_1\ne\sigma_2\); the sample sizes are taken to be equal; \(\Phi(x)\) is a single-valued Lebesgue-measurable function for \(x\geq 0\).
The statistical literature has repeatedly considered whether the function \(\Phi(x)\) can be chosen so that the test (1) is similar with respect to all values of \(\sigma_1\) and \(\sigma_2\). An attempt to obtain such a function \(\Phi(x)\) by a formal series expansion was made by B. L. Welch (³); A. Wald (²) constructed an analytic function \(\Phi(x)\) for which the test (1) is approximately similar over a certain interval of values of \(\sigma_1/\sigma_2\).
The question of the existence of a similar test (1) with a measurable function \(\Phi(x)\) remains open and presents an interesting problem of analytic statistics. The author’s results (⁴) show that, in the class of functions \(\Phi(x)\) having a finite first derivative \(\Phi'(x)\) for \(x\in(0,1)\) and satisfying the Lipschitz condition \(\operatorname{Lip}^{(1)}\) on a sufficiently large segment \([0,\eta_0]\), nonrandomized similar tests of the form (1) do not exist (except, of course, for the trivial test obtained when \(\Phi(x)\equiv 0\)). This makes it plausible that no similar test (1) exists in the general case of measurable \(\Phi(x)\) either. In this connection it is curious to note that if one admits randomization inside the critical region (1), then one can obtain a very simple randomized similar test of type (1), quite satisfactory with respect to power. In this sense the situation is analogous to a well-known fact in the theory of two-person zero-sum games: a solution of the game may fail to exist in pure strategies, yet under very general conditions it exists in mixed strategies.
We briefly indicate the construction of the randomized test mentioned. It is very simple. The well-known nonrandomized test of M. Bartlett (⁵) for testing the above \(H_0\) has a critical region of the form
\[ \frac{|\bar{x}-\bar{y}|}{\left(\sum_{i=1}^{n}\left[(x_i-\bar{x})-(y_i-\bar{y})\right]^2\right)^{1/2}}\geq C_0. \tag{2} \]
It is, obviously, similar with respect to \(\sigma_1\) and \(\sigma_2\) and is computed with the aid of Student’s distribution. Denote by \(\chi(x_1,\ldots,x_n;y_1,\ldots,y_n)\) the indicator function of the critical region of the test (2), and consider the “projection” of the function \(\chi\) onto the space of sufficient statistics:
\[ \psi(\bar{x},\bar{y},s_1,s_2)=E(\chi\mid \bar{x},\bar{y},s_1,s_2). \tag{3} \]
We thus obtain a new test \(\psi\), which is now randomized but is still similar and has the same level. Being an average of the original test over fixed values of the sufficient statistics, it is no worse with respect to power than the original test (2), which, as is known, is quite satisfactory \((^5)\).
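The reduction of test (2) to Student's distribution can be illustrated with a minimal sketch. It assumes that the denominator of (2) is the sum of squared centered differences \(\sum[(x_i-\bar{x})-(y_i-\bar{y})]^2\), which is what the form (4) given below requires. The function names and the sample data are hypothetical, chosen only for illustration.

```python
import math

def centered_sums(x, y):
    """Return (xbar, ybar, Sxx, Syy, Sxy) for two samples of equal size."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return xbar, ybar, sxx, syy, sxy

def bartlett_statistic(x, y):
    """Left-hand side of (2): |xbar - ybar| over the root of the
    sum of squared centered differences, expanded as Sxx - 2*Sxy + Syy."""
    xbar, ybar, sxx, syy, sxy = centered_sums(x, y)
    return abs(xbar - ybar) / math.sqrt(sxx - 2.0 * sxy + syy)

def paired_t(x, y):
    """Classical paired Student statistic on d_i = x_i - y_i (n - 1 d.f.)."""
    n = len(x)
    d = [xi - yi for xi, yi in zip(x, y)]
    dbar = sum(d) / n
    var = sum((di - dbar) ** 2 for di in d) / (n - 1)
    return dbar / math.sqrt(var / n)

# Illustrative data (not from the paper).
x = [1.2, 0.4, 2.9, 1.7]
y = [0.3, 1.1, 0.8, 2.5]
n = len(x)
# The two printed numbers agree up to rounding: |t| = sqrt(n(n-1)) * statistic (2).
print(bartlett_statistic(x, y) * math.sqrt(n * (n - 1)), abs(paired_t(x, y)))
```

Under these assumptions the constant \(C_0\) for a given level is the corresponding two-sided Student critical value with \(n-1\) degrees of freedom divided by \(\sqrt{n(n-1)}\).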
Put \(\xi=\dfrac{\bar{x}-\bar{y}}{s_2}\), \(\eta=\dfrac{s_1}{s_2}\). Then test (2) takes the form
\[ \frac{\xi^2}{\eta^2-2r\eta+1}\geq C_1^2 \tag{4} \]
(\(C, C_0, C_1,\ldots\) are positive constants). Here
\[ r=\left[\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})\right]\Bigg/\left[\left(\sum_{i=1}^{n}(x_i-\bar{x})^2\sum_{i=1}^{n}(y_i-\bar{y})^2\right)^{1/2}\right], \]
is the sample correlation coefficient of the normal samples \(x_1,\ldots,x_n;\ y_1,\ldots,y_n\). As is known (see \((^5)\), p. 163), the random variable \(r\) is stochastically independent of the random vector \((\bar{x},\bar{y},s_1,s_2)\), and hence also of the vector \((\xi,\eta)\). Since the population correlation coefficient here is equal to 0, \(r\) has probability density
\[ f_n(r)= \frac{\Gamma\left(\dfrac{n-1}{2}\right)} {\Gamma\left(\dfrac{n-2}{2}\right)\sqrt{\pi}} (1-r^2)^{(n-4)/2},\qquad -1\leq r\leq 1. \tag{5} \]
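As a quick numerical check of (5), the following sketch evaluates the density and integrates it with a simple midpoint rule. The names `f_n` and `integrate` are illustrative, and the sketch is restricted to \(n\geq 4\) so that the density stays bounded at \(r=\pm 1\).

```python
import math

def f_n(r, n):
    """Density (5) of the sample correlation coefficient (rho = 0), n >= 4."""
    if abs(r) > 1.0:
        return 0.0
    const = math.gamma((n - 1) / 2) / (math.gamma((n - 2) / 2) * math.sqrt(math.pi))
    return const * (1.0 - r * r) ** ((n - 4) / 2)

def integrate(g, a, b, steps=20000):
    """Midpoint rule; adequate for the smooth integrands used here."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

for n in (4, 5, 6):
    # Each total mass should be close to 1.
    print(n, integrate(lambda r: f_n(r, n), -1.0, 1.0))
```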
Taking the above into account, it is easy to carry out the computation of the “projection” (3). One obtains a test with a critical region of the form
\[ \frac{|\bar{x}-\bar{y}|}{\sqrt{s_1^2+s_2^2}} \geq C\, \frac{|s_1/s_2-1|}{\sqrt{1+(s_1/s_2)^2}} \tag{6} \]
or, in the coordinates \(\xi=\dfrac{\bar{x}-\bar{y}}{s_2}\), \(\eta=\dfrac{s_1}{s_2}\),
\[ |\xi|\geq C|\eta-1|,\qquad \eta\geq 0. \tag{7} \]
Here \(C> 0\) is the constant determined by the level (it coincides with \(C_1\) in (4)). The critical region (6) (or (7)) is randomized in the following way: if the observed point falls outside the critical region, we accept the null hypothesis \(H_0\); if it falls inside, we form the function
\[ \tau(\xi,\eta)=\frac12\left(\eta+\frac1\eta-\frac1{C^2}\frac{\xi^2}{\eta}\right). \tag{8} \]
Inside the critical region \(\tau(\xi,\eta)\leq 1\). Put
\[ \Phi(\xi,\eta)= \int_{\max(\tau(\xi,\eta),-1)}^{1} f_n(r)\,dr \tag{9} \]
and reject the null hypothesis \(H_0\) with probability \(\Phi(\xi,\eta)\). We obtain a similar test of the same level as the previous one and, as stated above, no worse with respect to power.
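The origin of (8) and (9) can be made explicit. Taking \(C_1=C\) in (4) (an identification implied by (7) but not spelled out in the text) and solving the inequality for \(r\) gives \(r\geq\tau(\xi,\eta)\), so (9) is the conditional probability, given \((\xi,\eta)\), that test (2) rejects. A minimal Python sketch of this randomization step follows. The function names and the midpoint integration are illustrative choices, and the sketch assumes \(\eta>0\) and \(n\geq 4\).

```python
import math
import random

def f_n(r, n):
    """Density (5) of the sample correlation coefficient (rho = 0), n >= 4."""
    const = math.gamma((n - 1) / 2) / (math.gamma((n - 2) / 2) * math.sqrt(math.pi))
    return const * (1.0 - r * r) ** ((n - 4) / 2)

def tau(xi, eta, c):
    """Function (8); requires eta > 0."""
    return 0.5 * (eta + 1.0 / eta - xi * xi / (c * c * eta))

def rejection_probability(xi, eta, c, n, steps=4000):
    """Integral (9): P(r >= tau), by the midpoint rule on [max(tau, -1), 1]."""
    t = tau(xi, eta, c)
    if t >= 1.0:          # outside the critical region (7): never reject
        return 0.0
    lo = max(t, -1.0)
    h = (1.0 - lo) / steps
    return h * sum(f_n(lo + (k + 0.5) * h, n) for k in range(steps))

def randomized_decision(xi, eta, c, n, rng=random):
    """Return True (reject H0) with probability Phi(xi, eta)."""
    return rng.random() < rejection_probability(xi, eta, c, n)

# For n = 4 the density is 1/2, so Phi = (1 - max(tau, -1)) / 2 inside (7).
print(rejection_probability(1.0, 1.0, 1.0, 4))
```

Points with \(\tau\leq -1\) are rejected with probability 1, so randomization actually takes place only in the part of the region (7) where \(-1<\tau<1\).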
In the quadrant \(\xi\geq 0\), \(\eta\geq 0\), the boundary of the randomized critical region (7) is an angle formed by two straight lines, with vertex at the point \(\xi=0\), \(\eta=1\) and with its bisector parallel to the axis \(O\xi\). Let us also note that (7) can be written in the form
\[ \left|\frac{\bar{x}-\bar{y}}{s_1-s_2}\right|\geq C. \]
If the sample sizes are \(n = 4\), then (5) gives \(f_n(r) = 1/2\) for \(|r| \leq 1\), so that in this case the density used for the random draw is constant. Let us also note that (9) can be computed from tables of Student’s distribution with \(n - 2\) degrees of freedom, since with the density (5) the quantity \((n - 2)^{1/2} r (1 - r^2)^{-1/2}\) obeys this distribution. Thus we see that, under quite mild conditions on the boundary of the critical region, a similar (and, moreover, unbiased) Fisher–Welch–Wald test does not exist, whereas allowing a comparatively simple randomization yields the very simple test (7).
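The similarity of the randomized test can be checked informally by simulation in the case \(n=4\), where the density (5) is constant. The sketch below rests on stated assumptions: \(s_1^2\) and \(s_2^2\) are taken as the unnormalized sums of squared deviations, and \(C\) is set to the two-sided 5% point of Student's distribution with \(n-1=3\) degrees of freedom (the distribution of the paired statistic behind test (2)) divided by \(\sqrt{n(n-1)}\). All names are hypothetical. Instead of drawing the random decision, it averages the rejection probability, which estimates the same size with less noise.

```python
import math
import random

T_CRIT_3DF = 3.182446  # tabulated two-sided 5% point of Student's t, 3 d.f.

def phi_n4(x, y, c):
    """Rejection probability (9) for n = 4, where f_4(r) = 1/2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    # Assumption: s_i^2 is the unnormalized sum of squared deviations.
    s1 = math.sqrt(sum((v - xbar) ** 2 for v in x))
    s2 = math.sqrt(sum((v - ybar) ** 2 for v in y))
    xi = (xbar - ybar) / s2
    eta = s1 / s2
    tau = 0.5 * (eta + 1.0 / eta - xi * xi / (c * c * eta))
    if tau >= 1.0:
        return 0.0
    return 0.5 * (1.0 - max(tau, -1.0))

def estimated_size(sigma1, sigma2, trials=20000, seed=0):
    """Monte Carlo estimate of the rejection probability under H0."""
    rng = random.Random(seed)
    c = T_CRIT_3DF / math.sqrt(4 * 3)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, sigma1) for _ in range(4)]
        y = [rng.gauss(0.0, sigma2) for _ in range(4)]
        total += phi_n4(x, y, c)
    return total / trials

# Both estimates should be near the nominal level 0.05.
print(estimated_size(1.0, 1.0), estimated_size(1.0, 5.0))
```

Agreement of the two estimates for different ratios \(\sigma_1/\sigma_2\) is consistent with similarity, though a simulation of this kind is only a sanity check and not a proof.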
Received 19 XI 1963
REFERENCES
1. H. Breny, Trabajos de Estadística, 6, Cuaderno 11, 1955, p. 111.
2. A. Wald, Selected Papers in Probability and Statistics, N. Y., 1955, p. 669.
3. B. L. Welch, Biometrika, 34, 28 (1947).
4. Yu. V. Linnik, DAN, 152, No. 3 (1963).
5. E. Lehmann, Testing Statistical Hypotheses, N. Y., 1959.