Reports of the Academy of Sciences of the USSR
MATHEMATICS
Submitted 1966-01-01 | RussiaRxiv: ru-196601.49284 | Translated from Russian

Reports of the Academy of Sciences of the USSR
1966. Volume 169, No. 4

UDC 519.272.28

MATHEMATICS

Yu. K. BELYAEV

CONFIDENCE INTERVALS FOR FUNCTIONS OF MANY UNKNOWN PARAMETERS

(Presented by Academician A. N. Kolmogorov on 18 XI 1965)

In a number of areas of practice there arises the problem of constructing a confidence interval for a function of many unknown parameters (${}^{1,2}$). In the present paper an algorithmic solution is given for the problem of constructing an upper confidence bound for a concave function

\[ f(\lambda_1,\ldots,\lambda_m)=\sum_{i=1}^{m} f_i(\lambda_i). \]

As the initial statistical data, the values $d_i$, $i=1,\ldots,m$, are used, of mutually independent random variables having Poisson distributions with parameters $\lambda_i$. Denote by $m_d$ the number of $i$ for which $d_i=d$, $d=0,1,\ldots$; $m=\sum_{d=0}^{k} m_d$,

\[ k=\max_{1\leq i\leq m} d_i,\qquad \delta(x)=\begin{cases}0, & x=0,\\ 1, & x>0.\end{cases} \]

The complexity of the proposed algorithm grows mainly with the number of distinct values among the $d_i$,

\[ D=\sum_{d=0}^{k}\delta(m_d). \]

In practically important cases, many $\lambda_i<1$, and therefore $m_d>0$ only for small values of $d$. Thus, for $D<10$ the proposed algorithm can be used to solve the problem on computers.
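The quantities $m_d$, $k$ and $D$ defined above can be tabulated directly from the observed counts. The following is a minimal sketch; the function name `multiplicities` and the sample data are illustrative assumptions, not part of the paper.

```python
from collections import Counter


def multiplicities(d):
    """Return (k, m, D) for a list of observed counts d_1, ..., d_m.

    k is the largest observed count, m[v] is the number of indices i
    with d_i == v for v = 0..k, and D is the number of v with m[v] > 0.
    """
    counts = Counter(d)
    k = max(d)
    m = [counts.get(v, 0) for v in range(k + 1)]
    D = sum(1 for c in m if c > 0)
    return k, m, D


# Hypothetical data: five units with 0, 0, 1, 3 and 1 observed events.
k, m, D = multiplicities([0, 0, 1, 3, 1])
print(k, m, D)
```

Here $D$ counts only the values of $d$ that actually occur, so it stays small whenever most $d_i$ are 0 or 1, whatever the total number $m$ of parameters.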

The general statistical problem may be formulated as follows. There is a space $X=\{x\}$ of outcomes of trials and a parameter space $\Theta=\{\theta\}$ indexing a family of probability distributions $P_\theta$ on the $\sigma$-algebra $\mathfrak{B}_X$ of subsets of $X$. On the product $X\times\Theta$ a function $f(x,\theta)$ is given that is $\mathfrak{B}_X$-measurable in $x$ for each $\theta\in\Theta$. From the observed value $x\in X$ it is required to construct a $\gamma$-confidence interval ($\gamma$-i.) for $f(x,\theta)$, i.e., to find $\mathfrak{B}_X$-measurable functions $\underline f(x)$, $\bar f(x)$ such that

\[ \inf_{\theta\in\Theta} P_\theta\{\underline f(x)\leq f(x,\theta)\leq \bar f(x)\}\geq \gamma. \]

The problem is solved on the basis of a chosen system of $\gamma$-confidence sets $\{H_x\}$ ($\gamma$-c.s.), $H_x\subseteq\Theta$, $x\in X$, satisfying

\[ \inf_{\theta\in\Theta} P_\theta\{\theta\in H_x\}\geq\gamma \]

(see ${}^{3}$).

Theorem 1. If $\{H_x\}$ is a $\gamma$-c.s., then bounds of a $\gamma$-i. are given by

\[ \underline f(x)=\inf_{\theta\in H_x} f(x,\theta),\qquad \bar f(x)=\sup_{\theta\in H_x} f(x,\theta). \tag{1} \]

Corollary. If prior information on the parameter $\theta$, $\theta\in\Theta_0\subseteq\Theta$, is known before the trials, then one can construct a narrower $\gamma$-i. by the formulas

\[ \underline f'(x)=\inf_{\theta\in H_x\cap\Theta_0} f(x,\theta),\qquad \bar f'(x)=\sup_{\theta\in H_x\cap\Theta_0} f(x,\theta). \tag{2} \]

We assume that \( \underline f(x), \overline f(x), \underline f'(x), \overline f'(x) \) are \(\mathfrak B_X\)-measurable. In the particular problem considered below, measurability is a simple consequence of the initial assumptions.
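The reasoning behind Theorem 1 and its corollary can be sketched in one line. This is a reconstruction under the measurability assumption just stated, not a quotation from the paper. For every $\theta\in\Theta$, the event $\{\theta\in H_x\}$ implies $\inf_{\theta'\in H_x} f(x,\theta')\leq f(x,\theta)\leq \sup_{\theta'\in H_x} f(x,\theta')$, so that

\[ P_\theta\{\underline f(x)\leq f(x,\theta)\leq \bar f(x)\}\;\geq\; P_\theta\{\theta\in H_x\}\;\geq\;\gamma . \]

When it is known that $\theta\in\Theta_0$, the events $\{\theta\in H_x\}$ and $\{\theta\in H_x\cap\Theta_0\}$ coincide, so the same inequality holds for $\underline f'(x)$, $\bar f'(x)$, and these bounds can only be narrower because the infimum and supremum are taken over a smaller set.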

When constructing \(\gamma\)-c.s., it is recommended [2] to use unbiased efficient estimates (for definitions see [3, 4]), if such exist. However, in those cases where \(\Theta\) is a subset of a Euclidean space of large dimension \((>5)\), such a procedure is very laborious.

Now let the space \(X=\{x=(d_1,\ldots,d_m)\}\), \(d_i=0,1,\ldots\), consist of the values of \(m\) mutually independent random variables having Poisson distributions with parameters \(\lambda_1,\lambda_2,\ldots,\lambda_m\). Here

\[ \Theta=\{\theta\}=\{(\lambda_1,\ldots,\lambda_m),\ \lambda_i\geq 0\};\qquad f(\theta)=\sum_{i=1}^{m} f_i(\lambda_i), \]

where \(f_i(\lambda_i)\) are concave functions, \(f_i(0)=0\). It is required, from the observed value \(x=(d_1,\ldots,d_m)\), to construct an upper bound \(\overline f(x)\) of a \(\gamma\)-i. for \(f(\theta)\). The value \(\underline f(x)\) is set equal to 0.

Theorem 2. The following systems of sets are \(\gamma\)-c.s.: system I \(\mathfrak P_\gamma=\{\mathfrak P_{\gamma,d_1,\ldots,d_m}\}\); system II \(\mathfrak Q_\gamma=\{\mathfrak Q_{\gamma,d_1,\ldots,d_m}\}\); system III \(\mathfrak R_\gamma=\{\mathfrak R_{\gamma,d_1,\ldots,d_m}\}\):

\[ \mathfrak P_{\gamma,d_1,\ldots,d_m} =\{\lambda_i:\ 0\leq \lambda_i\leq \Delta_{1-\gamma_0}(d_i),\ i=1,\ldots,m\};\qquad \gamma_0^m=\gamma; \tag{3} \]

\[ \mathfrak Q_{\gamma,d_1,\ldots,d_m} =\left\{\lambda_i:\ \sum_{i=1}^{m}\lambda_i\leq \Delta_{1-\gamma}\left(\sum_{i=1}^{m}d_i\right)\right\}; \tag{4} \]

\[ \mathfrak R_{\gamma,d_1,\ldots,d_m} =\mathfrak P_{\gamma_2,d_1,\ldots,d_m}\cap \mathfrak Q_{\gamma_1,d_1,\ldots,d_m},\qquad \gamma=\gamma_1+\gamma_2-1. \tag{5} \]

Here \(\Delta_\alpha(d)\) is the solution of the transcendental equation

\[ \sum_{k=0}^{d}\frac{[\Delta_\alpha(d)]^k}{k!}\,e^{-\Delta_\alpha(d)}=\alpha, \]

tables of the values \(\Delta_\alpha(d)\) are given in [2].
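Where tables are not at hand, \(\Delta_\alpha(d)\) can be found numerically, since the left-hand side of the equation decreases monotonically from 1 to 0 as \(\Delta_\alpha(d)\) grows. The following is a minimal sketch using bisection; the function names `poisson_cdf` and `delta` and the tolerance are assumptions made for illustration.

```python
import math


def poisson_cdf(d, lam):
    """Sum_{k=0}^{d} lam^k / k! * exp(-lam), accumulated term by term."""
    term = math.exp(-lam)
    total = term
    for k in range(1, d + 1):
        term *= lam / k
        total += term
    return total


def delta(alpha, d, tol=1e-12):
    """Solve poisson_cdf(d, x) = alpha for x by bisection, 0 < alpha < 1."""
    lo, hi = 0.0, 1.0
    # The CDF is decreasing in x, so double hi until it falls to alpha or below.
    while poisson_cdf(d, hi) > alpha:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(d, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# For d = 0 the equation reads exp(-x) = alpha, so x = -ln(alpha).
print(delta(0.1, 0), -math.log(0.1))
```

For \(d=0\) the root is available in closed form, \(\Delta_\alpha(0)=-\ln\alpha\), which gives a quick check on the solver.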

From Theorems 1 and 2 and the concavity property of \(f(\theta)\), one can obtain the following assertion.

Theorem 3. The upper bounds \(\overline f(d_1,\ldots,d_m)\) of the \(\gamma\)-i., constructed on the basis of the \(\gamma\)-c.s. \(\mathfrak P_\gamma\) and \(\mathfrak Q_\gamma\), have the form

\[ \overline f_{\mathrm I}(d_1,\ldots,d_m) =\sum_{i=1}^{m} f_i\bigl(\Delta_{1-\gamma_0}(d_i)\bigr),\qquad \overline f_{\mathrm{II}}(d_1,\ldots,d_m) =\max_{i=1,\ldots,m} f_i\left(\Delta_{1-\gamma}\left(\sum_{j=1}^{m}d_j\right)\right). \tag{6} \]

In those cases where many \(d_i=0\), the \(\gamma\)-c.s. \(\mathfrak Q_\gamma\) gives better results than the \(\gamma\)-c.s. \(\mathfrak P_\gamma\). As many of the \(d_i\) increase, the \(\gamma\)-c.s. \(\mathfrak Q_\gamma\) loses its advantage. Numerical calculations show that in most cases the use of the \(\gamma\)-c.s. \(\mathfrak R_\gamma\) gives better results than either. The confidence sets \(\mathfrak R_\gamma\) are polyhedra in \(m\)-dimensional space, as follows from formulas (3)–(5).
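The two bounds in (6) can be compared numerically. The sketch below is illustrative only: it takes linear functions \(f_i(\lambda)=c_i\lambda\) with hypothetical weights \(c_i\geq 0\), for which both expressions in (6) are exact suprema over the corresponding sets, and it finds \(\Delta_\alpha(d)\) by bisection. All function names are assumptions.

```python
import math


def poisson_cdf(d, lam):
    """Poisson distribution function at d for parameter lam."""
    term = math.exp(-lam)
    total = term
    for k in range(1, d + 1):
        term *= lam / k
        total += term
    return total


def delta(alpha, d, tol=1e-12):
    """Root x of poisson_cdf(d, x) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(d, hi) > alpha:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(d, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def upper_bounds(d, fs, gamma):
    """Evaluate the two expressions in (6) for counts d and functions fs."""
    m = len(d)
    gamma0 = gamma ** (1.0 / m)  # gamma0^m = gamma, as in (3)
    f_I = sum(f(delta(1.0 - gamma0, di)) for f, di in zip(fs, d))
    cap = delta(1.0 - gamma, sum(d))  # bound on the sum of the parameters
    f_II = max(f(cap) for f in fs)
    return f_I, f_II


# Hypothetical case: four parameters, no events observed, unit weights.
fs = [lambda lam: lam] * 4
f_I, f_II = upper_bounds([0, 0, 0, 0], fs, 0.9)
print(f_I, f_II)
```

With all \(d_i=0\) the second bound reduces to \(-\ln 0.1\) and is several times smaller than the first, in line with the remark that \(\mathfrak Q_\gamma\) is preferable when many \(d_i=0\).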

Theorem 4. The coordinates of the vertices \(O_{\mathfrak R}\) of the polyhedron \(\mathfrak R_{\gamma,d_1,\ldots,d_m}\) at which the maximum of the function \(f(\theta)=\sum_{i=1}^{m}f_i(\lambda_i)\) is attained are specified as follows. To each such vertex there corresponds a set \(S\cup i_0\) of indices \(i\) for which

\[ \sum_{i\in S}\Delta_{1-\gamma_0}(d_i)\leq \Delta_{1-\gamma_1}\left(\sum_{i=1}^{m}d_i\right)< \sum_{i\in S}\Delta_{1-\gamma_0}(d_i)+\max_{j\notin S}\Delta_{1-\gamma_0}(d_j), \]

\[ \gamma_0^m=\gamma_2,\qquad \gamma=\gamma_1+\gamma_2-1. \]

The coordinates \(\lambda_i,\ i\in S\), of the vertex are set equal to \(\Delta_{1-\gamma_0}(d_i)\). The remaining coordinates \(\lambda_i,\ i\notin S\), except for one \(i_0\notin S\), are set equal to zero, and \(\lambda_{i_0}=x\), where \(x\) is the solution of the equation

\[ \sum_{i\in S}\Delta_{1-\gamma_0}(d_i)+x= \Delta_{1-\gamma_1}\left(\sum_{i=1}^{m}d_i\right). \tag{7} \]
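The construction of a single vertex can be sketched as follows. This is an illustration under stated assumptions rather than the author's procedure: the maximum in the condition of Theorem 4 is taken over indices outside \(S\), mirroring the condition \(m_d-l_d>0\) used for admissible sets; the check that \(x\) does not exceed the bound for \(i_0\) is added here so that the point stays inside \(\mathfrak P\); the function name `vertex` and the numbers are hypothetical.

```python
def vertex(deltas, cap, S, i0):
    """Coordinates of the vertex determined by the index set S and index i0.

    deltas[i] stands for Delta_{1-gamma_0}(d_i) and cap for
    Delta_{1-gamma_1}(sum of d_i); indices are 0-based.
    """
    if i0 in S:
        raise ValueError("i0 must lie outside S")
    used = sum(deltas[i] for i in S)
    rest = [deltas[j] for j in range(len(deltas)) if j not in S]
    # Condition of Theorem 4 (maximum taken over indices outside S).
    if not (used <= cap < used + max(rest)):
        raise ValueError("S does not satisfy the condition of Theorem 4")
    x = cap - used  # solution of equation (7)
    # Added assumption: the free coordinate must respect its own bound.
    if x > deltas[i0]:
        raise ValueError("x exceeds the bound for coordinate i0")
    lam = [0.0] * len(deltas)
    for i in S:
        lam[i] = deltas[i]
    lam[i0] = x
    return lam


# Hypothetical bounds for three parameters and a hypothetical cap.
print(vertex([1.0, 2.0, 1.5], 2.5, {0}, 1))
```

By construction the coordinates of the returned point sum to the cap, so the point lies on the face of \(\mathfrak Q\) as well as on faces of \(\mathfrak P\).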

The number of vertices indicated in Theorem 4 may be too large for large values of \(m\). We shall call a set of integers \(\mathscr L=(l_0,\ldots,l_k)\) admissible if

\[ 0\leq l_d\leq \min\left\{m_d,\, \frac{\Delta_{1-\gamma_1}\left(\sum_{i=1}^{m}d_i\right)} {\Delta_{1-\gamma_0}(d)} \right\},\quad d=0,1,\ldots,k, \]

and if the inequalities

\[ \sum_{d=0}^{k}l_d\Delta_{1-\gamma_0}(d)\leq \Delta_{1-\gamma_1}\left(\sum_{i=1}^{m}d_i\right)< \sum_{d=0}^{k}l_d\Delta_{1-\gamma_0}(d)+ \max_{d:m_d-l_d>0}\Delta_{1-\gamma_0}(d) \]

are satisfied.
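The admissible sets can be enumerated by brute force over the ranges allowed by the first condition, keeping those that satisfy the two inequalities. The following is a minimal sketch; the function name `admissible_sets`, the argument layout, and the decision to skip sets with no remaining index (for which the maximum is undefined) are assumptions made for illustration.

```python
from itertools import product


def admissible_sets(m_counts, delta_by_d, cap):
    """List all admissible tuples (l_0, ..., l_k).

    m_counts[d] is m_d, delta_by_d[d] is Delta_{1-gamma_0}(d) (positive),
    and cap is Delta_{1-gamma_1}(sum of d_i).
    """
    # First condition: 0 <= l_d <= min(m_d, cap / Delta(d)), l_d an integer.
    ranges = [
        range(min(m_d, int(cap // dl)) + 1)
        for m_d, dl in zip(m_counts, delta_by_d)
    ]
    result = []
    for ls in product(*ranges):
        used = sum(l * dl for l, dl in zip(ls, delta_by_d))
        # Bounds still available: those d with m_d - l_d > 0.
        remaining = [
            dl for l, m_d, dl in zip(ls, m_counts, delta_by_d) if m_d - l > 0
        ]
        if not remaining:
            continue  # assumption: the maximum over an empty set is skipped
        if used <= cap < used + max(remaining):
            result.append(ls)
    return result


# Hypothetical values: m_0 = 2, m_1 = 1, bounds 1.0 and 2.0, cap 2.5.
print(admissible_sets([2, 1], [1.0, 2.0], 2.5))
```

The number of candidate tuples is at most \(\prod_d (m_d+1)\), with one factor greater than 1 for each \(d\) with \(m_d>0\), so the work is governed by \(D\) rather than by \(m\).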

We shall say that the sets of vertices \(\Gamma_{\mathscr L}\) of the polyhedron \(\mathfrak R_{\gamma,d_1,\ldots,d_m}\) correspond to the admissible set \(\mathscr L\), if, for the sets \(S\) determining the coordinates of these vertices, the number of indices \(i\in S\cap I_d\) is equal to \(l_d\), \(d=0,1,\ldots,k\). Here \(I_d=\{i:d_i=d\}\).

We denote the solution \(x\) of equation (7) by \(x_{\mathscr L}\).

It follows from Theorem 4 that the vertex at which the absolute maximum of \(f(\theta)\) is attained belongs to one of the sets \(\Gamma_{\mathscr L}\). The maximum of \(f(\theta)\) over the set of vertices \(\Gamma_{\mathscr L}\) is found as follows. Form the sets \(F_d\subseteq I_d\subseteq\{1,\ldots,m\}\), \(d=0,1,\ldots,k\), where \(F_d\) contains the \(l_d\) indices \(i\in I_d\) with the largest values \(a_{i,d}=f_i(\Delta_{1-\gamma_0}(d))\), so that \(a_{i,d}\geq a_{j,d}\) for \(i\in F_d\), \(j\in I_d\setminus F_d=E_d\). If \(x_{\mathscr L}>\Delta_{1-\gamma_0}(d)\) for a given \(d\), the procedure passes on to the index \(d+1\). If, however, \(x_{\mathscr L}<\Delta_{1-\gamma_0}(d)\), then the values \(b_{i,\mathscr L}=f_i(x_{\mathscr L})\) are computed, together with the index \(k_d\) for which \(b_{k_d,\mathscr L}=\max_{i\in E_d} b_{i,\mathscr L}\). Two cases are then possible:

1) the inequality holds
\[ (a_{i,d}+b_{k_d,\mathscr L})\geq(a_{k_d,d}+b_{i,\mathscr L}),\quad i\in F_d; \tag{8} \]

2) there is an \(i_d\in F_d\) such that
\[ \max_{i\in F_d}\left[a_{k_d,d}+b_{i,\mathscr L}-a_{i,d}-b_{k_d,\mathscr L}\right]>0, \tag{9} \]

the maximum in (9) being attained at \(i=i_d\).

Case 2) is divided into two subcases depending on the fulfillment of one of the inequalities

\[ a_{k_d,d}\geq a_{i,d},\quad i\in E_d, \tag{10} \]

\[ a_{k_d,d}<a_{j_d,d}=\max_{j\in E_d\setminus k_d}a_{j,d}. \tag{11} \]

Depending on which case holds, one of the following values is formed: when (8) is fulfilled,

\[ \varphi^{(1)}_{d,\mathscr L}=\sum' a_{i,c}+b_{k_d,\mathscr L}; \]

when (9) and (10) are fulfilled,

\[ \varphi^{(2)}_{d,\mathscr L}=\sum'' a_{i,c}+b_{k_d,\mathscr L}; \]

when (9) and (11) are fulfilled,

\[ \varphi^{(3)}_{d,\mathscr L}=\sum''' a_{i,c}+b_{i_d,\mathscr L}, \]

where \(\sum'\) is taken over the values \(i\in \displaystyle\bigcup_{c=0}^{k}F_c\), \(\sum''\) over

\[ i\in\left(\bigcup_{c\ne d}F_c\right)\cup(F_d\setminus i_d)\cup k_d, \]

and \(\sum'''\) over

\[ i\in\left(\bigcup_{c\ne d}F_c\right)\cup(F_d\setminus i_d)\cup j_d. \]

Theorem 5.

\[ \max_{O_{\mathfrak R}\in\Gamma_{\mathscr L}} f(\theta) = \max_{0\le d\le k}\{\varphi^{(i_d)}_{d,\mathscr L}\}, \]

where \(i_d=1\) if (8) holds, \(i_d=2\) if (9) and (10) hold, and \(i_d=3\) if (9) and (11) hold.

Thus the complexity of finding the absolute maximum of \(f(\theta)\) is essentially that of examining all admissible sets, whose number is small for small \(D\). To compare the systems \(\mathfrak P_\gamma, \mathfrak Q_\gamma, \mathfrak R_\gamma\), consider the function
\[ P=\prod_{i=1}^{20}(1-(1-p_i)^2). \]
For each \(p_i\), binomial trials of size \(N_i\) were carried out. We assume that the numbers of “failures” \(d_i\) are small, so that Poisson approximations with \(\lambda_i=N_i(1-p_i)\) may be used.
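The quality of the Poisson approximation for small failure counts can be checked directly. The sketch below compares the binomial and Poisson probabilities for a hypothetical trial size \(N=100\) and failure probability \(1-p=0.01\); these numbers and the function names are assumptions, since the paper does not state the \(N_i\).

```python
import math


def binom_pmf(n, q, d):
    """Probability of exactly d failures in n trials, failure probability q."""
    return math.comb(n, d) * q**d * (1.0 - q) ** (n - d)


def poisson_pmf(lam, d):
    """Poisson probability of the value d for parameter lam."""
    return math.exp(-lam) * lam**d / math.factorial(d)


def max_pmf_gap(n, q, d_max):
    """Largest absolute difference between the two laws for d = 0..d_max."""
    lam = n * q  # lambda = N (1 - p)
    return max(
        abs(binom_pmf(n, q, d) - poisson_pmf(lam, d)) for d in range(d_max + 1)
    )


# Hypothetical example: N = 100 trials, failure probability 0.01.
print(max_pmf_gap(100, 0.01, 5))
```

For these values the two distributions differ by less than 0.005 at every \(d\leq 5\), which illustrates why the approximation is reasonable when the failure probability is small.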

Using Theorems 3–5 for
\[ \ln P=\sum_{i=1}^{20} f_i(\lambda_i),\qquad f_i(\lambda_i)=\ln(1-(\lambda_i/N_i)^2), \]
when \(m_0=5,\ m_1=15\), we find: for the system \(\mathfrak P_{0.9}\), \(\underline P=0.907\); for \(\mathfrak Q_{0.9}\), \(\underline P=0.955\); for \(\mathfrak R_{0.9}\), \(\underline P=0.984\). The advantages of the last method are substantial. Similar results were also obtained for other values of \(m_i\).

Moscow State University
named after M. V. Lomonosov

Received
21 X 1965

CITED LITERATURE

\(^{1}\) R. A. Mirnyi, A. D. Solov’ev, in: Cybernetics in the Service of Communism, 2, 1964, pp. 213–218.
\(^{2}\) B. V. Gnedenko, Yu. K. Belyaev, A. D. Solov’ev, Mathematical Methods of Reliability Theory, “Nauka,” 1965.
\(^{3}\) H. Cramér, Mathematical Methods of Statistics, IL, 1948.
\(^{4}\) A. N. Kolmogorov, Izv. Akad. Nauk SSSR, Ser. Mat., 14, No. 4, 303 (1950).
