MATHEMATICS
WOLFGANG RICHTER
LOCAL LIMIT THEOREMS FOR LARGE DEVIATIONS
(Presented by Academician I. M. Vinogradov on 28 I 1957)
- Let there be a sequence of independent random variables \(X_1, X_2, \ldots\) with distribution functions \(V_1(x), V_2(x), \ldots\). Suppose that their variances exist, \(D X_j=\sigma_j^2\), \(\sum_{j=1}^n \sigma_j^2=s_n^2\). Their mathematical expectations may, without loss of generality, be taken equal to zero.
\(M_j(z)=E e^{zX_j}=\int_{-\infty}^{\infty} e^{zx}\,dV_j(x)\) is called the moment-generating function of the variable \(X_j\). Put
\[ Z_n=\sum_{j=1}^{n} X_j/s_n . \]
Let us consider the limiting behavior of the function \(P\{Z_n<x\}=F_n(x)\) and of its derivative, if it exists, as \(n\to\infty\).
- In the present note a series of local limit theorems is given in the case of large deviations, i.e., for \(x\) increasing without bound together with \(n\). As is known, analogous (local and integral) theorems, established under the assumption \(x=O(1)\) as \(n\to\infty\), give only a trivial answer to the question of the behavior of probabilities in the case of large deviations. Several integral theorems are already known which partially solve the problem of large deviations.
For identically distributed variables there is a theorem of H. Cramér \((^1)\), estimating the ratio
\[
\frac{1-F_n(x)}{1-\Phi(x)},
\]
where \(\Phi(x)\) is the normal distribution function. In \((^1)\) \(x\) has order of growth \(x=o\!\left(\frac{\sqrt n}{\ln n}\right)\). W. Feller \((^3)\) considered the same problem for the case of non-identically distributed variables.
V. V. Petrov \((^4)\) succeeded in generalizing Cramér’s theorem to the case of non-identically distributed random variables, while improving the remainder term of the asymptotic expression and replacing the order of growth
\[
x=o\!\left(\frac{\sqrt n}{\ln n}\right)
\]
by \(x=o(\sqrt n)\).
All of the above-mentioned works make one very strong assumption: the moment-generating functions of the summands \(X_j\) are assumed to be analytic in a neighborhood of the point \(z=0\) common to all of them. A certain transformation of the given probability laws is applied which, in essence, reduces to the introduction of “conjugate” probability distributions (in the sense of Khinchin \((^5)\)). These works do not explain why precisely this transformation leads to some success in the problem of large deviations.
It turns out that this transformation is, in essence, a hidden application of the saddle-point method from the theory of functions of a complex variable. H. E. Daniels, in paper \((^2)\), pointed out the possible significance of this method for obtaining asymptotic expressions in probability theory and mathematical statistics. By the same method he obtained an asymptotic expression for the density of the arithmetic mean of identically distributed independent random variables under the condition \(x=O(1)\).
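As a minimal numerical sketch of the conjugate-distribution transformation discussed above (not taken from the cited works), the following Python fragment reweights a small discrete law by the factor \(e^{hx}/M(h)\) and solves for the parameter \(h\) at which the conjugate law has a prescribed mean. The three-point distribution, the target mean 0.5, and the names `mgf`, `conjugate`, `tilted_mean`, and `solve_tilt` are illustrative assumptions.

```python
import math

# Hypothetical three-point law with mean zero (an assumption for illustration).
values = [-1.0, 0.0, 2.0]
probs = [0.5, 0.25, 0.25]

def mgf(h):
    """Moment-generating function M(h) = E exp(hX) of the example law."""
    return sum(p * math.exp(h * v) for v, p in zip(values, probs))

def conjugate(h):
    """Probabilities of the conjugate law: p_i * exp(h * x_i) / M(h)."""
    m = mgf(h)
    return [p * math.exp(h * v) / m for v, p in zip(values, probs)]

def tilted_mean(h):
    """Mean of the conjugate law, equal to M'(h) / M(h)."""
    return sum(v * q for v, q in zip(values, conjugate(h)))

def solve_tilt(target, lo=0.0, hi=5.0, iters=100):
    """Bisection for h with tilted_mean(h) = target.

    tilted_mean is increasing in h because its derivative is the
    variance of the conjugate law, so bisection is safe on [lo, hi].
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

target = 0.5                  # a value far from the original mean 0
h_star = solve_tilt(target)
print(h_star, tilted_mean(h_star), conjugate(h_star))
```

Under the conjugate law the point that was a large deviation becomes the mean, and the equation solved by `solve_tilt`, \(M'(h)/M(h)=a\), is the real saddle-point equation for this example.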
The proofs of the following theorems are analogous in structure to the proof of Cramér’s theorem, but here the saddle-point method is applied consistently. This gives local theorems for identically distributed variables both in the case where a density exists and for lattice summands. A local theorem is also given for non-identically distributed variables under the strong Condition C (Theorem 1). One particular case is indicated in which Condition C is formulated purely in terms of the distribution functions themselves.
- We pass to the results.
Theorem 1. We shall assume that the following conditions are satisfied:
Condition A. There exist positive numbers \(A, K\), and \(k\) such that in the disk \(|z|<A\) the inequalities
\[
k \leq \left| \int_{-\infty}^{\infty} e^{zy}\,dV_j(y) \right| \leq K,\qquad j=1,2,\ldots
\]
hold.
Condition B. For all \(n\)
\[
\frac{s_n^2}{n}\geq \delta>0,
\]
where \(s_n^2\) is the variance of \(\sum_{1}^{n} X_j\).
Condition C. There exists a subsequence \(X_{j_1}, X_{j_2},\ldots\) of the sequence \(X_1,X_2,\ldots\) such that, for the moment-generating functions of \(X_{j_k}\), the uniform estimate holds
\[
\left| \int_{-\infty}^{\infty} e^{(v+it)y}\,dV_{j_k}(y) \right| \leq \frac{L}{|t|^\beta}
\tag{*}
\]
for \(|t|>N\) and \(|v|<A\), where \(L, N\), and \(\beta>0\) may depend on \(v\), and the number \(n^*\) of the terms \(X_{j_k}\) among the first \(n\) variables \(X_1,X_2,\ldots,X_n\) satisfies the condition
\[
\lim_{n\to\infty} \frac{n^*}{s_n^\delta}>0
\]
for some \(\delta>0\).
Let \(p_{Z_n}(x)\) denote the density of the distribution of the random variable \(Z_n\). It exists for all sufficiently large \(n\).
Then, if \(x>1\) and \(x=o(\sqrt n)\) as \(n\to\infty\), the relation holds
\[
\frac{p_{Z_n}(x)}{\varphi(x)}
=
\exp\left[
\frac{x^3}{\sqrt n}\lambda_n\left(\frac{x}{\sqrt n}\right)
\right]
\left[
1+O\left(\frac{x}{\sqrt n}\right)
\right],
\]
where \(\lambda_n(t)\) is a power series converging for sufficiently small values of \(|t|\) uniformly for all \(n\), and \(\varphi(x)=e^{-x^2/2}/\sqrt{2\pi}\).
We note that the uniform estimate \((*)\) in Condition C is satisfied if, for example, the distribution functions of the variables \(X_{j_k}\) are absolutely continuous functions and their derivatives \(p_{j_k}(x)\), as well as the functions \(e^{vx}p_{j_k}(x)\) \((-A<v<A)\), have uniformly (for each \(v\)) bounded total variations, i.e. the sequence
\[ \int_{-\infty}^{\infty} e^{vx}\, |dp_{j_k}(x)| \]
is uniformly bounded for each \(v\), \(|v|<A\).
If the variables \(X_1, X_2,\ldots\) are identically distributed, then condition C can be considerably weakened.
Theorem 2. Let the independent variables \(X_1, X_2,\ldots\), with common distribution function \(V(x)\), \(EX_j=0,\ DX_j=\sigma^2>0,\ j=1,2,\ldots\), satisfy the following conditions:
Condition 1. There exists a positive number \(A\) such that for any real number \(s\) \((|s|<A)\) the integral
\[ \int_{-\infty}^{\infty} e^{sy}\, dV(y) \]
converges.
Condition 2. There exists an \(n_0\) such that the distribution function of the sum \(X_1+\cdots+X_{n_0}\) is absolutely continuous and its derivative is bounded.
Then, if \(x>1\) and \(x=o(\sqrt n)\), as \(n\to\infty\) one has
\[ \frac{p_{Z_n}(x)}{\varphi(x)} = \exp\left[ \frac{x^3}{\sqrt n}\lambda\left(\frac{x}{\sqrt n}\right) \right] \left[ 1+O\left(\frac{x}{\sqrt n}\right) \right], \]
where \(\lambda(t)\) is a power series converging for sufficiently small values of \(|t|\).
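The statement of Theorem 2 can be checked numerically on a case where everything is known in closed form. The sketch below assumes \(X_j=E_j-1\) with \(E_j\) standard exponential (so \(\sigma=1\), Condition 1 holds with \(A=1\), and the sum has a Gamma density). For this particular law the exponent \(\frac{x^3}{\sqrt n}\lambda\!\left(\frac{x}{\sqrt n}\right)\) equals \(n\,(a^2/2-a+\ln(1+a))\) with \(a=x/\sqrt n\); this closed form is a computation made for the illustration, not a formula from the note, and the function names are hypothetical.

```python
import math

def log_density_zn(x, n):
    """Exact log-density of Z_n = (E_1 + ... + E_n - n) / sqrt(n), E_j ~ Exp(1).

    The sum E_1 + ... + E_n has the Gamma(n, 1) density y^(n-1) e^(-y) / Gamma(n);
    the factor sqrt(n) comes from the change of variable y = n + x sqrt(n).
    """
    y = n + x * math.sqrt(n)
    return 0.5 * math.log(n) + (n - 1) * math.log(y) - y - math.lgamma(n)

def log_phi(x):
    """Log of the standard normal density."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def cramer_exponent(x, n):
    """(x^3 / sqrt(n)) * lambda(x / sqrt(n)) for the centered exponential law.

    Assumption of this sketch: the series sums to n * (a^2/2 - a + log(1 + a)).
    """
    a = x / math.sqrt(n)
    return n * (0.5 * a * a - a + math.log1p(a))

def residual_factor(x, n):
    """The bracket [1 + O(x / sqrt(n))] of Theorem 2, computed exactly."""
    return math.exp(log_density_zn(x, n) - log_phi(x) - cramer_exponent(x, n))

n, x = 10_000, 5.0
r = residual_factor(x, n)
print(r, x / math.sqrt(n))
```

In this example the residual factor stays close to \(1/(1+x/\sqrt n)\), which is consistent with a remainder of order \(x/\sqrt n\).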
Analogously, a local theorem also holds for identically distributed lattice summands \(X_1,X_2,\ldots\) with maximal span \(h\).
Theorem 3. Let there be given a sequence \(X_1,X_2,\ldots\) of independent random variables, \(EX_j=0,\ DX_j=\sigma^2>0\), which take only values from a certain arithmetic progression with positive probability \(p_k=P\{X_j=a+kh\}\), \(h\) being the maximal span of the distribution, \(k\) an integer, and \(a\) some real number. Put
\[ \mathscr P_n(k)=P\left\{\sum_1^n X_j=an+kh\right\}. \]
Let the sequence \(X_1,X_2,\ldots\) satisfy Condition 1 of Theorem 2. Put
\[ x_{nk}=\frac{an+kh}{\sigma\sqrt n}. \]
Then, if \(x_{nk}>1\) and \(x_{nk}=o(\sqrt n)\),
\[ \frac{\dfrac{\sigma\sqrt n}{h}\mathscr P_n(k)}{\varphi(x_{nk})} = \exp\left[ \frac{x_{nk}^3}{\sqrt n} \lambda\left(\frac{x_{nk}}{\sqrt n}\right) \right] \left[ 1+O\left(\frac{x_{nk}}{\sqrt n}\right) \right], \]
where \(\lambda(t)\) is a power series converging for sufficiently small values of \(|t|\).
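A similar check is possible for the lattice case. The sketch below assumes centered Bernoulli summands \(X_j=B_j-p\), so that \(a=-p\), \(h=1\), \(\sigma^2=p(1-p)\), and \(\mathscr P_n(k)\) is the binomial probability of \(k\) successes. The exponent is evaluated through the standard representation \(x^2/2-n\,I(k/n)\), where \(I\) is the relative entropy of Bernoulli\((k/n)\) with respect to Bernoulli\((p)\); that representation, the parameter values, and the function names are assumptions of the illustration, not statements from the note.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the binomial probability P{B_1 + ... + B_n = k}."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def x_nk(k, n, p):
    """Normalized point x_{nk} = (a n + k h) / (sigma sqrt(n)) with a = -p, h = 1."""
    return (k - n * p) / math.sqrt(n * p * (1.0 - p))

def log_lattice_ratio(k, n, p):
    """Log of (sigma sqrt(n) / h) * P_n(k) / phi(x_{nk})."""
    x = x_nk(k, n, p)
    log_phi = -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)
    return 0.5 * math.log(n * p * (1.0 - p)) + log_binom_pmf(k, n, p) - log_phi

def cramer_exponent(k, n, p):
    """x^2/2 - n * I(k/n), with I the Bernoulli relative entropy (assumed form)."""
    q = k / n
    entropy = q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    x = x_nk(k, n, p)
    return 0.5 * x * x - n * entropy

n, p, k = 10_000, 0.3, 3229       # x_{nk} is close to 5
x = x_nk(k, n, p)
r = math.exp(log_lattice_ratio(k, n, p) - cramer_exponent(k, n, p))
print(x, r, x / math.sqrt(n))
```

Here `r` plays the role of the bracket \(1+O(x_{nk}/\sqrt n)\); for these parameters its distance from 1 is below \(x_{nk}/\sqrt n\).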
Entirely analogous asymptotic expressions can be obtained if one admits \(x<-1\) and \(|x|=o(\sqrt n)\). It is enough to replace, in the limiting expressions of Theorems 1–3, \(x\) by \(|x|\) in the remainder term.
It should be noted that the series \(\lambda(t)\) and, respectively, \(\lambda_n(t)\) appearing in these theorems are the same as in the papers \((^1,^4)\).
- From Theorems 1–3 there follow a number of simple consequences. We indicate the most interesting consequences of Theorem 1.
Corollary 1. Under the conditions of Theorem 1:
a) If \(x>1,\ x=O(n^{1/6})\) as \(n\to\infty\), then
\[ \frac{p_{Z_n}(x)}{\varphi(x)} = e^{c_n x^3} \left[ 1+O\left(\frac{x}{\sqrt n}\right) \right], \]
where
\[ c_n=\frac{1}{6s_n^3}\sum_{j=1}^n \alpha_{3j} \]
(\(\alpha_{3j}\) is the third moment of the random variable \(X_j\)) has order
\[ O\left(\frac{1}{\sqrt n}\right) \]
as \(n\to\infty\).
b) If \(x>1,\ x=o(n^{1/6})\) as \(n\to\infty\), then
\[ \lim_{n\to\infty} \frac{p_{Z_n}(x)}{\varphi(x)}=1. \]
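Part a) of Corollary 1 can be illustrated on the same kind of example. The sketch below again assumes identically distributed summands \(X_j=E_j-1\) with \(E_j\) standard exponential, for which \(s_n^2=n\) and \(\alpha_{3j}=2\), so that \(c_n=1/(3\sqrt n)\). It compares the exact value of \(\ln\bigl(p_{Z_n}(x)/\varphi(x)\bigr)\) with \(c_n x^3\) at \(x=n^{1/6}\); the choice of law, the value of \(n\), and the function names are assumptions made for the illustration.

```python
import math

def log_density_zn(x, n):
    """Exact log-density of Z_n = (E_1 + ... + E_n - n) / sqrt(n), E_j ~ Exp(1)."""
    y = n + x * math.sqrt(n)          # argument of the Gamma(n, 1) density
    return 0.5 * math.log(n) + (n - 1) * math.log(y) - y - math.lgamma(n)

def log_phi(x):
    """Log of the standard normal density."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def c_n(n, third_moment=2.0, variance=1.0):
    """c_n = sum(alpha_3j) / (6 s_n^3) for identically distributed summands."""
    s_n = math.sqrt(n * variance)
    return n * third_moment / (6.0 * s_n ** 3)

n = 10 ** 6
x = 10.0                              # x = n^(1/6)
log_ratio = log_density_zn(x, n) - log_phi(x)
first_term = c_n(n) * x ** 3
print(log_ratio, first_term, x / math.sqrt(n))
```

The difference between `log_ratio` and `first_term` is of the order of \(x/\sqrt n\) here, in line with the remainder term in a).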
Corollary 2. Under the conditions of Theorem 1, for \(x=o(\sqrt n),\ x>1\), we obtain
\[ \frac{x}{\sqrt n}\lambda_n\left(\frac{x}{\sqrt n}\right)=o(1), \]
and consequently,
\[ \ln p_{Z_n}(x)=\ln\varphi(x)[1+o(1)]. \]
But if \(x=O(\sqrt n)\), then
\[ \ln p_{Z_n}(x)=\frac{C}{2}x^2[1+o(1)] \]
and \(C\ne1\); \(\ln p_{Z_n}(x)\) is then no longer asymptotically proportional to \(\ln\varphi(x)\).
Corollary 3. From the proof of Theorem 1 itself it becomes clear that here too an analogue of Theorem 2 of V. V. Petrov \((^4)\) holds. The only requirement is that \(x/\sqrt n\), for large \(n\), remains in modulus less than some small quantity \(\varepsilon_0>0\). Choosing a sufficiently small constant \(\varepsilon_0\), we may assert that in the interval \(1<x<\varepsilon_0\sqrt n\) one has
\[ \frac{p_{Z_n}(x)}{\varphi(x)} = \exp\left[ \frac{x^3}{\sqrt n} \lambda_n\left(\frac{x}{\sqrt n}\right) \right][1+r\varepsilon_0], \]
where \(|r|\) is less than some constant \(\gamma\).
I express my gratitude to Corresponding Member of the Academy of Sciences of the USSR Yu. V. Linnik for posing the problem and for valuable suggestions.
Leningrad State University named after A. A. Zhdanov
Received 21 XII 1956
CITED LITERATURE
1. H. Cramér, Usp. matem. nauk, 10, 166 (1944).
2. H. E. Daniels, Ann. Math. Statist., 25, 631 (1954).
3. W. Feller, Trans. Am. Math. Soc., 54, 361 (1943).
4. V. V. Petrov, Usp. matem. nauk, 9, no. 4, 195 (1954).
5. A. Ya. Khinchin, Mathematical Foundations of Statistical Mechanics, Moscow–Leningrad, 1943.