MATHEMATICS
Yu. V. PROKHOROV
ON RANDOM MEASURES ON A COMPACT SPACE
(Presented by Academician A. N. Kolmogorov, 1 XII 1960)
Introduction. It is known that extending the method of characteristic functions to the infinite-dimensional case encounters great difficulties, and simple formulations of results are rare. Every case in which such formulations can be obtained therefore deserves attention. Below we consider distributions in the space of measures on a compact space, a case that is very special from the standpoint of the general theory but of interest for possible applications. The notation and terminology are the same as in the survey [1].
§ 1. Let \((E,\mathscr G)\) be a compact* Hausdorff topological space (\(E\) is the set of points, \(\mathscr G\) the class of open sets); \(Y\) the set of continuous functions on \(E\); \(\mathscr H\) the topology in \(Y\) generated by the “uniform” norm: \(\|y\|=\sup_e |y(e)|\); and \(X\) the space conjugate to \((Y,\mathscr H)\). In accordance with the well-known Riesz theorem, every element \(x\in X\) is represented uniquely in the form
\[ x(y)=\int_E y(e)\,\mu_x(de), \]
where \(\mu_x\) is a generalized Borel measure on \(E\) and \(\operatorname{Var}\mu_x=\|x\|\).
Endow \(X\) with the weak topology \(\mathscr T_s\). Then the conjugate of \((X,\mathscr T_s)\) is \(Y\) itself. It can be shown that the \(\sigma\)-algebra \(\mathscr L\) generated by the “cylindrical” sets
\[ A=\{x:(x(y_1),\ldots,x(y_n))\in A_n\} \]
(\(y_1,\ldots,y_n\in Y\), \(A_n\) an \(n\)-dimensional Borel set) coincides with the \(\sigma\)-algebra \(\mathscr B_s\) of \(\mathscr T_s\)-Borel sets. Every distribution \(P\) on \(\mathscr L=\mathscr B_s\) is tight, i.e., for any \(\varepsilon>0\) there exists a compact set \(K_\varepsilon\) such that \(P^*(K_\varepsilon)>1-\varepsilon\), where \(P^*\) is the outer measure induced by \(P\). Therefore, by [1], every distribution \(P\) extends uniquely to a tight Borel distribution, i.e., a distribution defined on the \(\sigma\)-algebra generated by \(\mathscr G\). We shall therefore consider only tight Borel distributions henceforth and denote their totality by \(\mathfrak P\). Each \(P\in\mathfrak P\) is uniquely determined by its characteristic functional (c.f.):
\[ \chi(y,P)=\int e^{ix(y)}P(dx),\qquad y\in Y. \tag{1} \]
Consider the family \(\mathfrak P^+\) of distributions concentrated on the cone \(X^+\) of nonnegative \(x\). Note that \(X^+\) is \(\mathscr T_s\)-closed. Hence it is not difficult to infer that \(\mathfrak P^+\) is a weakly closed subset of \(\mathfrak P\).
* In the sense of “bicompact.”
Denote by \(Y^+\) the set of nonnegative \(y\in Y\). An arbitrary element \(y\in Y\) can be represented as the difference of two strictly positive functions \(y_1^+\) and \(y_2^+\), for example by setting
\[ y_1^+(e)=\begin{cases} y(e)+1 & \text{if } y(e)\ge 0,\\ 1 & \text{otherwise;} \end{cases} \qquad y_2^+(e)=\begin{cases} -y(e)+1 & \text{if } y(e)<0,\\ 1 & \text{otherwise.} \end{cases} \tag{2} \]
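As a quick check of (2), offered here as a sketch that is not part of the original text, the two functions can be written in closed form, which makes both their continuity and the identity \(y=y_1^+-y_2^+\) evident:
\[ y_1^+=1+\max(y,0),\qquad y_2^+=1+\max(-y,0),\qquad y_1^+-y_2^+=\max(y,0)-\max(-y,0)=y. \]
Both functions are continuous because \(y\) is, and both are bounded below by one, so they are strictly positive elements of \(Y^+\).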
Theorem 1. For a functional \(\chi(y)\), equal to one at zero, to be the characteristic functional of some distribution \(P\in\mathfrak P^+\), the following conditions are necessary and sufficient: 1) \(\chi(y)\) is nonnegative definite; 2) for every \(y^+\in Y^+\) the function \(\chi(zy^+)\), \(z=\sigma+i\tau\), is analytic for \(\tau>0\), and is continuous and does not exceed one in modulus for \(\tau\ge 0\).
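To see why condition 2) is natural, here is a short verification of its necessity, written out as a sketch. The symbol \(F\) is introduced only for this illustration and denotes the distribution function of \(x(y^+)\) under \(P\). If \(P\in\mathfrak P^+\) and \(y^+\in Y^+\), then \(x(y^+)\ge 0\) for every \(x\in X^+\), so \(F\) is concentrated on the nonnegative half-axis and, for \(\tau\ge 0\),
\[ \chi(zy^+)=\int_{X^+} e^{iz\,x(y^+)}P(dx)=\int_{[0,\infty)} e^{i\sigma u}e^{-\tau u}\,dF(u),\qquad |\chi(zy^+)|\le\int_{[0,\infty)} e^{-\tau u}\,dF(u)\le 1. \]
Analyticity for \(\tau>0\) and continuity for \(\tau\ge 0\) then follow as for any Laplace-Stieltjes transform of a distribution on the nonnegative half-axis.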
Proof. The necessity of the stated conditions is obvious. We shall prove their sufficiency.
\(1^\circ\). Let \(y\in Y\). Represent \(y\) as the difference \(y_1^+-y_2^+\), in accordance with (2). The function
\[
\chi(t_1,t_2)=\chi(t_1y_1^+ + t_2y_2^+)
\]
is nonnegative definite and is continuous at zero along the coordinate axes. By Lemma 2 of [1], \(\chi(t_1,t_2)\) is a two-dimensional characteristic function. Therefore \(\chi(ty)=\chi(t,-t)\) is continuous as a function of the real argument \(t\) for every \(y\in Y\). By Theorem 2 of [1], the functional \(\chi(y)\) is the characteristic functional of some weak distribution \(P\) on the algebra \(\mathcal A\) of cylinder sets.
\(2^\circ\). Let \(A\in\mathcal A\) be a cylinder set such that \(A\cap X^+=\varnothing\). Then \(P(A)=0\). Indeed, for every \(y^+\in Y^+\) the distribution function \(P\{x:x(y^+)<\alpha\}\) has characteristic function \(\chi(ty^+)\) and, by condition 2) of the theorem, vanishes on the negative half-axis. Thus
\[
P\{x:x(y^+)\ge 0\}=1.
\tag{3}
\]
Suppose that \(A\) has the form
\[
A=\{x:(x(y_1),\ldots,x(y_n))\in A_n\}.
\]
Representing each \(y_j\) as a difference \(y_j=y_{j,1}^+-y_{j,2}^+\), we obtain
\[
A=\{x:(x(y_{1,1}^+),\ldots,x(y_{n,2}^+))\in A'_{2n}\}.
\]
The linear combinations of the elements \(y_{1,1}^+,\ldots,y_{n,2}^+\) form a subspace \(Y_1\) of the space \(Y\). Let \(Y_2\) denote the set of linear combinations of the same elements with rational coefficients. With the aid of Krein’s theorem on the extension of positive functionals ([4], p. 63), it is not hard to show that the smallest cylinder set containing \(X^+\) and determined by \(y_{1,1}^+,\ldots,y_{n,2}^+\) has the form
\[
B=\bigcap_{y\in Y^+\cap Y_1}\{x:x(y)\ge 0\}
=
\bigcap_{y\in Y^+\cap Y_2}\{x:x(y)\ge 0\},
\]
and, in accordance with (3), \(P(B)=1\), since the second intersection is countable. Now \(A\cap X^+=\varnothing\) implies \(A\cap B=\varnothing\) and hence \(P(A)=0\), which was to be proved.
\(3^\circ\). Let \(\varepsilon\) be an arbitrary positive number; let \(y_0(e)\) be the function identically equal to one. From the continuity of \(\chi(ty_0)\) as a function of \(t\) it follows that there exists a number \(A_\varepsilon\) such that
\[
P\{x:|x(y_0)|\le A_\varepsilon\}>1-\varepsilon.
\tag{4}
\]
Set
\[
C=\{x:|x(y_0)|\le A_\varepsilon\},\qquad
K=X^+\cap C=X^+\cap\{x:\|x\|\le A_\varepsilon\}.
\]
The two descriptions of \(K\) agree because \(\|x\|=x(y_0)\) for \(x\in X^+\), and \(K\) is \(\mathscr T_s\)-compact, being a \(\mathscr T_s\)-closed subset of a ball of \(X\). If \(A\in\mathcal A\) and \(A\cap K=\varnothing\), then \((A\cap C)\cap X^+=\varnothing\). By \(2^\circ\), \(P(A\cap C)=0\), and by (4), \(P(A\cap \bar C)\le \varepsilon\), where \(\bar C\) is the complement of \(C\). Thus, for every \(\varepsilon>0\) there exists a compact set \(K\) such that \(A\in\mathcal A\) and \(A\cap K=\varnothing\) imply \(P(A)\le\varepsilon\). In other words, the weak distribution \(P\) is tight and therefore ([1], § 1) extends uniquely to a Borel measure with characteristic functional \(\chi(y)\), entirely concentrated on \(X^+\). Theorem 1 is proved.
For distributions \(P\in\mathfrak P^+\) one may introduce, along with the characteristic functional, an analogue of the Laplace transform. We put*
\[
L(y^+,P)=\int_{X^+} e^{-x(y^+)}P(dx),\qquad y^+\in Y^+.
\tag{5}
\]
* The abstract Laplace transform is introduced differently in [2].
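Two remarks may help to fix ideas about (5); both are illustrations rather than statements from the original text. First, comparing (5) with (1) shows that the Laplace transform is the value at \(z=i\) of the continuation appearing in condition 2) of Theorem 1:
\[ L(y^+,P)=\int_{X^+} e^{i\cdot i\,x(y^+)}P(dx)=\chi(zy^+,P)\big|_{z=i}. \]
Second, assume purely for the sake of example that \(P\) is the distribution of a Poisson random measure on \(E\) with a finite intensity measure \(\lambda\) (the symbol \(\lambda\) and this example are assumptions made here). In that case the well-known formula for the Laplace functional reads
\[ L(y^+,P)=\exp\Bigl\{-\int_E\bigl(1-e^{-y^+(e)}\bigr)\,\lambda(de)\Bigr\},\qquad y^+\in Y^+. \]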
Theorem 2. Each \(P \in \mathfrak{P}^{+}\) is uniquely determined by its Laplace transform (5).
Proof. Let \(y = y_1^{+} - y_2^{+}\) be an arbitrary element of \(Y\), and let \(s_1, s_2\) be arbitrary nonnegative numbers. The joint distribution \(P^{\xi_1,\xi_2}\) of the random variables \(\xi_1 = x(y_1^{+})\) and \(\xi_2 = x(y_2^{+})\) is uniquely determined by its Laplace transform
\[ L(s_1,s_2)=L(s_1y_1^{+}+s_2y_2^{+},P) =\int_0^\infty\int_0^\infty e^{-s_1u_1-s_2u_2}\,P^{\xi_1,\xi_2}(du_1\times du_2). \]
Therefore the distribution of \(\xi = x(y) = x(y_1^{+}) - x(y_2^{+}) = \xi_1 - \xi_2\) is uniquely determined by the values of \(L(y^{+},P)\) on \(Y^+\). Consequently \(\chi(y,P)=Me^{i\xi}\), where \(M\) denotes mathematical expectation, and hence also \(P\), are uniquely determined by \(L(y^{+},P)\), which was to be proved.
§ 2. Theorem 3. For convergence \(P_n \Rightarrow P\) \((P_n \in \mathfrak{P}^{+})\), it is necessary and sufficient that, for every \(y\in Y\),
\[ \chi(y,P_n)\to \chi(y,P). \tag{6} \]
Proof. Necessity of (6) is obvious. Sufficiency. Let \(y_0(e)\), as before, be the function identically equal to one. Then, for every \(t\), \(-\infty<t<\infty\),
\[ \chi(ty_0,P_n)\to \chi(ty_0,P). \]
Consequently, the distribution functions
\[ G_n(\alpha)=P_n\{x:x(y_0)<\alpha\} \tag{7} \]
converge to the distribution function
\[ G(\alpha)=P\{x:x(y_0)<\alpha\}. \tag{8} \]
Therefore, for every \(\varepsilon>0\) there exists an \(A_\varepsilon\) such that the compact set
\[ K=\{x:x\in X^{+}, |x(y_0)|\le A_\varepsilon\} = X^{+}\cap \{x:\|x\|\le A_\varepsilon\} \]
satisfies \(P_n(K)>1-\varepsilon\) for all \(n\). Hence the sequence \(\{P_n\}\) is relatively weakly compact. By (6) it can have only one limit point, namely \(P\), which was to be proved.
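The choice of \(A_\varepsilon\) in the proof of Theorem 3 can be made explicit as follows. This is a sketch of the standard argument, and the auxiliary symbols \(a\), \(a_n\), and \(n_0\) are introduced here only for illustration. Since \(x(y_0)\ge 0\) on \(X^+\), we have \(P_n(K)=P_n\{x:x(y_0)\le A_\varepsilon\}\ge G_n(A_\varepsilon)\). Pick a continuity point \(a\) of \(G\) with \(G(a)>1-\varepsilon\). Then \(G_n(a)\to G(a)\), so there is an \(n_0\) with \(G_n(a)>1-\varepsilon\) for all \(n\ge n_0\). For each of the finitely many \(n<n_0\) choose \(a_n\) with \(G_n(a_n)>1-\varepsilon\). Because each \(G_n\) is nondecreasing,
\[ A_\varepsilon=\max(a,a_1,\ldots,a_{n_0-1})\quad\Longrightarrow\quad P_n(K)\ge G_n(A_\varepsilon)>1-\varepsilon\ \text{ for all } n. \]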
Theorem 4. For convergence \(P_n \Rightarrow P\) \((P_n\in \mathfrak{P}^{+})\), it is necessary and sufficient that, for every \(y^{+}\in Y^{+}\),
\[ L(y^{+},P_n)\to L(y^{+},P). \tag{9} \]
Proof. Necessity of (9) is obvious. Sufficiency. In the previous notation, for every \(s\ge 0\) we have
\[ L(sy_0,P_n)\to L(sy_0,P). \]
This again implies the convergence of the distribution functions (7) to the distribution function (8), and the proof is completed as in Theorem 3, with Theorem 2 ensuring that the limit point is unique.
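As an illustration that is not in the original, consider the degenerate case in which each \(P_n\) is concentrated at a single point \(x_n\in X^+\) and \(P\) at a single point \(x\in X^+\). Then (9) becomes
\[ e^{-x_n(y^+)}\to e^{-x(y^+)},\quad\text{i.e.}\quad \int_E y^+(e)\,\mu_{x_n}(de)\to\int_E y^+(e)\,\mu_x(de)\quad\text{for every } y^+\in Y^+. \]
By the decomposition (2) and linearity, the same convergence then holds for every \(y\in Y\). In this special case Theorem 4 therefore reduces to the familiar statement that \(P_n\Rightarrow P\) exactly when the nonnegative measures \(\mu_{x_n}\) converge weakly to \(\mu_x\).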
Steklov Mathematical Institute
Academy of Sciences of the USSR
Received 26 IX 1960
References
[1] Yu. V. Prokhorov, Proc. 4th Berkeley Symposium (in press).
[2] S. Bochner, Harmonic Analysis and the Theory of Probability, 1955.
[3] P. Halmos, Measure Theory, IL, 1953.
[4] M. A. Naimark, Normed Rings, 1956.