UDC 518.9
E. B. Dynkin
A Game Variant of the Optimal Stopping Problem
(Presented by Academician A. N. Kolmogorov on 27 VI 1968)
1. Let an increasing sequence of σ-algebras
\(\mathcal F_0 \subseteq \mathcal F_1 \subseteq \cdots \subseteq \mathcal F_n \subseteq \cdots\) be given on a space \(\Omega\), and let functions \(x_n(\omega)\), \(\varphi_n(\omega)\), measurable with respect to \(\mathcal F_n\) \((n=0,1,2,\ldots)\), be given. The process may be stopped by the first player at those times \(n\) when \(\varphi_n>0\), and by the second player at those times when \(\varphi_n<0\). If stopping is carried out at time \(n\), then the second player receives \(x_n\) from the first. In this game the strategies of the first player are Markov times \(\sigma\) subject to the condition: \(\varphi_\sigma>0\) when \(\sigma<\infty\); the strategies of the second player are Markov times \(\tau\), for which \(\varphi_\tau<0\) when \(\tau<\infty\). (A function \(\tau(\omega)\) taking the values \(0,1,\ldots,n,\ldots\) and the value \(+\infty\) is called a Markov time if, for every finite \(n\), \(\{\tau=n\}\in\mathcal F_n\).) Let \(P\) be a probability measure on some σ-algebra \(\mathcal F\) containing \(\mathcal F_n\) for all \(n\), and let \(M\xi\) denote the integral of \(\xi\) with respect to \(P\). The mean payoff corresponding to the strategies \(\sigma,\tau\) is equal to \(M x_{\sigma\wedge\tau}\).*
We shall assume that the following condition is satisfied.
Condition A. \(M\{\sup_n |x_n|\}<\infty\).
Under this condition the existence of the value of the game is proved, ε-optimal strategies are constructed, and some conditions for the existence of optimal strategies are derived.
If \(\varphi_n<0\) everywhere, then we arrive at the optimal stopping problem considered by Snell \((^1)\) and by other authors. Of special interest is also the case \(x_n=g(X_n)\), \(\varphi_n=\Phi(X_n)\), where \((X_t,\mathcal F_t,P_x)\) is some Markov chain. This case has been studied independently by E. B. Fried.
2. Suppose the game begins at time \(n\). Then only strategies \(\sigma\ge n\) and \(\tau\ge n\) are possible. (Throughout the paper it is understood that \(\varphi_\sigma>0\) when \(\sigma<\infty\), and \(\varphi_\tau<0\) when \(\tau<\infty\).) The corresponding mean payoff is
\(g(n,\sigma,\tau)=M(x_{\sigma\wedge\tau}\mid\mathcal F_n)\).
Let \(\xi_\alpha\) be an arbitrary family of measurable functions. We agree to denote by \(\operatorname{Sup}\xi_\alpha\) a measurable function \(\xi\) satisfying the conditions: a) for every \(\alpha\), \(\xi_\alpha\le\xi\) (a.s.)**; b) if \(\xi_\alpha\le\eta\) (a.s.) for every \(\alpha\), then \(\xi\le\eta\) (a.s.). It is proved (see \((^1,^2)\)) that such a function \(\xi\) exists and is defined uniquely up to an a.s. zero summand, and that one can choose a sequence of values \(\alpha_m\) so that
\(\operatorname{Sup}\xi_\alpha=\sup_m \xi_{\alpha_m}\) (a.s.). Put \(\operatorname{Inf}\xi_\alpha=-\operatorname{Sup}(-\xi_\alpha)\).
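A standard example, not taken from the original text, shows why \(\operatorname{Sup}\) has to be distinguished from the pointwise supremum. Assume \(\Omega=[0,1]\) with Lebesgue measure \(P\), and for each \(\alpha\in[0,1]\) let \(\xi_\alpha\) be the indicator of the one-point set \(\{\alpha\}\). Then
\[ \xi_\alpha(\omega)=\chi_{\{\alpha\}}(\omega),\qquad \sup_\alpha \xi_\alpha(\omega)=1 \ \text{for every } \omega,\qquad \operatorname{Sup}\xi_\alpha=0 \ \text{(a.s.)}. \]
Indeed, \(\xi_\alpha=0\) (a.s.) for every \(\alpha\), so \(\xi=0\) satisfies a); and if \(\xi_\alpha\le\eta\) (a.s.), then \(0\le\xi_\alpha\le\eta\) (a.s.), so b) holds as well.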
The upper and lower values of the game are defined by the formulas
\[ \bar w_n=\operatorname{Inf}_{\sigma\ge n}\operatorname{Sup}_{\tau\ge n} g(n,\sigma,\tau), \qquad \underline w_n=\operatorname{Sup}_{\tau\ge n}\operatorname{Inf}_{\sigma\ge n} g(n,\sigma,\tau). \tag{1} \]
The main results of the paper are contained in the following two theorems.
Theorem 1. Almost surely \(\underline w_n=\bar w_n\), and the value of the game
\(w_n=\underline w_n=\bar w_n\) satisfies the equation
\[ w_n= \begin{cases} M(w_{n+1}\mid\mathcal F_n)\wedge x_n, & (\text{a.s. } \varphi_n>0)***,\\ M(w_{n+1}\mid\mathcal F_n)\vee x_n, & (\text{a.s. } \varphi_n<0),\\ M(w_{n+1}\mid\mathcal F_n), & (\text{a.s. } \varphi_n=0). \end{cases} \tag{2} \]
* We denote the smaller of two numbers \(a,b\) by \(a\wedge b\), and the larger by \(a\vee b\). The value \(x_\infty\) is, by definition, equal to zero.
** “(a.s.)” means almost surely, i.e. for all \(\omega\) except for a set of measure zero.
*** “(a.s. \(A\))” means for almost all \(\omega\in A\).
For any $\varepsilon>0$ the strategies
\[
\sigma_\varepsilon^n=\inf\{t:t\geq n,\ \varphi_t>0,\ x_t\leq w_t+\varepsilon\},
\]
\[
\tau_\varepsilon^n=\inf\{t:t\geq n,\ \varphi_t<0,\ x_t\geq w_t-\varepsilon\}
\tag{3}
\]
are $\varepsilon$-optimal in the sense that for any $\sigma\geq n$, $\tau\geq n$
\[ g(n,\sigma,\tau_\varepsilon^n)+\varepsilon\geq w_n\geq g(n,\sigma_\varepsilon^n,\tau)-\varepsilon \quad \text{(a.s.).} \tag{4} \]
If $\tau_0^n<+\infty$ (a.s.) $(\sigma_0^n<+\infty$ (a.s.)), then the first (respectively the second) of inequalities (4) remains valid for $\varepsilon=0$.
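Equation (2) and the stopping rules (3) can be illustrated numerically. The following Python sketch is not part of the original paper and rests on stated assumptions: a finite horizon \(N\) after which nobody may stop (so the value at time \(N+1\) equals \(x_\infty=0\)), a simple random walk \(X_{t+1}=X_t\pm1\) started at 0, and \(x_t=g(t,X_t)\), \(\varphi_t=\Phi(t,X_t)\). The names `game_values`, `stopping_sets`, `payoff`, `phi`, and `p_up` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Level = Dict[int, float]  # maps a walk position to the game value there


def game_values(horizon: int,
                payoff: Callable[[int, int], float],
                phi: Callable[[int, int], float],
                p_up: float = 0.5) -> List[Level]:
    """Backward induction for equation (2) on a random walk started at 0.

    Assumption (not from the paper): nobody may stop after `horizon`,
    so the value at time horizon + 1 is x_infinity = 0.
    """
    nxt: Level = {pos: 0.0 for pos in range(-(horizon + 1), horizon + 2)}
    levels: List[Level] = [{} for _ in range(horizon + 1)]
    for t in range(horizon, -1, -1):
        cur: Level = {}
        for pos in range(-t, t + 1, 2):  # positions reachable at time t
            # Conditional expectation M(w_{t+1} | F_t) for the walk.
            cont = p_up * nxt[pos + 1] + (1.0 - p_up) * nxt[pos - 1]
            x = payoff(t, pos)
            sign = phi(t, pos)
            if sign > 0:      # first player (who pays) may stop
                cur[pos] = min(cont, x)
            elif sign < 0:    # second player (who receives) may stop
                cur[pos] = max(cont, x)
            else:             # nobody may stop
                cur[pos] = cont
        levels[t] = cur
        nxt = cur
    return levels


def stopping_sets(levels: List[Level],
                  payoff: Callable[[int, int], float],
                  phi: Callable[[int, int], float],
                  eps: float = 0.0
                  ) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Space-time points (t, pos) at which rules (3) prescribe stopping."""
    first: List[Tuple[int, int]] = []
    second: List[Tuple[int, int]] = []
    for t, level in enumerate(levels):
        for pos, w in level.items():
            x, sign = payoff(t, pos), phi(t, pos)
            if sign > 0 and x <= w + eps:
                first.append((t, pos))
            if sign < 0 and x >= w - eps:
                second.append((t, pos))
    return first, second
```

For instance, with `payoff = lambda t, pos: float(pos)`, `phi = lambda t, pos: -1.0` (only the second player may stop) and `horizon = 2`, a hand calculation of the recursion gives the value 1/2 at the starting point, and the sketch reproduces it. The first hitting time of the returned sets by the walk plays the role of \(\sigma_\varepsilon^n\) and \(\tau_\varepsilon^n\).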
Theorem 2. Let
\[ s_n=\operatorname*{Inf}_{\sigma\geq n} M(x_\sigma\mid \mathcal F_n), \qquad S_n=\operatorname*{Sup}_{\tau\geq n} M(x_\tau\mid \mathcal F_n). \tag{5} \]
Then $S_n$ is the smallest supermartingale satisfying the condition
\[ S_n\geq x_n\ \text{(a.s. } \varphi_n<0),\qquad S_n\geq 0\ \text{(a.s.)}; \tag{6} \]
$s_n$ is the largest submartingale satisfying the condition
\[ s_n\leq x_n\ \text{(a.s. } \varphi_n>0),\qquad s_n\leq 0\ \text{(a.s.).} \tag{7} \]
If with probability 1 there exists $m\geq n$ for which $S_m=x_m$, $\varphi_m<0$ ($s_m=x_m$, $\varphi_m>0$), then the first (respectively the second) of inequalities (4) holds for $\varepsilon=0$.
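The functions \(s_n\) and \(S_n\) of (5) are the values of the one-player problems obtained by putting \(\tau=\infty\) or \(\sigma=\infty\), so the definitions (1) give \(s_n\le w_n\le S_n\) (a.s.). A minimal Python sketch of this bracketing follows. It is an illustration under assumptions that are not in the paper: a finite horizon after which nobody may stop, a simple random walk started at 0, and the hypothetical names `root_value`, `first_may_stop`, `second_may_stop`.

```python
from typing import Callable


def root_value(horizon: int,
               payoff: Callable[[int, int], float],
               phi: Callable[[int, int], float],
               first_may_stop: bool = True,
               second_may_stop: bool = True,
               p_up: float = 0.5) -> float:
    """Time-0 value of the finite-horizon game on a walk started at 0.

    Disabling a player corresponds to sigma = infinity or tau = infinity,
    which yields S_0 or s_0 of (5); with both players enabled the result
    is w_0. Assumption: the value at time horizon + 1 is x_infinity = 0.
    """
    offset = horizon + 1
    width = 2 * offset + 1
    nxt = [0.0] * width
    for t in range(horizon, -1, -1):
        cur = [0.0] * width
        for pos in range(-t, t + 1, 2):  # positions reachable at time t
            i = pos + offset
            cont = p_up * nxt[i + 1] + (1.0 - p_up) * nxt[i - 1]
            x, sign = payoff(t, pos), phi(t, pos)
            if sign > 0 and first_may_stop:
                cur[i] = min(cont, x)    # submartingale-type step for s
            elif sign < 0 and second_may_stop:
                cur[i] = max(cont, x)    # supermartingale-type step for S
            else:
                cur[i] = cont
        nxt = cur
    return nxt[offset]
```

With \(x_t=X_t\), \(\varphi_t=-X_t\) and horizon 2, calling the function with `second_may_stop=False`, with both players enabled, and with `first_may_stop=False` gives three numbers that can be compared directly with \(s_0\le w_0\le S_0\).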
3. Put
\[ \bar g(n,\sigma)=\operatorname*{Sup}_{\tau\geq n} g(n,\sigma,\tau). \tag{8} \]
Obviously,
\[ \bar w_n=\operatorname*{Inf}_{\sigma\geq n} \bar g(n,\sigma). \tag{9} \]
Note that
\[
\bar w_n\leq \bar g(n,n)=x_n\quad \text{(a.s. } \varphi_n>0),
\]
\[
\bar w_n\geq \operatorname*{Inf}_{\sigma\geq n} g(n,\sigma,n)=x_n\quad \text{(a.s. } \varphi_n<0).
\tag{10}
\]
Let $\sigma\geq n$, $\tau\geq n$. Put $\sigma'=\sigma\vee(n+1)$, $\tau'=\tau\vee(n+1)$. It is not difficult to verify that
\[ g(n,\sigma,\tau)=M\{g(n+1,\sigma',\tau')\mid \mathcal F_n\} \quad \text{(a.s. } \sigma>n,\ \tau>n); \tag{11} \]
\[ g(n,\sigma,\tau)=x_n \quad \text{(a.s. } \{\sigma=n\}\cup\{\tau=n\}). \tag{12} \]
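The verification can be sketched as follows; this expansion is ours and is not spelled out in the original. Since \(\sigma\geq n\) and \(\tau\geq n\), the set \(A=\{\sigma>n,\ \tau>n\}\) belongs to \(\mathcal F_n\), and \(\sigma'=\sigma\), \(\tau'=\tau\) on \(A\). Hence, by the tower property of conditional expectations,
\[ \chi_A\,g(n,\sigma,\tau) = M\{\chi_A x_{\sigma'\wedge\tau'}\mid \mathcal F_n\} = \chi_A M\{M(x_{\sigma'\wedge\tau'}\mid \mathcal F_{n+1})\mid \mathcal F_n\} = \chi_A M\{g(n+1,\sigma',\tau')\mid \mathcal F_n\} \quad \text{(a.s.)}, \]
which is (11). On the set \(B=\{\sigma=n\}\cup\{\tau=n\}\in\mathcal F_n\) we have \(\sigma\wedge\tau=n\), so that
\[ \chi_B\,g(n,\sigma,\tau)=M\{\chi_B x_n\mid \mathcal F_n\}=\chi_B x_n \quad \text{(a.s.)}, \]
which is (12).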
4. For each $n$ one can choose a sequence of Markov times $\sigma_m\geq n+1$ such that $\bar w_{n+1}=\inf_m \bar g(n+1,\sigma_m)$ (a.s.). Fix $\varepsilon>0$ and denote by $\nu$ the least value of $m$ for which
\[ \bar g(n+1,\sigma_m)\leq \bar w_{n+1}+\varepsilon. \tag{13} \]
It is easy to see that $\sigma_\nu$ is a Markov time and $\sigma_\nu\geq n+1$. By virtue of (8) and (13), for any $\tau\geq n+1$ and any $m$
\[ g(n+1,\sigma_m,\tau)\leq \bar g(n+1,\sigma_m)\leq \bar w_{n+1}+\varepsilon \quad \text{(a.s. } \nu=m). \]
Therefore
\[ g(n+1,\sigma_\nu,\tau)=M(x_{\sigma_\nu\wedge\tau}\mid \mathcal F_{n+1}) =\sum_m M\{\chi_{\nu=m}x_{\sigma_m\wedge\tau}\mid \mathcal F_{n+1}\}= \]
\[ =\sum_m \chi_{\nu=m}g(n+1,\sigma_m,\tau)\leq \bar w_{n+1}+\varepsilon \quad \text{(a.s.).} \]
By virtue of (11)
\[ g(n,\sigma_\nu,\tau)=M\{g(n+1,\sigma_\nu,\tau')\mid \mathcal F_n\}\leq M(\overline w_{n+1}\mid \mathcal F_n)+\varepsilon \quad \text{(a.s. } \tau>n). \tag{14} \]
According to (12)
\[ g(n,\sigma_\nu,\tau)=x_n \quad \text{(a.s. } \tau=n). \tag{15} \]
Since \(\{\varphi_n\geq 0\}\subseteq \{\tau>n\}\), it follows from (14) that
\[ \overline g(n,\sigma_\nu)\leq M\{\overline w_{n+1}\mid \mathcal F_n\}+\varepsilon \quad \text{(a.s. } \varphi_n\geq 0). \tag{16} \]
Further, from (14) and (15),
\[ \overline g(n,\sigma_\nu)\leq x_n\vee M(\overline w_{n+1}\mid \mathcal F_n)+\varepsilon \quad \text{(a.s.)}. \tag{17} \]
Taking (9) into account, we conclude from (16) and (17) that
\[ \overline w_n\leq M(\overline w_{n+1}\mid \mathcal F_n) \quad \text{(a.s. } \varphi_n\geq 0), \tag{18} \]
\[ \overline w_n\leq x_n\vee M(\overline w_{n+1}\mid \mathcal F_n) \quad \text{(a.s.)}. \tag{19} \]
5. Let \(\sigma\geq n\). By virtue of (8) one can choose a sequence of Markov times \(\tau_m\geq n+1\) such that
\[ \overline g(n+1,\sigma')=\sup_m g(n+1,\sigma',\tau_m) \quad \text{(a.s.)}. \]
Let \(\nu\) be the smallest value of \(m\) for which
\[
g(n+1,\sigma',\tau_m)\geq \overline g(n+1,\sigma')-\varepsilon.
\]
Then \(\tau_\nu\) is a Markov time, \(\tau_\nu\geq n+1\), and
\[
g(n+1,\sigma',\tau_\nu)\geq \overline g(n+1,\sigma')-\varepsilon
\]
(a.s.). By virtue of (8), (11), and (9),
\[ \overline g(n,\sigma)\geq g(n,\sigma,\tau_\nu) = M\{g(n+1,\sigma',\tau_\nu)\mid \mathcal F_n\}\geq \]
\[ \geq M\{\overline g(n+1,\sigma')\mid \mathcal F_n\}-\varepsilon \geq M(\overline w_{n+1}\mid \mathcal F_n)-\varepsilon \quad \text{(a.s. } \sigma>n). \tag{20} \]
By virtue of (12), \(g(n,\sigma,\tau_\nu)=x_n\) (a.s. \(\sigma=n\)). Therefore
\[ \overline g(n,\sigma)\geq x_n\wedge M(\overline w_{n+1}\mid \mathcal F_n)-\varepsilon \quad \text{(a.s.)}. \tag{21} \]
Since \(\{\varphi_n\leq 0\}\subseteq \{\sigma>n\}\), according to (20)
\[ \overline w_n\geq M(\overline w_{n+1}\mid \mathcal F_n)-\varepsilon \quad \text{(a.s. } \varphi_n\leq 0). \tag{22} \]
In view of the arbitrariness of \(\varepsilon>0\), from (21) and (22) we have
\[ \overline w_n\geq M(\overline w_{n+1}\mid \mathcal F_n) \quad \text{(a.s. } \varphi_n\leq 0); \tag{23} \]
\[ \overline w_n\geq x_n\wedge M(\overline w_{n+1}\mid \mathcal F_n) \quad \text{(a.s.)}. \tag{24} \]
6. From (18) and (23) it follows that
\[ \overline w_n=M(\overline w_{n+1}\mid \mathcal F_n) \]
(a.s. \(\varphi_n=0\)). From (10), (18), and (24) we have
\[ \overline w_n=x_n\wedge M(\overline w_{n+1}\mid \mathcal F_n) \]
(a.s. \(\varphi_n>0\)). Finally, from (10), (23), and (19)
\[ \overline w_n=x_n\vee M(\overline w_{n+1}\mid \mathcal F_n) \]
(a.s. \(\varphi_n<0\)). Thus, \(\overline w_n\) satisfies equation (2).
When \(x_n\) is replaced by \(-x_n\) and \(\varphi_n\) by \(-\varphi_n\), the functions \(\underline w_n\) pass into \(-\overline w_n\). Therefore \(\underline w_n\) also satisfies equation (2).
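In more detail (a sketch of the substitution, in our notation, not spelled out in the original): put \(\hat x_n=-x_n\), \(\hat\varphi_n=-\varphi_n\). The players then exchange roles: the strategies admissible for the first player in the new game are exactly the Markov times \(\tau\) admissible for the second player in the original game, and vice versa, while the payoff changes sign. Therefore the upper value \(\hat w_n\) of the new game is
\[ \hat w_n=\operatorname*{Inf}_{\tau\geq n}\operatorname*{Sup}_{\sigma\geq n}\{-g(n,\sigma,\tau)\} =-\operatorname*{Sup}_{\tau\geq n}\operatorname*{Inf}_{\sigma\geq n} g(n,\sigma,\tau)=-\underline w_n . \]
Applying the first line of equation (2), already established for upper values, to \(\hat w_n\) on the set \(\{\hat\varphi_n>0\}=\{\varphi_n<0\}\), we get
\[ -\underline w_n=M(-\underline w_{n+1}\mid \mathcal F_n)\wedge(-x_n) =-\{M(\underline w_{n+1}\mid \mathcal F_n)\vee x_n\} \quad \text{(a.s. } \varphi_n<0), \]
which is the second line of (2) for \(\underline w_n\). The other two lines are obtained in the same way.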
7. Fix arbitrary \(\varepsilon>0\) and \(n\), and put
\[ \sigma_\varepsilon=\inf\{t:t\geq n,\ \varphi_t>0,\ x_t\leq \underline w_t+\varepsilon\}, \]
\[ \tau_\varepsilon=\inf\{t:t\geq n,\ \varphi_t<0,\ x_t\geq \overline w_t-\varepsilon\} \tag{25} \]
(if the set is empty, we take its lower bound to be \(+\infty\)). Let
\[
\Lambda_n=\{t:t\geq n,\ \varphi_t<0\},
\]
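and let \(\tilde x_n\) be the supremum of the values \(x_t\) \((t\in \Lambda_n)\) taken together with zero, that is, \(\tilde x_n=0\vee\sup_{t\in\Lambda_n}x_t\). Since \(\Lambda_n\subseteq\Lambda_m\) for \(m<n\), we have \(\tilde x_n\leq\tilde x_m\). Note that for \(m<n\)
\[ \overline w_n\leq \operatorname*{Sup}_{\tau\geq n} g(n,\infty,\tau)\leq M(\tilde x_n\mid \mathcal F_n)\leq M(\tilde x_m\mid \mathcal F_n) \quad \text{(a.s.)}. \tag{26} \]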
Therefore
\[ \overline{\lim_{n\to\infty}}\,\overline w_n\leq M(\tilde x_m\mid \mathcal F_\infty)=\tilde x_m \quad \text{(a.s.)}, \tag{27} \]
where \(\mathcal F_\infty\) is the minimal \(\sigma\)-algebra containing \(\mathcal F_n\) for all \(n\).
It follows from (27) that
\[ \overline{\lim_{n\to\infty}}\,\bar w_n \leq U \vee 0, \tag{28} \]
where \(U=0\) if \(\Lambda_0\) is finite, and \(U\) is the upper limit of \(x_t\) as \(t\to\infty\) over the set \(\Lambda_0\), if \(\Lambda_0\) is infinite.
Put \(v_t=\bar w_{\tau_\varepsilon\wedge t}\). By virtue of (2),
\(M(\bar w_{t+1}\mid \mathcal F_t)\geq \bar w_t\) (a.s. \(t<\tau_\varepsilon\)). Therefore, for \(t\geq n\),
\[ M(v_{t+1}\mid \mathcal F_t) = \chi_{\tau_\varepsilon>t}M(\bar w_{t+1}\mid \mathcal F_t) + \sum_{m=n}^{t}\chi_{\tau_\varepsilon=m}\bar w_m \geq \]
\[ \geq \chi_{\tau_\varepsilon>t}\bar w_t + \chi_{\tau_\varepsilon\leq t}\bar w_{\tau_\varepsilon} = v_t \quad\text{(a.s.).} \]
Thus \((v_t,\mathcal F_t)\) is a submartingale, and for any \(\sigma\geq n\)
\(M(v_{\sigma\wedge t}\mid\mathcal F_n)\geq v_n=\bar w_n\) (a.s.) for every \(t\geq n\). By condition A, Fatou’s lemma is applicable to the sequence \(v_{\sigma\wedge t}\). Therefore
\[ M\left\{\overline{\lim_{t\to\infty}}\,v_{\sigma\wedge t}\mid\mathcal F_n\right\} \geq \overline{\lim_{t\to\infty}}\,M(v_{\sigma\wedge t}\mid\mathcal F_n) \geq \bar w_n \quad\text{(a.s.).} \tag{29} \]
Let us note that
\[ \overline{\lim_{t\to\infty}}\,v_{\sigma\wedge t} = \overline{\lim_{t\to\infty}}\,\bar w_{\tau_\varepsilon\wedge\sigma\wedge t} = \bar w_{\tau_\varepsilon\wedge\sigma} \quad\text{when } \tau_\varepsilon\wedge\sigma<\infty . \tag{30} \]
On the other hand, by virtue of (28), if \(\tau_\varepsilon=\infty\), then \(U\leq0\) (a.s.) and
\[ \overline{\lim_{t\to\infty}}\,v_{\sigma\wedge t} \leq \overline{\lim_{n\to\infty}}\,\bar w_n \leq 0 \quad\text{(a.s. } \tau_\varepsilon=\infty). \tag{31} \]
From (29), (30), and (31) we have
\[ M(\bar w_{\tau_\varepsilon\wedge\sigma}\mid\mathcal F_n)\geq \bar w_n \quad\text{(a.s.)} \tag{32} \]
(where we put \(\bar w_\infty=0\)). If \(\tau_0<\infty\) (a.s.), then (32) also holds for \(\varepsilon=0\).
According to (25), \(x_{\tau_\varepsilon}\geq \bar w_{\tau_\varepsilon}-\varepsilon\) when \(\tau_\varepsilon<\infty\), and by virtue of (2) \(x_\sigma\geq \bar w_\sigma\) (a.s.) (for \(\varphi_\sigma>0\)). Therefore \(x_{\sigma\wedge\tau_\varepsilon}\geq \bar w_{\sigma\wedge\tau_\varepsilon}-\varepsilon\) (a.s.), and by virtue of (32)
\[ \bar w_n \leq M(x_{\sigma\wedge\tau_\varepsilon}\mid\mathcal F_n)+\varepsilon = g(n,\sigma,\tau_\varepsilon)+\varepsilon \quad\text{(a.s.).} \tag{33} \]
Replacing \(x_n\) by \(-x_n\) and \(\varphi_n\) by \(-\varphi_n\), we obtain that for any \(\tau\geq n\)
\(\underline w_n\geq g(n,\sigma_\varepsilon,\tau)-\varepsilon\) (a.s.). It is easy to see that \(\bar w_n\geq \underline w_n\) (a.s.). Consequently, for any \(\sigma\geq n\) and \(\tau\geq n\),
\[ g(n,\sigma_\varepsilon,\tau)-\varepsilon \leq \underline w_n \leq \bar w_n \leq g(n,\sigma,\tau_\varepsilon)+\varepsilon \quad\text{(a.s.).} \tag{34} \]
Putting \(\sigma=\sigma_\varepsilon\), \(\tau=\tau_\varepsilon\), we find that
\(0\leq \bar w_n-\underline w_n\leq 2\varepsilon\) (a.s.). Since \(\varepsilon>0\) is arbitrary, \(\underline w_n=\bar w_n\) (a.s.), and (4) follows from (34). Theorem 1 is proved.
Theorem 2 is proved without difficulty, and we omit its proof.
Moscow State University named after M. V. Lomonosov
Received 12 VI 1968
REFERENCES
¹ J. L. Snell, Trans. Am. Math. Soc., 73, 293 (1952).
² J. Neveu, Bases Mathématiques du Calcul des Probabilités, 1964.