Martingale (probability theory)

In probability theory, a martingale is a sequence of random variables (i.e., a stochastic process) for which, at a particular time, the conditional expectation of the next value in the sequence is equal to the present value, regardless of all prior values.

History
Originally, martingale referred to a class of betting strategies that was popular in 18th-century France. The simplest of these strategies was designed for a game in which the gambler wins their stake if a coin comes up heads and loses it if the coin comes up tails. The strategy had the gambler double their bet after every loss so that the first win would recover all previous losses plus win a profit equal to the original stake. As the gambler's wealth and available time jointly approach infinity, their probability of eventually flipping heads approaches 1, which makes the martingale betting strategy seem like a sure thing. However, the exponential growth of the bets eventually bankrupts its users due to finite bankrolls. Stopped Brownian motion, which is a martingale process, can be used to model the trajectory of such games.

The concept of martingale in probability theory was introduced by Paul Lévy in 1934, though he did not name it. The term "martingale" was introduced later by, who also extended the definition to continuous martingales. Much of the original development of the theory was done by Joseph Leo Doob among others. Part of the motivation for that work was to show the impossibility of successful betting strategies in games of chance.

Definitions
A basic definition of a discrete-time martingale is a discrete-time stochastic process (i.e., a sequence of random variables) X1, X2, X3, ... that satisfies for any time n,


 * $$\mathbf{E} ( \vert X_n \vert )< \infty $$


 * $$\mathbf{E} (X_{n+1}\mid X_1,\ldots,X_n)=X_n.$$

That is, the conditional expected value of the next observation, given all the past observations, is equal to the most recent observation.

Martingale sequences with respect to another sequence
More generally, a sequence Y1, Y2, Y3 ... is said to be a martingale with respect to another sequence X1, X2, X3 ... if for all n


 * $$\mathbf{E} ( \vert Y_n \vert )< \infty $$


 * $$\mathbf{E} (Y_{n+1}\mid X_1,\ldots,X_n)=Y_n.$$

Similarly, a continuous-time martingale with respect to the stochastic process Xt is a stochastic process Yt such that for all t


 * $$\mathbf{E} ( \vert Y_t \vert )<\infty $$


 * $$\mathbf{E} ( Y_{t} \mid \{ X_{\tau}, \tau \leq s \} ) = Y_s\quad \forall s \le t.$$

This expresses the property that the conditional expectation of an observation at time t, given all the observations up to time $$ s $$, is equal to the observation at time s (of course, provided that s ≤ t). The second property implies that $$Y_n$$ is measurable with respect to $$X_1 \dots X_n$$.

General definition
In full generality, a stochastic process $$Y:T\times\Omega\to S$$ taking values in a Banach space $$S$$ with norm $$\lVert \cdot \rVert_{S}$$ is a martingale with respect to a filtration $$\Sigma_*$$ and probability measure $$\mathbb P$$ if
 * Σ∗ is a filtration of the underlying probability space (Ω, Σ, $$\mathbb P$$);
 * Y is adapted to the filtration Σ∗, i.e., for each t in the index set T, the random variable Yt is a Σt-measurable function;
 * for each t, Yt lies in the Lp space L1(Ω, Σt, $$\mathbb P$$; S), i.e.
 * $$\mathbf{E}_{\mathbb{P}} (\lVert Y_{t} \rVert_{S}) < + \infty;$$


 * for all s and t with s &lt; t and all F ∈ Σs,
 * $$\mathbf{E}_{\mathbb{P}} \left([Y_t-Y_s]\chi_F\right) =0,$$
 * where χF denotes the indicator function of the event F. In Grimmett and Stirzaker's Probability and Random Processes, this last condition is denoted as
 * $$Y_s = \mathbf{E}_{\mathbb{P}} ( Y_t \mid \Sigma_s ),$$
 * which is a general form of conditional expectation.

It is important to note that the property of being a martingale involves both the filtration and the probability measure (with respect to which the expectations are taken). It is possible that Y could be a martingale with respect to one measure but not another one; the Girsanov theorem offers a way to find a measure with respect to which an Itō process is a martingale.

In the Banach space setting the conditional expectation is also denoted in operator notation as $$\mathbf{E}^{\Sigma_s} Y_t$$.

Examples of martingales

 * An unbiased random walk (in any number of dimensions) is an example of a martingale.
 * A gambler's fortune (capital) is a martingale if all the betting games which the gambler plays are fair. To be more specific: suppose Xn is a gambler's fortune after n tosses of a fair coin, where the gambler wins $1 if the coin comes up heads and loses $1 if it comes up tails. The gambler's conditional expected fortune after the next trial, given the history, is equal to their present fortune. This sequence is thus a martingale.
 * Let Yn = Xn2 − n where Xn is the gambler's fortune from the preceding example. Then the sequence { Yn : n = 1, 2, 3, ... } is a martingale.  This can be used to show that the gambler's total gain or loss varies roughly between plus or minus the square root of the number of steps.
 * (de Moivre's martingale) Now suppose the coin is unfair, i.e., biased, with probability p of coming up heads and probability q = 1 − p of tails.  Let


 * $$X_{n+1}=X_n\pm 1$$
 * with "+" in case of "heads" and "−" in case of "tails". Let


 * $$Y_n=(q/p)^{X_n}.$$


 * Then { Yn : n = 1, 2, 3, ... } is a martingale with respect to { Xn : n = 1, 2, 3, ... }. To show this

\begin{align} E[Y_{n+1} \mid X_1,\dots,X_n] & = p (q/p)^{X_n+1} + q (q/p)^{X_n-1} \\[6pt] & = p (q/p) (q/p)^{X_n} + q (p/q) (q/p)^{X_n} \\[6pt] & = q (q/p)^{X_n} + p (q/p)^{X_n} = (q/p)^{X_n}=Y_n. \end{align} $$
 * Pólya's urn contains a number of different-coloured marbles; at each iteration a marble is randomly selected from the urn and replaced with several more of that same colour. For any given colour, the fraction of marbles in the urn with that colour is a martingale. For example, if currently 95% of the marbles are red then, though the next iteration is more likely to add red marbles than another color, this bias is exactly balanced out by the fact that adding more red marbles alters the fraction much less significantly than adding the same number of non-red marbles would.
 * (Likelihood-ratio testing in statistics) A random variable X is thought to be distributed according either to probability density f or to a different probability density g. A random sample X1, ..., Xn is taken.  Let Yn be the "likelihood ratio"


 * $$Y_n=\prod_{i=1}^n\frac{g(X_i)}{f(X_i)}$$


 * If X is actually distributed according to the density f rather than according to g, then { Yn : n = 1, 2, 3, ... } is a martingale with respect to { Xn : n = 1, 2, 3, ... }.


 * In an ecological community (a group of species that are in a particular trophic level, competing for similar resources in a local area), the number of individuals of any particular species of fixed size is a function of (discrete) time, and may be viewed as a sequence of random variables. This sequence is a martingale under the unified neutral theory of biodiversity and biogeography.
 * If { Nt : t ≥ 0 } is a Poisson process with intensity λ, then the compensated Poisson process { Nt − λt : t ≥ 0 } is a continuous-time martingale with right-continuous/left-limit sample paths


 * Wald's martingale


 * A $$d$$-dimensional process $$M=(M^{(1)},\dots,M^{(d)})$$ in some space $$S^d$$ is a martingale in $$S^d$$ if each component $$T_i(M)=M^{(i)}$$ is a one-dimensional martingale in $$S$$.

Submartingales, supermartingales, and relationship to harmonic functions
There are two popular generalizations of a martingale that also include cases when the current observation Xn is not necessarily equal to the future conditional expectation E[Xn+1 | X1,...,Xn] but instead an upper or lower bound on the conditional expectation. These definitions reflect a relationship between martingale theory and potential theory, which is the study of harmonic functions. Just as a continuous-time martingale satisfies E[Xt | {Xτ : τ ≤ s}] − Xs = 0 ∀s ≤ t, a harmonic function f satisfies the partial differential equation Δf = 0 where Δ is the Laplacian operator. Given a Brownian motion process Wt and a harmonic function f, the resulting process f(Wt) is also a martingale.
 * A discrete-time submartingale is a sequence $$X_1,X_2,X_3,\ldots$$ of integrable random variables satisfying
 * $$\operatorname E[X_{n+1}\mid X_1,\ldots,X_n] \ge X_n.$$
 * Likewise, a continuous-time submartingale satisfies
 * $$\operatorname E[X_t\mid\{X_\tau : \tau \le s\}] \ge X_s \quad \forall s \le t.$$
 * In potential theory, a subharmonic function f satisfies Δf ≥ 0. Any subharmonic function that is bounded above by a harmonic function for all points on the boundary of a ball is bounded above by the harmonic function for all points inside the ball. Similarly, if a submartingale and a martingale have equivalent expectations for a given time, the history of the submartingale tends to be bounded above by the history of the martingale. Roughly speaking, the prefix "sub-" is consistent because the current observation Xn is less than (or equal to) the conditional expectation E[Xn+1 | X1,...,Xn]. Consequently, the current observation provides support from below the future conditional expectation, and the process tends to increase in future time.


 * Analogously, a discrete-time supermartingale satisfies
 * $$\operatorname E[X_{n+1}\mid X_1,\ldots,X_n] \le X_n.$$
 * Likewise, a continuous-time supermartingale satisfies
 * $$\operatorname E[X_t\mid\{X_\tau : \tau \le s\}] \le X_s \quad \forall s \le t.$$
 * In potential theory, a superharmonic function f satisfies Δf ≤ 0. Any superharmonic function that is bounded below by a harmonic function for all points on the boundary of a ball is bounded below by the harmonic function for all points inside the ball. Similarly, if a supermartingale and a martingale have equivalent expectations for a given time, the history of the supermartingale tends to be bounded below by the history of the martingale. Roughly speaking, the prefix "super-" is consistent because the current observation Xn is greater than (or equal to) the conditional expectation E[Xn+1 | X1,...,Xn]. Consequently, the current observation provides support from above the future conditional expectation, and the process tends to decrease in future time.

Examples of submartingales and supermartingales

 * Every martingale is also a submartingale and a supermartingale. Conversely, any stochastic process that is both a submartingale and a supermartingale is a martingale.
 * Consider again the gambler who wins $1 when a coin comes up heads and loses $1 when the coin comes up tails. Suppose now that the coin may be biased, so that it comes up heads with probability p.
 * If p is equal to 1/2, the gambler on average neither wins nor loses money, and the gambler's fortune over time is a martingale.
 * If p is less than 1/2, the gambler loses money on average, and the gambler's fortune over time is a supermartingale.
 * If p is greater than 1/2, the gambler wins money on average, and the gambler's fortune over time is a submartingale.
 * A convex function of a martingale is a submartingale, by Jensen's inequality. For example, the square of the gambler's fortune in the fair coin game is a submartingale (which also follows from the fact that Xn2 − n is a martingale).  Similarly, a concave function of a martingale is a supermartingale.

Martingales and stopping times
A stopping time with respect to a sequence of random variables X1, X2, X3, ... is a random variable τ with the property that for each t, the occurrence or non-occurrence of the event τ = t depends only on the values of X1, X2, X3, ..., Xt. The intuition behind the definition is that at any particular time t, you can look at the sequence so far and tell if it is time to stop. An example in real life might be the time at which a gambler leaves the gambling table, which might be a function of their previous winnings (for example, he might leave only when he goes broke), but he can't choose to go or stay based on the outcome of games that haven't been played yet.

In some contexts the concept of stopping time is defined by requiring only that the occurrence or non-occurrence of the event τ = t is probabilistically independent of Xt + 1, Xt + 2, ... but not that it is completely determined by the history of the process up to time t. That is a weaker condition than the one appearing in the paragraph above, but is strong enough to serve in some of the proofs in which stopping times are used.

One of the basic properties of martingales is that, if $$(X_t)_{t>0}$$ is a (sub-/super-) martingale and $$\tau$$ is a stopping time, then the corresponding stopped process $$(X_t^\tau)_{t>0}$$ defined by $$X_t^\tau:=X_{\min\{\tau,t\}}$$ is also a (sub-/super-) martingale.

The concept of a stopped martingale leads to a series of important theorems, including, for example, the optional stopping theorem which states that, under certain conditions, the expected value of a martingale at a stopping time is equal to its initial value.