Coupling (probability)

In probability theory, coupling is a proof technique that allows one to compare two unrelated random variables (distributions) $X$ and $Y$ by creating a random vector $W$ whose marginal distributions correspond to $X$ and $Y$ respectively. The choice of $W$ is generally not unique, and the whole idea of "coupling" is about making such a choice so that $X$ and $Y$ can be related in a particularly desirable way.

Definition
Using the standard formalism of probability theory, let $$X_1$$ and $$X_2$$ be two random variables defined on probability spaces $$(\Omega_1,F_1,P_1)$$ and $$(\Omega_2,F_2,P_2)$$. Then a coupling of $$X_1$$ and $$X_2$$ is a new probability space $$(\Omega,F,P)$$ over which there are two random variables $$Y_1$$ and $$Y_2$$ such that $$Y_1$$ has the same distribution as $$X_1$$ while $$Y_2$$ has the same distribution as $$X_2$$.

An interesting case is when $$Y_1$$ and $$Y_2$$ are not independent.

Random walk
Assume two particles A and B perform a simple random walk in two dimensions, but they start from different points. The simplest way to couple them is simply to force them to walk together. On every step, if A walks up, so does B, if A moves to the left, so does B, etc. Thus, the difference between the two particles' positions stays fixed. As far as A is concerned, it is doing a perfect random walk, while B is the copycat. B holds the opposite view, i.e. that it is, in effect, the original and that A is the copy. And in a sense they both are right. In other words, any mathematical theorem, or result that holds for a regular random walk, will also hold for both A and B.

Consider now a more elaborate example. Assume that A starts from the point (0,0) and B from (10,10). First couple them so that they walk together in the vertical direction, i.e. if A goes up, so does B, etc., but are mirror images in the horizontal direction i.e. if A goes left, B goes right and vice versa. We continue this coupling until A and B have the same horizontal coordinate, or in other words are on the vertical line (5,y). If they never meet, we continue this process forever (the probability of that is zero, though). After this event, we change the coupling rule. We let them walk together in the horizontal direction, but in a mirror image rule in the vertical direction. We continue this rule until they meet in the vertical direction too (if they do), and from that point on, we just let them walk together.

This is a coupling in the sense that neither particle, taken on its own, can "feel" anything we did. Neither the fact that the other particle follows it in one way or the other, nor the fact that we changed the coupling rule or when we did it. Each particle performs a simple random walk. And yet, our coupling rule forces them to meet almost surely and to continue from that point on together permanently. This allows one to prove many interesting results that say that "in the long run", it is not important where you started in order to obtain that particular result.

Biased coins
Assume two biased coins, the first with probability p of turning up heads and the second with probability q > p of turning up heads. Intuitively, if both coins are tossed the same number of times, we should expect the first coin turns up fewer heads than the second one. More specifically, for any fixed k, the probability that the first coin produces at least k heads should be less than the probability that the second coin produces at least k heads. However proving such a fact can be difficult with a standard counting argument. Coupling easily circumvents this problem.

Let X1, X2, ..., Xn be indicator variables for heads in a sequence of flips of the first coin. For the second coin, define a new sequence Y1, Y2, ..., Yn such that


 * if Xi = 1, then Yi = 1,
 * if Xi = 0, then Yi = 1 with probability (q − p)/(1 − p).

Then the sequence of Yi has exactly the probability distribution of tosses made with the second coin. However, because Yi depends on Xi, a toss by toss comparison of the two coins is now possible. That is, for any k ≤ n


 * $$ \Pr(X_1 + \cdots + X_n > k) \leq \Pr(Y_1 + \cdots + Y_n > k).$$

Convergence of Markov Chains to a stationary distribution
Initialize one process $$X_n$$ outside the stationary distribution and initialize another process $$ Y_n $$ inside the stationary distribution. Couple these two independent processes together $$ (X_n,Y_n) $$. As you let time run these two processes will evolve independently. Under certain conditions, these two processes will eventually meet and can be considered the same process at that point. This means that the process outside the stationary distribution converges to the stationary distribution.