Stochastic dominance

Stochastic dominance is a partial order between random variables. It is a form of stochastic ordering. The concept arises in decision theory and decision analysis in situations where one gamble (a probability distribution over possible outcomes, also known as a prospect) can be ranked as superior to another gamble for a broad class of decision-makers. It is based on shared preferences regarding sets of possible outcomes and their associated probabilities. Only limited knowledge of preferences is required for determining dominance. Risk aversion is a factor only in second-order stochastic dominance.

Stochastic dominance gives only a partial order, not a total order: for some pairs of gambles, neither one stochastically dominates the other, because members of the broad class of decision-makers disagree about which gamble is preferable, even though the gambles are not generally considered equally attractive.

Throughout the article, $$\rho, \nu$$ stand for probability distributions on $$\R$$, while $$A, B, X, Y, Z$$ stand for particular random variables on $$\R$$. The notation $$X \sim \rho$$ means that $$X$$ has distribution $$\rho$$.

There is a sequence of stochastic dominance orderings, from first $$\succeq_1$$, to second $$\succeq_2$$, to higher orders $$\succeq_n$$. The sequence is increasingly inclusive. That is, if $$\rho\succeq_n \nu$$, then $$\rho\succeq_{k} \nu$$ for all $$k \geq n$$. Further, there exist $$\rho, \nu$$ such that $$\rho\succeq_{n+1} \nu$$ but not $$\rho\succeq_n \nu$$.

Stochastic dominance can be traced back to Blackwell (1953), but it was not developed until 1969–1970.

Statewise dominance (Zeroth-Order)
The simplest case of stochastic dominance is statewise dominance (also known as state-by-state dominance), defined as follows:


 * Random variable A is statewise dominant over random variable B if A gives at least as good a result in every state (every possible set of outcomes), and a strictly better result in at least one state.

For example, if a dollar is added to one or more prizes in a lottery, the new lottery statewise dominates the old one because it yields at least as good a payout regardless of the specific numbers realized by the lottery, and a strictly better payout whenever an increased prize is won. Similarly, if a risk insurance policy has a lower premium and better coverage than another policy, then with or without damage, the outcome is better. Anyone who prefers more to less (in the standard terminology, anyone who has monotonically increasing preferences) will always prefer a statewise dominant gamble.

First-order


Statewise dominance implies first-order stochastic dominance (FSD), which is defined as:


 * Random variable A has first-order stochastic dominance over random variable B if for any outcome x, A gives at least as high a probability of receiving at least x as does B, and for some x, A gives a higher probability of receiving at least x. In notation form, $$P [A \ge x]\ge P [B \ge x]$$ for all x, and for some x, $$P[A \ge x]>P[B \ge x]$$.

In terms of the cumulative distribution functions of the two random variables, A dominating B means that $$F_A(x) \le F_B(x)$$ for all x, with strict inequality at some x.
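As a minimal sketch of the CDF criterion, assume finitely supported distributions stored as dictionaries that map each outcome to its probability. The function names and the two example gambles are illustrative assumptions, not taken from any source. Because both CDFs are step functions that change only at support points, comparing them on the union of the two supports is enough.

```python
from fractions import Fraction

def cdf(dist, x):
    """Return P[X <= x] for a discrete distribution {outcome: probability}."""
    return sum(p for value, p in dist.items() if value <= x)

def first_order_dominates(dist_a, dist_b):
    """Check F_A(x) <= F_B(x) for all x, with strict inequality at some x."""
    # Both CDFs are constant between consecutive support points, so checking
    # the union of the supports covers every real x.
    grid = sorted(set(dist_a) | set(dist_b))
    gaps = [cdf(dist_b, x) - cdf(dist_a, x) for x in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Hypothetical gambles: both pay 1 half the time; A's other prize is larger.
half = Fraction(1, 2)
gamble_a = {1: half, 3: half}
gamble_b = {1: half, 2: half}

a_dominates_b = first_order_dominates(gamble_a, gamble_b)
b_dominates_a = first_order_dominates(gamble_b, gamble_a)
print(a_dominates_b, b_dominates_a)
```

Exact fractions are used so that ties between the two CDFs are not disturbed by floating-point rounding.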

In the case of non-intersecting distribution functions, the Wilcoxon rank-sum test tests for first-order stochastic dominance.

Equivalent definitions
Let $$\rho, \nu$$ be two probability distributions on $$\R$$ such that $$\mathbb E_{X\sim \rho}[|X|]$$ and $$\mathbb E_{X\sim \nu}[|X|]$$ are both finite. Then the following conditions are equivalent, so any of them may serve as the definition of first-order stochastic dominance:
 * For any $$u: \R \to \R$$ that is non-decreasing, $$\mathbb E_{X\sim \rho}[u(X)] \geq \mathbb E_{X\sim \nu}[u(X)].$$
 * $$F_\rho(t) \leq F_\nu(t), \quad \forall t \in \R.$$
 * There exist two random variables $$X\sim \rho, Y \sim \nu$$ such that $$X = Y + \delta$$, where $$\delta \geq 0$$.

The first definition states that a gamble $$\rho$$ first-order stochastically dominates gamble $$\nu$$ if and only if every expected utility maximizer with an increasing utility function prefers gamble $$\rho$$ over gamble $$\nu$$.

The third definition states that we can construct a pair of gambles $$X, Y$$ with distributions $$\rho, \nu$$, such that gamble $$X$$ always pays at least as much as gamble $$Y$$. More concretely, first construct a uniformly distributed $$Z\sim\mathrm{Uniform}(0, 1)$$, then use inverse transform sampling to get $$X = F_\rho^{-1}(Z), Y = F_\nu^{-1}(Z)$$; then $$X \geq Y$$ for every value of $$Z$$.
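The coupling construction can be sketched in a few lines. The example below assumes discrete distributions stored as outcome-to-probability dictionaries and uses the generalized inverse (quantile function) in place of $$F^{-1}$$, since a step-function CDF has no ordinary inverse. The function name `quantile` and the two distributions are illustrative assumptions.

```python
import random

def quantile(dist, z):
    """Generalized inverse CDF: the smallest outcome x with F(x) >= z."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= z:
            return value
    return max(dist)  # fallback in case rounding leaves the total just below z

# Hypothetical distributions; rho first-order dominates nu.
rho = {1: 0.5, 3: 0.5}
nu = {1: 0.5, 2: 0.5}

# Feed the SAME uniform draw into both quantile functions.
rng = random.Random(0)
pairs = []
for _ in range(10_000):
    z = rng.random()
    pairs.append((quantile(rho, z), quantile(nu, z)))

always_at_least = all(x >= y for x, y in pairs)
print(always_at_least)
```

Sharing one uniform draw is what makes the pair comonotone; drawing the two gambles independently would preserve the marginals but not the pointwise inequality.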

Pictorially, the second and third definitions are equivalent, because we can go from the graphed cumulative distribution function of A to that of B both by pushing it upwards and by pushing it leftwards.

Extended example
Consider three gambles over a single toss of a fair six-sided die:



$$\begin{array}{rcccccc} \text{State (die result)} & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \text{gamble A wins }\$ & 1 & 1 & 2 & 2 & 2 & 2 \\ \text{gamble B wins }\$ & 1 & 1 & 1 & 2 & 2 & 2 \\ \text{gamble C wins }\$ & 3 & 3 & 3 & 1 & 1 & 1 \\ \hline \end{array} $$

Gamble A statewise dominates gamble B because A gives at least as good a yield in all possible states (outcomes of the die roll) and gives a strictly better yield in one of them (state 3). Since A statewise dominates B, it also first-order dominates B.

Gamble C does not statewise dominate B because B gives a better yield in states 4 through 6, but C first-order stochastically dominates B because Pr(B ≥ 1) = Pr(C ≥ 1) = 1, Pr(B ≥ 2) = Pr(C ≥ 2) = 3/6, and Pr(B ≥ 3) = 0 while Pr(C ≥ 3) = 3/6 > Pr(B ≥ 3).

Gambles A and C cannot be ordered relative to each other on the basis of first-order stochastic dominance because Pr(A ≥ 2) = 4/6 > Pr(C ≥ 2) = 3/6 while on the other hand Pr(C ≥ 3) = 3/6 > Pr(A ≥ 3) = 0.

In general, when one gamble first-order stochastically dominates a second gamble, the expected value of the payoff under the first is greater than the expected value of the payoff under the second. The converse is not true: one cannot order lotteries with regard to stochastic dominance simply by comparing the means of their probability distributions. For instance, in the above example C has a higher mean (2) than does A (5/3), yet C does not first-order dominate A.
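The claims in this example can be checked mechanically. The sketch below takes the payoffs from the table of the three die gambles and assumes the six states are equally likely; the helper names are illustrative.

```python
from fractions import Fraction

# Payoffs by die result 1..6, taken from the table of gambles A, B and C.
payoffs = {
    "A": [1, 1, 2, 2, 2, 2],
    "B": [1, 1, 1, 2, 2, 2],
    "C": [3, 3, 3, 1, 1, 1],
}

def statewise_dominates(a, b):
    """At least as good in every state and strictly better in some state."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def prob_at_least(payoff, x):
    """P[payoff >= x] when the six states are equally likely."""
    return Fraction(sum(1 for v in payoff if v >= x), len(payoff))

def first_order_dominates(a, b):
    """P[a >= x] >= P[b >= x] at every threshold, strictly at some threshold."""
    # The probabilities only change at realized payoffs, so those thresholds suffice.
    thresholds = sorted(set(a) | set(b))
    gaps = [prob_at_least(a, x) - prob_at_least(b, x) for x in thresholds]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

def mean(payoff):
    return Fraction(sum(payoff), len(payoff))

A, B, C = payoffs["A"], payoffs["B"], payoffs["C"]
results = {
    "A statewise over B": statewise_dominates(A, B),
    "C statewise over B": statewise_dominates(C, B),
    "C first-order over B": first_order_dominates(C, B),
    "A first-order over C": first_order_dominates(A, C),
    "C first-order over A": first_order_dominates(C, A),
}
for claim, holds in results.items():
    print(f"{claim}: {holds}")
print("mean of C:", mean(C), "mean of A:", mean(A))
```

Statewise dominance compares the payoff lists position by position, while the first-order check only looks at the distribution of payoffs, which is why C can first-order dominate B without dominating it state by state.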

Second-order
The other commonly used type of stochastic dominance is second-order stochastic dominance. Roughly speaking, for two gambles $$\rho$$ and $$\nu$$, gamble $$\rho$$ has second-order stochastic dominance over gamble $$\nu$$ if the former is more predictable (i.e. involves less risk) and has at least as high a mean. All risk-averse expected-utility maximizers (that is, those with increasing and concave utility functions) prefer a second-order stochastically dominant gamble to a dominated one. Second-order dominance describes the shared preferences of a smaller class of decision-makers (those for whom more is better and who are averse to risk, rather than all those for whom more is better) than does first-order dominance.

In terms of cumulative distribution functions $$F_\rho$$ and $$F_\nu$$, $$\rho$$ is second-order stochastically dominant over $$\nu$$ if and only if $$\int_{-\infty}^x [F_\nu(t) - F_\rho(t)] \, dt \geq 0$$ for all $$x$$, with strict inequality at some $$x$$. Equivalently, $$\rho$$ dominates $$\nu$$ in the second order if and only if $$\mathbb E_{X\sim \rho}[u(X)] \geq \mathbb E_{X\sim \nu}[u(X)]$$ for all nondecreasing and concave utility functions $$u(x)$$.
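For finitely supported distributions the integral condition can be tested exactly. The sketch below relies on the standard identity $$\int_{-\infty}^x F(t)\,dt = \mathbb E[\max(x - X, 0)]$$ and on the fact that the resulting gap is piecewise linear with kinks only at support points, so it suffices to evaluate it there. The function names and the two example gambles are illustrative assumptions.

```python
from fractions import Fraction

def integrated_cdf(dist, x):
    """Integral of the CDF up to x, computed as E[max(x - X, 0)]."""
    return sum(p * max(x - value, 0) for value, p in dist.items())

def second_order_dominates(rho, nu):
    """Integral of F_nu - F_rho up to x is >= 0 for all x, and > 0 somewhere."""
    # The gap is piecewise linear with kinks at support points and is constant
    # beyond the largest one, so the union of the supports is enough.
    grid = sorted(set(rho) | set(nu))
    gaps = [integrated_cdf(nu, x) - integrated_cdf(rho, x) for x in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Hypothetical gambles with the same mean: a sure payoff of 2 versus a fair
# bet between 1 and 3.
sure_thing = {2: Fraction(1)}
fair_bet = {1: Fraction(1, 2), 3: Fraction(1, 2)}

sure_dominates_bet = second_order_dominates(sure_thing, fair_bet)
bet_dominates_sure = second_order_dominates(fair_bet, sure_thing)
print(sure_dominates_bet, bet_dominates_sure)
```

In this example neither gamble first-order dominates the other, because the two CDFs cross at 2, yet the sure payoff dominates in the second order.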

Second-order stochastic dominance can also be expressed as follows: Gamble $$\rho$$ second-order stochastically dominates $$\nu$$ if and only if there exist some gambles $$y$$ and $$z$$ such that $$x_\nu \overset {d}{=} (x_\rho + y + z)$$, where $$x_\rho \sim \rho$$ and $$x_\nu \sim \nu$$, with $$y$$ always less than or equal to zero, and with $$\mathbb E(z\mid x_\rho+y)=0$$ for all values of $$x_\rho+y$$. Here the introduction of random variable $$y$$ makes $$\nu$$ first-order stochastically dominated by $$\rho$$ (making $$\nu$$ disliked by those with an increasing utility function), and the introduction of random variable $$z$$ introduces a mean-preserving spread in $$\nu$$, which is disliked by those with concave utility. Note that if $$\rho$$ and $$\nu$$ have the same mean (so that the random variable $$y$$ degenerates to the fixed number 0), then $$\nu$$ is a mean-preserving spread of $$\rho$$.

Equivalent definitions
Let $$\rho, \nu$$ be two probability distributions on $$\R$$ such that $$\mathbb E_{X\sim \rho}[|X|]$$ and $$\mathbb E_{X\sim \nu}[|X|]$$ are both finite. Then the following conditions are equivalent, so any of them may serve as the definition of second-order stochastic dominance:

 * For any $$u: \R \to \R$$ that is non-decreasing and (not necessarily strictly) concave, $$\mathbb E_{X\sim \rho}[u(X)] \geq \mathbb E_{X\sim \nu}[u(X)].$$
 * $$\int_{-\infty}^t F_\rho(x) dx \leq \int_{-\infty}^t F_\nu(x) dx, \quad \forall t \in \R.$$
 * There exist two random variables $$X\sim \rho, Y \sim \nu$$ such that $$Y = X - \delta + \epsilon$$, where $$\delta \geq 0$$ and $$\mathbb E[\epsilon| X -\delta] = 0$$.

These are analogous to the equivalent definitions of first-order stochastic dominance given above.
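The decomposition $$Y = X - \delta + \epsilon$$ can be illustrated with a small exact enumeration. In the sketch below every ingredient is a hypothetical choice: $$X$$ is 4 or 9 with equal probability, $$\delta$$ is a constant 0.5, and $$\epsilon$$ is a fair $$\pm 1$$ coin flip independent of $$X$$, so that $$\mathbb E[\epsilon \mid X - \delta] = 0$$. It then compares expected utilities for a handful of nondecreasing concave utilities, which is only a spot check and not a proof over the whole class.

```python
import math
from itertools import product

# Hypothetical ingredients (value, probability).
x_outcomes = [(4.0, 0.5), (9.0, 0.5)]
delta = 0.5                                # nonnegative shift
eps_outcomes = [(-1.0, 0.5), (1.0, 0.5)]   # mean-zero noise, independent of X

# Distribution of Y = X - delta + eps under independence.
y_outcomes = [
    (x - delta + e, px * pe)
    for (x, px), (e, pe) in product(x_outcomes, eps_outcomes)
]

def expected_utility(u, outcomes):
    return sum(p * u(v) for v, p in outcomes)

# A few nondecreasing, concave utilities (all defined on the positive payoffs here).
utilities = {
    "linear": lambda w: w,
    "sqrt": math.sqrt,
    "log": math.log,
    "capped": lambda w: min(w, 5.0),
}

# Gap E[u(X)] - E[u(Y)]; second-order dominance of X over Y predicts gap >= 0.
gaps = {
    name: expected_utility(u, x_outcomes) - expected_utility(u, y_outcomes)
    for name, u in utilities.items()
}
for name, gap in gaps.items():
    print(f"{name}: {gap:.4f}")
```

The linear utility isolates the effect of $$\delta$$ (the gap equals 0.5), while the strictly concave utilities also penalize the added noise.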

Sufficient conditions

 * First-order stochastic dominance of A over B is a sufficient condition for second-order dominance of A over B.
 * If B is a mean-preserving spread of A, then A second-order stochastically dominates B.

Necessary conditions

 * $$\mathbb E_\rho(x) \geq \mathbb E_\nu(x)$$ is a necessary condition for $$\rho$$ to second-order stochastically dominate $$\nu$$.
 * $$\min_\rho(x)\geq \min_\nu(x)$$ is a necessary condition for $$\rho$$ to second-order dominate $$\nu$$. The condition implies that the left tail of $$F_\nu$$ must be thicker than the left tail of $$F_\rho$$.

Third-order
Let $$F_\rho$$ and $$F_\nu$$ be the cumulative distribution functions of two distinct investments $$\rho$$ and $$\nu$$. $$\rho$$ dominates $$\nu$$ in the third order if and only if both

 * $$\int_{-\infty}^x \left(\int_{-\infty}^z [F_\nu(t) - F_\rho(t)] \, dt\right) dz \geq 0$$ for all $$x$$,
 * $$\mathbb E_\rho(x) \geq \mathbb E_\nu(x) $$.

Equivalently, $$\rho$$ dominates $$\nu$$ in the third order if and only if $$\mathbb E_\rho U(x) \geq \mathbb E_\nu U(x)$$ for all $$U\in D_3$$.
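The two integral conditions can be tested exactly for finitely supported distributions. The sketch below is one possible approach, not a reference implementation: it uses the identities $$\int_{-\infty}^x F(t)\,dt = \mathbb E[\max(x-X,0)]$$ and $$\int_{-\infty}^x \int_{-\infty}^z F(t)\,dt\,dz = \tfrac12 \mathbb E[\max(x-X,0)^2]$$, and the observation that the double-integral gap is piecewise quadratic, so its minimum on each segment lies at an endpoint or where the single-integral gap crosses zero from below. All names and the example distributions are illustrative assumptions.

```python
from fractions import Fraction

def lpm(dist, x, order):
    """Lower partial moment E[max(x - X, 0)**order] of {outcome: probability}."""
    return sum(p * max(x - v, 0) ** order for v, p in dist.items())

def mean(dist):
    return sum(p * v for v, p in dist.items())

def gap2(rho, nu, x):
    """Integral of F_nu - F_rho up to x."""
    return lpm(nu, x, 1) - lpm(rho, x, 1)

def gap3(rho, nu, x):
    """Double integral of F_nu - F_rho up to x."""
    return (lpm(nu, x, 2) - lpm(rho, x, 2)) / 2

def third_order_dominates(rho, nu):
    grid = sorted(set(rho) | set(nu))
    candidates = list(grid)
    # On each segment gap3 is quadratic with derivative gap2 (linear there),
    # so an interior minimum can only occur where gap2 rises through zero.
    for a, b in zip(grid, grid[1:]):
        ga, gb = gap2(rho, nu, a), gap2(rho, nu, b)
        if ga < 0 < gb:
            candidates.append(a + (b - a) * (-ga) / (gb - ga))
    # Beyond the largest support point gap3 is linear with slope
    # mean(rho) - mean(nu), which the mean condition keeps nonnegative.
    integral_ok = all(gap3(rho, nu, x) >= 0 for x in candidates)
    return integral_ok and mean(rho) >= mean(nu)

# Hypothetical pair with equal mean (3) and equal variance (3): rho is
# skewed to the right, nu to the left.
rho = {2: Fraction(3, 4), 6: Fraction(1, 4)}
nu = {0: Fraction(1, 4), 4: Fraction(3, 4)}

rho_over_nu = third_order_dominates(rho, nu)
nu_over_rho = third_order_dominates(nu, rho)
second_order_gap_at_4 = gap2(rho, nu, 4)
print(rho_over_nu, nu_over_rho, second_order_gap_at_4)
```

In this example the single-integral gap is negative at 4, so the second-order criterion fails, while both third-order conditions hold.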

The set $$D_3$$ has two equivalent definitions:


 * the set of nondecreasing, concave utility functions that are positively skewed (that is, have a nonnegative third derivative throughout).
 * the set of nondecreasing, concave utility functions, such that for any random variable $$Z$$, the risk-premium function $$\pi_u(x, Z)$$ is a monotonically nonincreasing function of $$x$$.

Here, $$\pi_u(x, Z)$$ is defined as the solution to the problem $$u(x + \mathbb E[Z] - \pi) = \mathbb E [u(x + Z)].$$ See the risk premium article for more details.
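When the utility has a closed-form inverse, the defining equation can be solved for $$\pi$$ directly. The sketch below assumes logarithmic utility $$u(w) = \log w$$, which is nondecreasing and concave with a positive third derivative, and a hypothetical fair $$\pm 1$$ gamble $$Z$$; both are illustrative choices. It shows the premium falling as the base wealth $$x$$ grows, for this one gamble only.

```python
import math

def risk_premium_log(x, gamble):
    """Solve log(x + E[Z] - pi) = E[log(x + Z)] for pi.

    `gamble` is a list of (value, probability) pairs with x + value > 0.
    """
    mean_z = sum(p * z for z, p in gamble)
    expected_utility = sum(p * math.log(x + z) for z, p in gamble)
    # Invert u: the certainty equivalent is exp(E[log(x + Z)]).
    certainty_equivalent = math.exp(expected_utility)
    return x + mean_z - certainty_equivalent

# Hypothetical fair gamble: win or lose 1 with equal probability.
fair_gamble = [(-1.0, 0.5), (1.0, 0.5)]

wealth_levels = (2.0, 4.0, 8.0)
premiums = [risk_premium_log(w, fair_gamble) for w in wealth_levels]
for w, pi in zip(wealth_levels, premiums):
    print(f"x = {w}: premium = {pi:.4f}")
```

For utilities without an explicit inverse, the same equation would have to be solved numerically, for example by bisection on $$\pi$$.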

Sufficient condition

 * Second-order dominance is a sufficient condition.

Necessary conditions

 * $$\mathbb E_\rho(\log(x))\geq \mathbb E_\nu(\log(x))$$ is a necessary condition. The condition implies that the geometric mean of $$\rho$$ must be greater than or equal to the geometric mean of $$\nu$$.
 * $$\min_\rho(x)\geq\min_\nu(x)$$ is a necessary condition. The condition implies that the left tail of $$F_\nu$$ must be thicker than the left tail of $$F_\rho$$.

Higher-order
Higher orders of stochastic dominance have also been analyzed, as have generalizations of the dual relationship between stochastic dominance orderings and classes of preference functions. Arguably the most powerful dominance criterion relies on the accepted economic assumption of decreasing absolute risk aversion. This involves several analytical challenges, and research to address them is under way.

Formally, the n-th-order stochastic dominance is defined as follows:

 * For any probability distribution $$\rho$$ on $$[0, \infty)$$, define the functions inductively: $$F^1_\rho(t) = F_\rho(t), \quad F^2_\rho(t) = \int_0^t F^1_\rho(x)dx, \quad \cdots$$
 * For any two probability distributions $$\rho, \nu$$ on $$[0, \infty)$$, non-strict and strict n-th-order stochastic dominance is defined as $$\rho \succeq_n \nu \quad \text{ iff } \quad F^n_\rho \leq F^n_\nu \text { on } [0, \infty)$$ $$\rho \succ_n \nu \quad \text{ iff } \quad\rho \succeq_n \nu \text{ and } \rho \neq \nu$$
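The inductive definition can be evaluated in closed form for finitely supported distributions on $$[0, \infty)$$: for $$n \geq 2$$, integrating the CDF $$n-1$$ times from 0 gives $$F^n_\rho(t) = \mathbb E[\max(t - X, 0)^{n-1}] / (n-1)!$$. The sketch below uses this identity to compare two hypothetical distributions on a finite grid of points. A grid comparison is only a spot check of the condition "on $$[0, \infty)$$", and all names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def iterated_cdf(dist, t, n):
    """F^n(t) for a discrete distribution on [0, inf), given as {outcome: prob}."""
    if n == 1:
        return sum(p for value, p in dist.items() if value <= t)
    # For n >= 2: E[max(t - X, 0)**(n - 1)] / (n - 1)!
    total = sum(p * max(t - value, 0) ** (n - 1) for value, p in dist.items())
    return total / factorial(n - 1)

def dominates_on_grid(rho, nu, n, grid):
    """Spot-check F^n_rho <= F^n_nu at the grid points only."""
    return all(iterated_cdf(rho, t, n) <= iterated_cdf(nu, t, n) for t in grid)

# Hypothetical distributions with equal mean and variance.
rho = {2: Fraction(3, 4), 6: Fraction(1, 4)}
nu = {0: Fraction(1, 4), 4: Fraction(3, 4)}
grid = [Fraction(k, 4) for k in range(0, 41)]  # 0, 1/4, ..., 10

verdicts = {n: dominates_on_grid(rho, nu, n, grid) for n in (1, 2, 3, 4)}
print(verdicts)
```

For this pair the comparison fails at orders 1 and 2 and passes on the grid at orders 3 and 4, which illustrates how a higher order can rank gambles that lower orders leave unranked.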

These relations are transitive and increasingly inclusive. That is, if $$\rho \succeq_n \nu$$, then $$\rho \succeq_{k} \nu$$ for all $$k \geq n$$. Further, there exist $$\rho, \nu$$ such that $$\rho \succeq_{n+1} \nu$$ but not $$\rho \succeq_n \nu$$.

Define the k-th moment by $$\mu_k(\rho) = \mathbb E_{X\sim \rho}[X^k] = \int x^k dF_\rho(x)$$.

Constraints
Stochastic dominance relations may be used as constraints in problems of mathematical optimization, in particular stochastic programming. In a problem of maximizing a real functional $$ f(X)$$ over random variables $$ X $$ in a set $$ X_0 $$ we may additionally require that $$ X $$ stochastically dominates a fixed random benchmark $$ B $$. In these problems, utility functions play the role of Lagrange multipliers associated with stochastic dominance constraints. Under appropriate conditions, the solution of the problem is also a (possibly local) solution of the problem to maximize $$ f(X) + \mathbb E[u(X) - u(B)] $$ over $$ X $$ in $$ X_0 $$, where $$ u(x) $$ is a certain utility function. If the first-order stochastic dominance constraint is employed, the utility function $$ u(x) $$ is nondecreasing; if the second-order stochastic dominance constraint is used, $$ u(x) $$ is nondecreasing and concave. A system of linear equations can test whether a given solution is efficient for any such utility function. Third-order stochastic dominance constraints can be dealt with using convex quadratically constrained programming (QCP).
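As a toy illustration of a dominance-constrained problem, the sketch below takes $$f(X) = \mathbb E[X]$$, a finite candidate set $$X_0$$, and a non-strict second-order constraint against a benchmark $$B$$, all over four equally likely scenarios. Every number and name here is a hypothetical assumption, and the brute-force enumeration stands in for the Lagrangian and linear-programming methods mentioned above; it is not one of them.

```python
from fractions import Fraction

def shortfall(returns, t):
    """E[max(t - X, 0)] over equally likely scenarios."""
    return Fraction(sum(max(t - r, 0) for r in returns), len(returns))

def satisfies_ssd_constraint(returns, benchmark):
    """Non-strict second-order dominance of `returns` over `benchmark`."""
    # The shortfall gap is piecewise linear with kinks at realized values and
    # constant beyond the largest one, so these thresholds are enough.
    thresholds = sorted(set(returns) | set(benchmark))
    return all(shortfall(returns, t) <= shortfall(benchmark, t) for t in thresholds)

def expected_return(returns):
    return Fraction(sum(returns), len(returns))

# Hypothetical scenario returns (one entry per scenario).
benchmark = [1, 2, 3, 4]
candidates = {
    "safe": [2, 2, 3, 3],
    "risky": [0, 2, 4, 6],
    "tilted": [1, 2, 3, 5],
}

# Keep only candidates that dominate the benchmark, then maximize the mean.
feasible = {
    name: returns
    for name, returns in candidates.items()
    if satisfies_ssd_constraint(returns, benchmark)
}
best = max(feasible, key=lambda name: expected_return(feasible[name]))
print(sorted(feasible), best)
```

The candidate with the highest mean, "risky", is excluded because its worst scenario falls below the benchmark's, so the constraint changes the optimum rather than merely confirming it.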