Blackwell's informativeness theorem

In the mathematical subjects of information theory and decision theory, Blackwell's informativeness theorem is an important result related to the ranking of information structures, or experiments. It states that there is an equivalence between three possible rankings of information structures: one based on expected utility, one based on informativeness, and one based on feasibility. This ranking defines a partial order over information structures known as the Blackwell order, or Blackwell's criterion.

The theorem states equivalent conditions under which any expected utility maximizing decision maker prefers information structure $$\sigma $$ over $$\sigma '$$, for any decision problem. The result was first proven by David Blackwell in 1951, and generalized in 1953.

Decision making under uncertainty
A decision maker faces a set of possible states of the world $$\Omega $$ and a set of possible actions $$A $$ to take. For every $$\omega \in \Omega $$ and $$a \in A$$, her utility is $$u(a, \omega)$$. She does not know the state of the world $$\omega $$, but has a prior probability distribution $$p : \Omega \rightarrow [0, 1]$$ over the possible states. For every action she takes, her expected utility is


 * $$ \sum_{\omega \in \Omega} u(a, \omega) p(\omega) $$

Given such a prior $$ p $$, she chooses an action $$ a \in A $$ to maximize her expected utility. We denote this maximum attainable utility (the expected value of taking the optimal action) by


 * $$ V(p) = \underset{a \in A}\operatorname{max} \sum_{\omega \in \Omega} u(a, \omega) p (\omega) $$

We refer to the data $$(\Omega, A, u, p)$$ as a decision making problem.
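The computation of $$V(p)$$ can be sketched numerically. The following is a minimal illustration using NumPy; the two-state, two-action matching problem and all numbers in it are illustrative assumptions, not taken from the article.

```python
import numpy as np

# Hypothetical toy decision problem (illustrative only):
# two states, two actions, utility u(a, w) as a matrix.
u = np.array([[1.0, 0.0],   # action 0 pays off in state 0
              [0.0, 1.0]])  # action 1 pays off in state 1
p = np.array([0.7, 0.3])    # prior p(w) over the two states

# Expected utility of each action: sum_w u(a, w) p(w)
expected_u = u @ p

# V(p): the maximum attainable expected utility over actions
V = expected_u.max()
print(V)  # 0.7 -- the optimal action is action 0
```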

Information structures
An information structure (or an experiment) can be seen as a way to improve on the utility given by the prior, in the sense of providing more information to the decision maker. Formally, an information structure is a tuple $$(S, \sigma)$$, where $$S$$ is a signal space and $$\sigma : \Omega \rightarrow \Delta S$$ is a function which gives the conditional probability $$\sigma(s | \omega)$$ of observing signal $$s \in S $$ when the state of the world is $$\omega$$. An information structure can also be thought of as the setting of an experiment.

By observing the signal $$s $$, the decision maker can update her beliefs about the state of the world $$ \omega $$ via Bayes' rule, giving the posterior probability


 * $$ \pi (\omega | s) = \frac{p (\omega) \sigma(s | \omega)}{\pi (s)} $$

where $$ \pi(s) := \sum_{\omega' \in \Omega} p (\omega') \sigma(s | \omega') $$. By observing the signal $$ s $$ and updating her beliefs with the information structure $$(S, \sigma)$$, the decision maker's new expected utility value from taking the optimal action is


 * $$ V(\pi, s) = \underset{a \in A}\operatorname{max} \sum_{\omega \in \Omega} u(a, \omega) \pi (\omega | s) $$

and the "expected value of $$(S, \sigma)$$" for the decision maker (i.e., the expected value of taking the optimal action under the information structure) is defined as


 * $$ W(\sigma) = \sum_{s \in S} V(\pi, s) \pi (s) $$
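The Bayesian update and the value $$W(\sigma)$$ above can be computed directly. Below is a minimal self-contained sketch; the two-state, two-signal structure and the prior are hypothetical numbers chosen for illustration.

```python
import numpy as np

# Hypothetical toy setup: two states, two actions, two signals.
u = np.array([[1.0, 0.0],
              [0.0, 1.0]])          # u[a, w]
p = np.array([0.7, 0.3])            # prior p(w)
sigma = np.array([[0.9, 0.1],       # sigma[w, s] = sigma(s | w); rows sum to 1
                  [0.2, 0.8]])

# Marginal probability of each signal: pi(s) = sum_w p(w) sigma(s | w)
pi_s = p @ sigma

# Posterior pi(w | s) via Bayes' rule: p(w) sigma(s | w) / pi(s)
post = (p[:, None] * sigma) / pi_s  # post[w, s]

# V(pi, s): best expected utility after observing signal s
V_s = (u @ post).max(axis=0)

# W(sigma): expected value of the information structure
W = V_s @ pi_s
print(W)  # 0.87 for these illustrative numbers
```

Note that here $$W(\sigma) = 0.87$$ exceeds the no-information value $$V(p) = 0.7$$: observing the signal is (weakly) valuable.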

Garbling
If two information structures $$ (S, \sigma) $$ and $$ (S, \sigma') $$ have the same underlying signal space, we abuse notation slightly and refer to $$ \sigma $$ and $$ \sigma' $$ as information structures themselves. We say that $$ \sigma' $$ is a garbling of $$ \sigma $$ if there exists a stochastic map (for finite signal spaces $$S$$, a Markov matrix) $$\Gamma : S \rightarrow \Delta S $$ such that


 * $$\sigma' = \Gamma \sigma $$, i.e., $$ \sigma'(s' | \omega) = \sum_{s \in S} \Gamma(s' | s)\, \sigma(s | \omega) $$

Intuitively, garbling is a way of adding "noise" to an information structure, such that the garbled information structure is considered to be less informative.
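As a concrete sketch, garbling a finite information structure is just a matrix product with a Markov matrix. The structure and the noise level below are illustrative assumptions; with the row-stochastic convention `sigma[w, s]` used here, composing with the garbling kernel is right-multiplication (the transpose of the text's $$\Gamma \sigma$$ convention).

```python
import numpy as np

# A hypothetical information structure, sigma[w, s] = sigma(s | w)
sigma = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# Garbling kernel Gamma[s, s'] = Gamma(s' | s):
# with probability 0.25 the observed signal is flipped (binary symmetric noise)
Gamma = np.array([[0.75, 0.25],
                  [0.25, 0.75]])

# The garbled structure: sigma'(s' | w) = sum_s Gamma(s' | s) sigma(s | w)
sigma_prime = sigma @ Gamma
print(sigma_prime)  # [[0.7, 0.3], [0.35, 0.65]]
```

The garbled rows are still valid conditional distributions (each sums to 1), but they are "flatter": the signal now says less about the state.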

Feasibility
A mixed strategy in the context of a decision making problem is a function $$\alpha : S \rightarrow \Delta A$$ which gives, for every signal $$s \in S$$, a probability distribution $$ \alpha(a | s) $$ over possible actions in $$A$$. With the information structure $$ (S, \sigma) $$, a strategy $$ \alpha$$ induces a distribution over actions $$\alpha_{\sigma}( a |\omega)$$ conditional on the state of the world $$ \omega $$, given by the mapping


 * $$ \omega \mapsto \alpha_{\sigma}( a |\omega)= \sum_{s \in S} \alpha (a | s) \sigma (s | \omega) \in \Delta A$$

That is, $$\alpha_{\sigma}( a |\omega)$$ gives the probability of taking action $$a \in A$$ given that the state of the world is $$\omega \in \Omega$$ under information structure $$ (S, \sigma) $$ – notice that this is nothing but a convex combination of the $$ \alpha(a | s) $$ with weights $$\sigma (s | \omega)$$. We say that $$\alpha_{\sigma}( a |\omega)$$ is a feasible strategy (or conditional probability over actions) under $$ (S, \sigma) $$.
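The induced conditional distribution $$\alpha_{\sigma}(a | \omega)$$ is again a matrix product in the finite case. The following sketch uses a hypothetical structure and a hypothetical mixed strategy, with the row-stochastic convention `sigma[w, s]`, `alpha[s, a]`.

```python
import numpy as np

# Hypothetical information structure, sigma[w, s] = sigma(s | w)
sigma = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# A mixed strategy alpha[s, a] = alpha(a | s): mostly, but not always,
# "play the action matching the signal"
alpha = np.array([[0.9, 0.1],
                  [0.3, 0.7]])

# Induced distribution over actions conditional on the state:
# alpha_sigma(a | w) = sum_s alpha(a | s) sigma(s | w)
alpha_sigma = sigma @ alpha   # alpha_sigma[w, a]
print(alpha_sigma)  # [[0.84, 0.16], [0.42, 0.58]]
```

Each row of `alpha_sigma` is a convex combination of the rows of `alpha` with weights $$\sigma(s | \omega)$$, exactly as described above.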

Given an information structure $$ (S, \sigma) $$, let


 * $$ \Phi_{\sigma} = \{\alpha_{\sigma}(a \mid \omega) \mid \alpha : S \rightarrow \Delta A\} $$

be the set of all conditional probabilities over actions (i.e., strategies) that are feasible under $$ (S, \sigma) $$.

Given two information structures $$ (S, \sigma) $$ and $$ (S, \sigma') $$, we say that $$\sigma$$ yields a (weakly) larger set of feasible strategies than $$\sigma'$$ if


 * $$ \Phi_{\sigma'} \subseteq \Phi_{\sigma}$$

Statement
Blackwell's theorem states that, given any decision making problem $$ (\Omega, A, u, p) $$ and two information structures $$ \sigma $$ and $$ \sigma' $$, the following are equivalent:


 * 1) $$ W(\sigma') \leq W (\sigma) $$: that is, the decision maker attains a weakly higher expected utility under $$ \sigma $$ than under $$ \sigma' $$.
 * 2) There exists a stochastic map $$\Gamma $$ such that $$\sigma' = \Gamma \sigma $$: that is, $$ \sigma' $$ is a garbling of $$ \sigma $$.
 * 3) $$\Phi_{\sigma'} \subseteq \Phi_{\sigma} $$: that is, $$\sigma$$ yields a (weakly) larger set of feasible strategies than $$\sigma'$$.
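The implication from 2) to 1) can be checked numerically in a toy case: garbling a structure never raises the decision maker's value. The problem, structure, and noise level below are illustrative assumptions (this is a single numeric check, not a proof).

```python
import numpy as np

# Hypothetical decision problem and information structure
u = np.array([[1.0, 0.0],
              [0.0, 1.0]])                 # u[a, w]
p = np.array([0.7, 0.3])                   # prior p(w)
sigma = np.array([[0.9, 0.1],
                  [0.2, 0.8]])             # sigma[w, s] = sigma(s | w)
Gamma = np.array([[0.75, 0.25],
                  [0.25, 0.75]])           # binary symmetric garbling kernel
sigma_prime = sigma @ Gamma                # the garbled structure

def W(sig):
    """Expected value of taking the optimal action under structure sig."""
    pi_s = p @ sig                         # signal marginals pi(s)
    post = (p[:, None] * sig) / pi_s       # posteriors pi(w | s)
    return (u @ post).max(axis=0) @ pi_s   # sum_s V(pi, s) pi(s)

print(W(sigma), W(sigma_prime))  # W(sigma') <= W(sigma), as condition 1) asserts
```

With these numbers, $$W(\sigma) = 0.87$$ while the garbled structure yields only $$W(\sigma') = 0.70$$: the added noise destroys part of the signal's value.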

Definition
Blackwell's theorem allows us to construct a partial order over information structures. We say that $$\sigma$$ is more informative in the sense of Blackwell (or simply Blackwell more informative) than $$\sigma'$$ if any (and therefore all) of the conditions of Blackwell's theorem holds, and write $$\sigma' \preceq_B \sigma $$.

The order $$\preceq_B $$ is not a complete one, and most pairs of experiments cannot be ranked by it: for a typical pair of information structures, neither is a garbling of the other, so $$\preceq_B $$ is only a partial order on the set of information structures.

Applications
The Blackwell order has many applications in decision theory and economics, in particular in contract theory. For example, if two information structures in a principal-agent model can be ranked in the Blackwell sense, then the more informative one is more efficient in the sense of inducing a smaller cost for second-best implementation.