Quantum mutual information

In quantum information theory, quantum mutual information, or von Neumann mutual information, after John von Neumann, is a measure of correlation between subsystems of a quantum state. It is the quantum mechanical analog of Shannon mutual information.

Motivation
For simplicity, it will be assumed that all objects in the article are finite-dimensional.

The definition of quantum mutual information is motivated by the classical case. For a joint probability distribution p(x, y) of two variables, the two marginal distributions are


 * $$p(x) = \sum_{y} p(x,y), \qquad p(y) = \sum_{x} p(x,y).$$

The classical mutual information I(X:Y) is defined by


 * $$I(X:Y) = S(p(x)) + S(p(y)) - S(p(x,y))$$

where S(q) denotes the Shannon entropy of the probability distribution q.

One can calculate directly


 * $$\begin{align} S(p(x)) + S(p(y)) &= - \left (\sum_x p(x) \log p(x) + \sum_y p(y) \log p(y) \right ) \\ &= -\left (\sum_x \left ( \sum_{y'} p(x,y') \log \sum_{y'} p(x,y') \right ) + \sum_y \left ( \sum_{x'} p(x',y) \log \sum_{x'} p(x',y) \right ) \right ) \\ &= -\left (\sum_{x,y} p(x,y) \left (\log \sum_{y'} p(x,y') + \log \sum_{x'} p(x',y) \right ) \right )\\ &= -\sum_{x,y} p(x,y) \log p(x) p(y) \end{align}$$

So the mutual information is


 * $$I(X:Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x) p(y)},$$

where the logarithm is taken to base 2 to obtain the mutual information in bits. This is precisely the relative entropy between p(x, y) and p(x)p(y). In other words, if we assume the two variables x and y to be uncorrelated, mutual information is the discrepancy in uncertainty resulting from this (possibly erroneous) assumption.

It follows from the non-negativity of relative entropy that I(X:Y) ≥ 0, with equality if and only if p(x, y) = p(x)p(y).
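The two expressions for the classical mutual information can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the three example tables are illustrative choices, not part of the article.

```python
import numpy as np


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def mutual_information(pxy):
    """I(X:Y) = S(p(x)) + S(p(y)) - S(p(x,y)) for a joint table pxy[x, y]."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1)  # marginal p(x): sum over y
    py = pxy.sum(axis=0)  # marginal p(y): sum over x
    return shannon_entropy(px) + shannon_entropy(py) - shannon_entropy(pxy)


def mutual_information_as_relative_entropy(pxy):
    """The same quantity as the relative entropy between p(x,y) and p(x)p(y)."""
    pxy = np.asarray(pxy, dtype=float)
    product = np.outer(pxy.sum(axis=1), pxy.sum(axis=0))
    support = pxy > 0  # terms with p(x,y) = 0 contribute 0
    return float(np.sum(pxy[support] * np.log2(pxy[support] / product[support])))


# Illustrative joint distributions (assumptions made for this sketch).
correlated = np.array([[0.5, 0.0], [0.0, 0.5]])  # x and y always agree
independent = np.outer([0.3, 0.7], [0.6, 0.4])   # p(x, y) = p(x) p(y)
noisy = np.array([[0.4, 0.1], [0.2, 0.3]])       # partially correlated

for name, table in [("correlated", correlated),
                    ("independent", independent),
                    ("noisy", noisy)]:
    print(name,
          mutual_information(table),
          mutual_information_as_relative_entropy(table))
```

The perfectly correlated table gives one bit, the product table gives zero up to rounding, and the two formulas agree on every table.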

Definition
The quantum mechanical counterparts of classical probability distributions are density matrices.

Consider a quantum system that can be divided into two parts, A and B, such that independent measurements can be made on either part. The state space of the entire quantum system is then the tensor product of the spaces for the two parts.


 * $$H_{AB} := H_A \otimes H_B.$$

Let $$\rho^{AB}$$ be a density matrix acting on $$H_{AB}$$. The von Neumann entropy of a density matrix ρ, denoted S(ρ), is the quantum mechanical analog of the Shannon entropy:


 * $$S(\rho) = - \operatorname{Tr} \rho \log \rho.$$

For a probability distribution p(x,y), the marginal distributions are obtained by summing over the variable x or y. The corresponding operation for density matrices is the partial trace. So one can assign to $$\rho^{AB}$$ a state on the subsystem A by


 * $$\rho^A = \operatorname{Tr}_B \; \rho^{AB}$$

where $$\operatorname{Tr}_B$$ is the partial trace with respect to system B. This is the reduced state of $$\rho^{AB}$$ on system A. The reduced von Neumann entropy of $$\rho^{AB}$$ with respect to system A is


 * $$\;S(\rho^A).$$

$$S(\rho^B)$$ is defined in the same way.

It can now be seen that the definition of quantum mutual information, corresponding to the classical definition, should be as follows.


 * $$\; I(A\!:\!B) := S(\rho^A) + S(\rho^B) - S(\rho^{AB}).$$

Quantum mutual information can be interpreted the same way as in the classical case: it can be shown that


 * $$I(A\!:\!B) = S(\rho^{AB} \| \rho^A \otimes \rho^B)$$

where $$S(\cdot \| \cdot)$$ denotes quantum relative entropy. Note that there is an alternative generalization of mutual information to the quantum case. The difference between the two for a given state is called quantum discord, a measure for the quantum correlations of the state in question.
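The definition and the relative-entropy identity above can be compared numerically. The following is a minimal sketch, assuming NumPy and a two-qubit system. The helper names and the test state (a Bell state mixed with white noise, chosen so that every matrix is full rank and its logarithm exists) are assumptions made for illustration.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy S(rho) = -Tr(rho log2 rho), in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # 0 log 0 is taken to be 0
    return float(-np.sum(evals * np.log2(evals)))


def partial_trace(rho_ab, dim_a, dim_b, keep):
    """Reduced state on subsystem 'A' or 'B' of a state on H_A (x) H_B."""
    r = np.asarray(rho_ab).reshape(dim_a, dim_b, dim_a, dim_b)
    if keep == "A":
        return np.trace(r, axis1=1, axis2=3)  # trace out B
    if keep == "B":
        return np.trace(r, axis1=0, axis2=2)  # trace out A
    raise ValueError("keep must be 'A' or 'B'")


def mutual_information(rho_ab, dim_a, dim_b):
    """I(A:B) = S(rho^A) + S(rho^B) - S(rho^AB)."""
    rho_a = partial_trace(rho_ab, dim_a, dim_b, "A")
    rho_b = partial_trace(rho_ab, dim_a, dim_b, "B")
    return entropy(rho_a) + entropy(rho_b) - entropy(rho_ab)


def log2m(rho):
    """Base-2 matrix logarithm of a positive-definite Hermitian matrix."""
    evals, vecs = np.linalg.eigh(rho)
    return (vecs * np.log2(evals)) @ vecs.conj().T


def relative_entropy(rho, sigma):
    """S(rho || sigma) = Tr rho (log2 rho - log2 sigma), full-rank inputs only."""
    return float(np.real(np.trace(rho @ (log2m(rho) - log2m(sigma)))))


# Illustrative full-rank state: half Bell state, half maximally mixed state.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_ab = 0.5 * np.outer(bell, bell) + 0.5 * np.eye(4) / 4

rho_a = partial_trace(rho_ab, 2, 2, "A")
rho_b = partial_trace(rho_ab, 2, 2, "B")

i_entropy = mutual_information(rho_ab, 2, 2)
i_relative = relative_entropy(rho_ab, np.kron(rho_a, rho_b))
print(i_entropy, i_relative)  # the two values should agree up to rounding
```

The `relative_entropy` helper is deliberately restricted to full-rank matrices; states with zero eigenvalues would need the logarithm to be restricted to the support.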

Properties
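When the state $$\rho^{AB}$$ is pure (and thus $$S(\rho^{AB})=0$$), the reduced states $$\rho^A$$ and $$\rho^B$$ have the same nonzero eigenvalues by the Schmidt decomposition, so $$S(\rho^A)=S(\rho^B)$$ and the mutual information is twice the entanglement entropy of the state: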
 * $$I(A\!:\!B) = S(\rho^A) + S(\rho^B) - S(\rho^{AB}) = S(\rho^A) + S(\rho^B) = 2S(\rho^A)$$

A positive quantum mutual information is not necessarily indicative of entanglement, however. A classical mixture of separable states always has zero entanglement, but can have nonzero quantum mutual information, such as
 * $$\rho^{AB} = \frac{1}{2}\left(|00\rangle\langle00| + |11\rangle\langle11|\right)$$

 * $$\begin{aligned} I(A\!:\!B) &= S(\rho^A) + S(\rho^B) - S(\rho^{AB})\\ &= S\left(\frac{1}{2}(|0\rangle\langle0| + |1\rangle\langle1|)\right) + S\left(\frac{1}{2}(|0\rangle\langle0| + |1\rangle\langle1|)\right) - S\left(\frac{1}{2}(|00\rangle\langle00| + |11\rangle\langle11|)\right)\\ &= \log 2 +\log 2 - \log 2= \log 2 \end{aligned}$$

In this case, the state is merely a classically correlated state.
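The two cases in this section can be contrasted numerically. The following is a minimal sketch, assuming NumPy and base-2 logarithms, so that log 2 corresponds to one bit. The Bell state is an added comparison case for the pure-state property, and the function names are illustrative.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy in bits, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # 0 log 0 is taken to be 0
    return float(-np.sum(evals * np.log2(evals)))


def two_qubit_mutual_information(rho_ab):
    """I(A:B) for a 4x4 density matrix on two qubits."""
    r = rho_ab.reshape(2, 2, 2, 2)    # indices: a, b, a', b'
    rho_a = np.einsum("ibjb->ij", r)  # partial trace over B
    rho_b = np.einsum("aiaj->ij", r)  # partial trace over A
    return entropy(rho_a) + entropy(rho_b) - entropy(rho_ab)


ket00 = np.array([1.0, 0.0, 0.0, 0.0])
ket11 = np.array([0.0, 0.0, 0.0, 1.0])

# Classical mixture (|00><00| + |11><11|) / 2: separable, yet correlated.
classical = 0.5 * (np.outer(ket00, ket00) + np.outer(ket11, ket11))

# Pure entangled state (|00> + |11>) / sqrt(2), added for comparison.
bell_vec = (ket00 + ket11) / np.sqrt(2)
bell = np.outer(bell_vec, bell_vec)

i_classical = two_qubit_mutual_information(classical)  # log 2, i.e. one bit
i_bell = two_qubit_mutual_information(bell)            # 2 S(rho^A), i.e. two bits
print(i_classical, i_bell)
```

Both states have the same reduced states, so the difference in mutual information comes entirely from the joint entropy, which is one bit for the mixture and zero for the pure state.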

Multiparty generalization
Suppose a system is composed of n subsystems $$ A_1,\dots,A_n $$. Then


 * $$I(A_1\!:\!A_2:\dots:A_n) = \sum [S(X_{k_1},\,X_{k_2},\dots,X_{k_{n-1}})]-(n-1)S(A_1,\,A_2,\,\dots,\,A_n)$$

where $$X_{k_i} \in \{A_1,\,A_2,\,\dots,\,A_n\}$$ and the sum runs over all distinct combinations of n − 1 of the subsystems, without repetition.

For example, take $$ n=3 $$:


 * $$ I(A\!:\!B\!:\!C)=S(AB)+S(AC)+S(BC)-2S(ABC)$$

Take now $$n=4$$:


 * $$ I(A_1\!:\!A_2\!:\!A_3\!:\!A_4)=S(A_1A_2A_3)+S(A_1A_2A_4)+S(A_1A_3A_4)+S(A_2A_3A_4)-3S(A_1A_2A_3A_4)$$

Note that each term in the sum is obtained by taking the partial trace over one subsystem at a time. In the $$ n=4 $$ example, the first term traces over $$A_4$$, the second over $$A_3$$, and so on.
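The multiparty formula can be sketched in code by tracing out one subsystem at a time, as described above. This is a minimal illustration assuming NumPy. The function names and the test states (a three-qubit GHZ state and the maximally mixed state) are assumptions made for the example.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # 0 log 0 is taken to be 0
    return float(-np.sum(evals * np.log2(evals)))


def trace_out(rho, dims, k):
    """Partial trace over subsystem k of a state on subsystems of sizes dims."""
    n = len(dims)
    r = np.asarray(rho).reshape(list(dims) + list(dims))
    reduced = np.trace(r, axis1=k, axis2=k + n)  # drops row and column index k
    d = int(np.prod(dims)) // dims[k]
    return reduced.reshape(d, d)


def multiparty_mutual_information(rho, dims):
    """Sum of the entropies of all (n-1)-party marginals minus (n-1) S(A_1...A_n)."""
    n = len(dims)
    marginals = sum(entropy(trace_out(rho, dims, k)) for k in range(n))
    return marginals - (n - 1) * entropy(rho)


# Three-qubit GHZ state (|000> + |111>) / sqrt(2).
ghz_vec = np.zeros(8)
ghz_vec[0] = ghz_vec[7] = 1 / np.sqrt(2)
ghz = np.outer(ghz_vec, ghz_vec)

# Maximally mixed three-qubit state: no correlations at all.
mixed = np.eye(8) / 8

i_ghz = multiparty_mutual_information(ghz, [2, 2, 2])
i_mixed = multiparty_mutual_information(mixed, [2, 2, 2])
print(i_ghz, i_mixed)
```

For n = 2 the function reduces to the bipartite definition, since tracing out one of two subsystems leaves the reduced state of the other.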