Entropic uncertainty

In quantum mechanics, information theory, and Fourier analysis, the entropic uncertainty or Hirschman uncertainty is defined as the sum of the temporal and spectral Shannon entropies. It turns out that Heisenberg's uncertainty principle can be expressed as a lower bound on the sum of these entropies. This is stronger than the usual statement of the uncertainty principle in terms of the product of standard deviations.

In 1957, Hirschman considered a function f and its Fourier transform g such that
 * $$g(y) \approx \int_{-\infty}^\infty \exp (-2\pi ixy) f(x)\, dx,\qquad f(x) \approx \int_{-\infty}^\infty \exp (2\pi ixy) g(y)\, dy ~,$$

where "≈" indicates convergence in $L^2$, and where f is normalized so that (by Plancherel's theorem),
 * $$ \int_{-\infty}^\infty |f(x)|^2\, dx = \int_{-\infty}^\infty |g(y)|^2 \,dy = 1~.$$

He showed that for any such functions the sum of the Shannon entropies is non-negative,
 * $$ H(|f|^2) + H(|g|^2) \equiv - \int_{-\infty}^\infty |f(x)|^2 \log |f(x)|^2\, dx - \int_{-\infty}^\infty |g(y)|^2 \log |g(y)|^2 \,dy \ge 0. $$

A tighter bound,
 * $$ H(|f|^2) + H(|g|^2) \ge \log\frac e 2 ~,$$

was conjectured by Hirschman and Everett, proven in 1975 by W. Beckner, and in the same year interpreted as a generalized quantum mechanical uncertainty principle by Białynicki-Birula and Mycielski. Equality holds in the case of Gaussian distributions. Note, however, that the above entropic uncertainty function is distinctly different from the quantum von Neumann entropy represented in phase space.
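As an illustration, the bound log(e/2) can be checked numerically for two transform pairs whose closed forms are known. The sketch below is only a sanity check under stated assumptions: it uses NumPy, a plain Riemann sum on a truncated grid, natural logarithms, and two hand-picked test functions (a Gaussian and a two-sided exponential) that are not taken from the sources cited here. The helper name `shannon_entropy` is hypothetical.

```python
import numpy as np

def shannon_entropy(density, step):
    """Riemann-sum estimate of -integral(p log p), natural log (nats)."""
    p = density[density > 0]          # drop zeros so log() stays finite
    return float(-np.sum(p * np.log(p)) * step)

# One grid serves for both x and y; the truncated tails are negligible here.
grid = np.linspace(-40.0, 40.0, 800_001)
step = grid[1] - grid[0]

bound = 1.0 - np.log(2.0)             # log(e/2) in nats

# Gaussian: f(x) = 2**0.25 * exp(-pi x^2) is its own transform for the
# kernel exp(-2 pi i x y), so |f|^2 and |g|^2 coincide.
gauss_f2 = np.sqrt(2.0) * np.exp(-2.0 * np.pi * grid**2)
gauss_sum = 2.0 * shannon_entropy(gauss_f2, step)

# Two-sided exponential: f(x) = exp(-|x|), g(y) = 2 / (1 + 4 pi^2 y^2).
exp_f2 = np.exp(-2.0 * np.abs(grid))
exp_g2 = 4.0 / (1.0 + 4.0 * np.pi**2 * grid**2) ** 2
exp_sum = shannon_entropy(exp_f2, step) + shannon_entropy(exp_g2, step)

print(f"bound       = {bound:.6f}")
print(f"Gaussian    = {gauss_sum:.6f}")
print(f"exponential = {exp_sum:.6f}")
```

The Gaussian sum should agree with the bound to within discretization error, while the exponential pair should land strictly above it.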

Sketch of proof
The proof of this tight inequality depends on the so-called (q, p)-norm of the Fourier transformation. (Establishing this norm is the most difficult part of the proof.)

From this norm, one is able to establish a lower bound on the sum of the (differential) Rényi entropies, $H_\alpha(|f|^2)+H_\beta(|g|^2)$, where $1/\alpha + 1/\beta = 2$, which generalize the Shannon entropies. For simplicity, we consider this inequality only in one dimension; the extension to multiple dimensions is straightforward and can be found in the literature cited.

Babenko–Beckner inequality
The (q, p)-norm of the Fourier transform is defined to be


 * $$\|\mathcal F\|_{q,p} = \sup_{f\in L^p(\mathbb R)} \frac{\|\mathcal Ff\|_q}{\|f\|_p},$$  where $$1 < p \le 2~,$$   and $$\frac 1 p + \frac 1 q = 1.$$

In 1961, Babenko found this norm for even integer values of q. Finally, in 1975, using Hermite functions as eigenfunctions of the Fourier transform, Beckner proved that the value of this norm (in one dimension) for all q ≥ 2 is
 * $$\|\mathcal F\|_{q,p} = \sqrt{p^{1/p}/q^{1/q}}.$$

Thus we have the Babenko–Beckner inequality that
 * $$\|\mathcal Ff\|_q \le \left(p^{1/p}/q^{1/q}\right)^{1/2} \|f\|_p.$$
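A small numerical sketch can make the constant concrete. It assumes NumPy, Riemann sums on a truncated grid, the exponent p = 3/2 (so q = 3), and two test functions with known transforms chosen purely for illustration. The function names `babenko_beckner_constant` and `lp_norm` are hypothetical helpers, not part of any library.

```python
import numpy as np

def babenko_beckner_constant(p):
    """sqrt(p**(1/p) / q**(1/q)), with q the conjugate exponent of p."""
    q = p / (p - 1.0)
    return float(np.sqrt(p ** (1.0 / p) / q ** (1.0 / q)))

def lp_norm(values, step, exponent):
    """Riemann-sum estimate of the L^exponent norm of sampled values."""
    return float((np.sum(np.abs(values) ** exponent) * step) ** (1.0 / exponent))

grid = np.linspace(-40.0, 40.0, 800_001)
step = grid[1] - grid[0]

p = 1.5
q = p / (p - 1.0)                      # conjugate exponent, here 3
constant = babenko_beckner_constant(p)

# Gaussian exp(-pi x^2): its own transform for the kernel exp(-2 pi i x y).
gauss = np.exp(-np.pi * grid**2)
gauss_ratio = lp_norm(gauss, step, q) / lp_norm(gauss, step, p)

# exp(-|x|) has transform 2 / (1 + 4 pi^2 y^2).
expo = np.exp(-np.abs(grid))
expo_hat = 2.0 / (1.0 + 4.0 * np.pi**2 * grid**2)
expo_ratio = lp_norm(expo_hat, step, q) / lp_norm(expo, step, p)

print(f"constant          = {constant:.6f}")
print(f"Gaussian ratio    = {gauss_ratio:.6f}")
print(f"exponential ratio = {expo_ratio:.6f}")
```

In this sketch the Gaussian ratio should match the constant up to discretization error, and the exponential pair should fall slightly below it.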

Rényi entropy bound
From this inequality, an expression of the uncertainty principle in terms of the Rényi entropy can be derived.

Letting $$g=\mathcal Ff$$, $2\alpha=p$, and $2\beta=q$, so that $1/\alpha + 1/\beta = 2$ and $1/2<\alpha<1<\beta$, we have
 * $$\left(\int_{\mathbb R} |g(y)|^{2\beta}\,dy\right)^{1/2\beta} \le \frac{(2\alpha)^{1/4\alpha}}{(2\beta)^{1/4\beta}} \left(\int_{\mathbb R} |f(x)|^{2\alpha}\,dx\right)^{1/2\alpha}. $$

Squaring both sides and taking the logarithm, we get
 * $$\frac 1\beta \log\left(\int_{\mathbb R} |g(y)|^{2\beta}\,dy\right) \le \frac 1 2 \log\frac{(2\alpha)^{1/\alpha}}{(2\beta)^{1/\beta}} + \frac 1\alpha \log \left(\int_{\mathbb R} |f(x)|^{2\alpha}\,dx\right). $$

Multiplying both sides by
 * $$\frac{\beta}{1-\beta}=-\frac{\alpha}{1-\alpha},$$

which is negative because $\beta > 1$, reverses the sense of the inequality,
 * $$\frac {1}{1-\beta} \log\left(\int_{\mathbb R} |g(y)|^{2\beta}\,dy\right) \ge \frac\alpha{2(\alpha-1)}\log\frac{(2\alpha)^{1/\alpha}}{(2\beta)^{1/\beta}} - \frac{1}{1-\alpha} \log \left(\int_{\mathbb R} |f(x)|^{2\alpha}\,dx\right) ~. $$

Rearranging terms finally yields an inequality in terms of the sum of the Rényi entropies,
 * $$\frac{1}{1-\alpha} \log \left(\int_{\mathbb R} |f(x)|^{2\alpha}\,dx\right) + \frac {1}{1-\beta} \log\left(\int_{\mathbb R} |g(y)|^{2\beta}\,dy\right) \ge \frac\alpha{2(\alpha-1)}\log\frac{(2\alpha)^{1/\alpha}}{(2\beta)^{1/\beta}}. $$

The left-hand side is the sum of the Rényi entropies, $H_\alpha(\phi) = \frac{1}{1-\alpha}\log\int_{\mathbb R}\phi(x)^\alpha\,dx$, and the right-hand side simplifies by means of $1/\alpha + 1/\beta = 2$, so that
 * $$ H_\alpha(|f|^2) + H_\beta(|g|^2) \ge \frac 1 2 \left(\frac{\log\alpha}{\alpha-1}+\frac{\log\beta}{\beta-1}\right) - \log 2    ~.$$

Note that this inequality is symmetric with respect to $\alpha$ and $\beta$: one no longer needs to assume that $\alpha<\beta$, only that they are positive and not both one, and that $1/\alpha + 1/\beta = 2$. To see this symmetry, simply exchange the rôles of $i$ and $-i$ in the Fourier transform.
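The algebra leading to the Rényi bound can be spot-checked with a few lines of Python. The sketch below assumes natural logarithms and uses hypothetical helper names. It evaluates the right-hand side both before and after simplification, checks the symmetry under exchanging the two orders, and compares the bound with the closed-form Rényi entropy of a normal density, $\tfrac12\log(2\pi\sigma^2) + \tfrac12\log\alpha/(\alpha-1)$, evaluated for the Gaussian $f(x) = 2^{1/4}e^{-\pi x^2}$ (a test function chosen here for illustration).

```python
import math

def conjugate_order(alpha):
    """Return beta with 1/alpha + 1/beta = 2 (requires alpha > 1/2, alpha != 1)."""
    return alpha / (2.0 * alpha - 1.0)

def bound_unsimplified(alpha):
    """alpha / (2 (alpha - 1)) * log((2 alpha)**(1/alpha) / (2 beta)**(1/beta))."""
    beta = conjugate_order(alpha)
    log_ratio = math.log(2.0 * alpha) / alpha - math.log(2.0 * beta) / beta
    return alpha / (2.0 * (alpha - 1.0)) * log_ratio

def bound_simplified(alpha):
    """(1/2) (log(alpha)/(alpha-1) + log(beta)/(beta-1)) - log 2."""
    beta = conjugate_order(alpha)
    return 0.5 * (math.log(alpha) / (alpha - 1.0)
                  + math.log(beta) / (beta - 1.0)) - math.log(2.0)

def gaussian_renyi_entropy(order, variance):
    """Closed-form Renyi entropy (nats) of a normal density with this variance."""
    return (0.5 * math.log(2.0 * math.pi * variance)
            + 0.5 * math.log(order) / (order - 1.0))

alpha = 0.75
beta = conjugate_order(alpha)          # 1.5
variance = 1.0 / (4.0 * math.pi)       # variance of |f|^2 = sqrt(2) exp(-2 pi x^2)

# For this f, |g|^2 has the same density as |f|^2.
gaussian_sum = (gaussian_renyi_entropy(alpha, variance)
                + gaussian_renyi_entropy(beta, variance))

print(bound_unsimplified(alpha), bound_simplified(alpha), gaussian_sum)
```

All three printed values should coincide up to rounding, which is consistent with the Gaussian attaining the bound for this pair of orders.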

Shannon entropy bound
Taking the limit of this last inequality as α, β → 1 yields the less general Shannon entropy inequality,
 * $$H(|f|^2) + H(|g|^2) \ge \log\frac e 2,\quad\textrm{where}\quad g(y) \approx \int_{\mathbb R} e^{-2\pi ixy}f(x)\,dx~,$$

valid for any base of logarithm, as long as we choose an appropriate unit of information, bit, nat, etc.
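The limit and the remark about units can be illustrated numerically. This sketch assumes only the Python standard library. `renyi_bound` is a hypothetical name for the right-hand side of the Rényi inequality, evaluated at orders close to 1 instead of taking the limit analytically.

```python
import math

def renyi_bound(alpha):
    """(1/2) (log(alpha)/(alpha-1) + log(beta)/(beta-1)) - log 2, in nats."""
    beta = alpha / (2.0 * alpha - 1.0)   # enforces 1/alpha + 1/beta = 2
    return 0.5 * (math.log(alpha) / (alpha - 1.0)
                  + math.log(beta) / (beta - 1.0)) - math.log(2.0)

shannon_bound_nats = 1.0 - math.log(2.0)                 # log(e/2), natural log
shannon_bound_bits = shannon_bound_nats / math.log(2.0)  # same bound, base 2

for eps in (1e-2, 1e-4, 1e-6):
    print(f"alpha = 1 + {eps:g}: bound = {renyi_bound(1.0 + eps):.8f}")

print(f"log(e/2) in nats: {shannon_bound_nats:.8f}")
print(f"log(e/2) in bits: {shannon_bound_bits:.8f}")
```

Changing the base of the logarithm rescales both sides of the inequality by the same factor, so the statement itself is unaffected.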

The constant will be different, though, for a different normalization of the Fourier transform (such as is usually used in physics, with normalizations chosen so that ħ = 1), i.e.,
 * $$H(|f|^2) + H(|g|^2) \ge \log(\pi e)\quad\textrm{for}\quad g(y) \approx \frac 1{\sqrt{2\pi}}\int_{\mathbb R} e^{-ixy}f(x)\,dx~.$$

In this case, the dilation of the Fourier transform absolute squared by a factor of $2\pi$ simply adds $\log(2\pi)$ to its entropy.
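The shift by log(2π) can be checked on a concrete example. The sketch below assumes NumPy, natural logarithms, a Riemann sum on a truncated grid, and the Gaussian $f(x) = 2^{1/4}e^{-\pi x^2}$ as an illustrative test function. Its transform is written out by hand for both normalizations, and `shannon_entropy` is a hypothetical helper.

```python
import numpy as np

def shannon_entropy(density, step):
    """Riemann-sum estimate of -integral(p log p), natural log (nats)."""
    p = density[density > 0]
    return float(-np.sum(p * np.log(p)) * step)

grid = np.linspace(-40.0, 40.0, 800_001)
step = grid[1] - grid[0]

# |f(x)|^2 for f(x) = 2**0.25 * exp(-pi x^2).
f2 = np.sqrt(2.0) * np.exp(-2.0 * np.pi * grid**2)

# |g|^2 with the kernel exp(-2 pi i x y): same Gaussian as |f|^2.
g2_unit = np.sqrt(2.0) * np.exp(-2.0 * np.pi * grid**2)

# |g|^2 with the kernel exp(-i x y) / sqrt(2 pi): g2_unit dilated by 2 pi.
g2_phys = np.sqrt(2.0) / (2.0 * np.pi) * np.exp(-grid**2 / (2.0 * np.pi))

entropy_shift = shannon_entropy(g2_phys, step) - shannon_entropy(g2_unit, step)
physics_sum = shannon_entropy(f2, step) + shannon_entropy(g2_phys, step)

print(f"entropy shift = {entropy_shift:.6f}, log(2 pi) = {np.log(2.0 * np.pi):.6f}")
print(f"physics sum   = {physics_sum:.6f}, log(pi e)  = {1.0 + np.log(np.pi):.6f}")
```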

Entropy versus variance bounds
The Gaussian or normal probability distribution plays an important role in the relationship between variance and entropy: it is a problem of the calculus of variations to show that this distribution maximizes entropy for a given variance, and at the same time minimizes the variance for a given entropy. In fact, for any probability density function $$\phi$$ on the real line, Shannon's entropy inequality specifies:
 * $$H(\phi) \le \log \sqrt {2\pi eV(\phi)},$$

where H is the Shannon entropy and V is the variance, an inequality that is saturated only in the case of a normal distribution.
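For a quick feel for this inequality, one can compare the entropy of a few standard densities with the Gaussian cap for their variance. The sketch below assumes natural logarithms and uses the well-known closed-form entropies and variances of the normal, uniform, and Laplace distributions. The parameter values and names are arbitrary choices made for illustration.

```python
import math

def gaussian_entropy_cap(variance):
    """log sqrt(2 pi e V): the largest entropy possible at variance V (nats)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

sigma, width, scale = 2.0, 3.0, 1.5

# name -> (entropy in nats, variance), from standard closed forms.
cases = {
    "normal": (0.5 * math.log(2.0 * math.pi * math.e * sigma**2), sigma**2),
    "uniform": (math.log(width), width**2 / 12.0),
    "laplace": (1.0 + math.log(2.0 * scale), 2.0 * scale**2),
}

# Gap between the cap and the actual entropy; zero only for the normal law.
gaps = {name: gaussian_entropy_cap(var) - ent for name, (ent, var) in cases.items()}

for name, gap in gaps.items():
    print(f"{name:8s} gap = {gap:.6f}")
```

The gap does not depend on the width or scale chosen, since rescaling a density shifts its entropy and the logarithm of its standard deviation by the same amount.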

Moreover, the Fourier transform of a Gaussian probability amplitude function is also Gaussian, and the absolute squares of both of these are Gaussian, too. This can then be used to derive the usual Robertson variance uncertainty inequality from the above entropic inequality, which shows that the entropic inequality is at least as tight as the variance inequality. That is (for ħ=1), exponentiating the Hirschman inequality and using Shannon's expression above,
 * $$1/2 \le \exp (H(|f|^2)+H(|g|^2)) /(2e\pi) \le \sqrt {V(|f|^2)V(|g|^2)}~.$$
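The chain of inequalities can be evaluated for a non-Gaussian state. The sketch below assumes NumPy, ħ = 1, natural logarithms, and the first Hermite function $\psi_1(x) = \sqrt2\,\pi^{-1/4}x\,e^{-x^2/2}$ as an illustrative example. It relies on the fact that Hermite functions are eigenfunctions of the unitary Fourier transform, so $|f|^2$ and $|g|^2$ have the same density. The helper name is hypothetical.

```python
import numpy as np

def shannon_entropy(density, step):
    """Riemann-sum estimate of -integral(p log p), natural log (nats)."""
    p = density[density > 0]          # the density vanishes at x = 0
    return float(-np.sum(p * np.log(p)) * step)

grid = np.linspace(-20.0, 20.0, 400_001)
step = grid[1] - grid[0]

# |psi_1(x)|^2; the transform has the same modulus, so one density serves both.
density = 2.0 / np.sqrt(np.pi) * grid**2 * np.exp(-grid**2)

entropy = shannon_entropy(density, step)
mean = float(np.sum(grid * density) * step)
variance = float(np.sum((grid - mean) ** 2 * density) * step)

lower = 0.5
middle = float(np.exp(2.0 * entropy) / (2.0 * np.e * np.pi))  # exp(H + H) / (2 e pi)
upper = variance                                              # sqrt(V * V)

print(f"{lower:.4f} <= {middle:.4f} <= {upper:.4f}")
```

For this state the entropic quantity in the middle sits strictly between 1/2 and the standard-deviation product of 3/2.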

Hirschman explained that entropy—his version of entropy was the negative of Shannon's—is a "measure of the concentration of [a probability distribution] in a set of small measure." Thus a low or large negative Shannon entropy means that a considerable mass of the probability distribution is confined to a set of small measure.

Note that this set of small measure need not be contiguous; a probability distribution can have several concentrations of mass in intervals of small measure, and the entropy may still be low no matter how widely scattered those intervals are. This is not the case with the variance: variance measures the concentration of mass about the mean of the distribution, and a low variance means that a considerable mass of the probability distribution is concentrated in a contiguous interval of small measure.

To formalize this distinction, we say that two probability density functions $$\phi_1$$ and $$\phi_2$$ are equimeasurable if


 * $$\forall \delta > 0,\,\mu\{x\in\mathbb R \mid \phi_1(x)\ge\delta\} = \mu\{x\in\mathbb R \mid \phi_2(x)\ge\delta\},$$

where $\mu$ is the Lebesgue measure. Any two equimeasurable probability density functions have the same Shannon entropy, and in fact the same Rényi entropy, of any order. The same is not true of variance, however. Any probability density function has a radially decreasing equimeasurable "rearrangement" whose variance is less (up to translation) than that of any other rearrangement of the function; and there exist rearrangements of arbitrarily high variance (all having the same entropy).
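A simple instance of this distinction uses densities that are uniform on unions of intervals. The sketch below assumes natural logarithms. The helper names and the particular intervals are hypothetical choices for illustration: a single unit-width box is compared with two half-width boxes placed symmetrically at a distance `offset` from the origin, which is an equimeasurable rearrangement of the same density.

```python
import math

def uniform_stats(intervals):
    """Entropy (nats) and variance of the uniform density on disjoint intervals."""
    total_length = sum(b - a for a, b in intervals)
    height = 1.0 / total_length
    entropy = math.log(total_length)          # equals -log(height)
    mean = height * sum((b**2 - a**2) / 2.0 for a, b in intervals)
    second_moment = height * sum((b**3 - a**3) / 3.0 for a, b in intervals)
    return entropy, second_moment - mean**2

def split_box(offset):
    """Two boxes of width 1/2 centred at -offset and +offset (offset >= 1/4)."""
    return [(-offset - 0.25, -offset + 0.25), (offset - 0.25, offset + 0.25)]

single_entropy, single_variance = uniform_stats([(-0.5, 0.5)])
near_entropy, near_variance = uniform_stats(split_box(10.0))
far_entropy, far_variance = uniform_stats(split_box(100.0))

print(single_entropy, single_variance)
print(near_entropy, near_variance)
print(far_entropy, far_variance)
```

All three densities have height 1 on a set of total length 1, so their entropies agree, while the variance grows like the square of the separation.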