Generalized relative entropy

Generalized relative entropy ($$\epsilon$$-relative entropy) is a measure of dissimilarity between two quantum states. It is a "one-shot" analogue of quantum relative entropy and shares many properties of the latter quantity.

In the study of quantum information theory, we typically assume that information processing tasks are repeated multiple times, independently. The corresponding information-theoretic notions are therefore defined in the asymptotic limit. The quintessential entropy measure, von Neumann entropy, is one such notion. In contrast, the study of one-shot quantum information theory is concerned with information processing when a task is conducted only once. New entropic measures emerge in this scenario, as traditional notions cease to give a precise characterization of resource requirements. $$\epsilon$$-relative entropy is one such particularly interesting measure.

In the asymptotic scenario, relative entropy acts as a parent quantity for other measures besides being an important measure itself. Similarly, $$\epsilon$$-relative entropy functions as a parent quantity for other measures in the one-shot scenario.

Definition
To motivate the definition of the $$\epsilon$$-relative entropy $$D^{\epsilon}(\rho||\sigma)$$, consider the information processing task of hypothesis testing. In hypothesis testing, we wish to devise a strategy to distinguish between two density operators $$\rho$$ and $$\sigma$$. A strategy is a POVM with elements $$Q$$ and $$I - Q$$, where the outcome $$Q$$ is read as the guess "$$\rho$$". The probability that the strategy produces a correct guess on input $$\rho$$ is given by $$\operatorname{Tr}(\rho Q)$$, and the probability that it produces a wrong guess on input $$\sigma$$ is given by $$\operatorname{Tr}(\sigma Q)$$. The $$\epsilon$$-relative entropy captures the minimum probability of error when the state is $$\sigma$$, given that the success probability for $$\rho$$ is at least $$\epsilon$$.

For $$\epsilon \in (0,1)$$, the $$\epsilon$$-relative entropy between two quantum states $$\rho$$ and $$\sigma$$ is defined as
 * $$ D^{\epsilon}(\rho||\sigma) = - \log \left( \frac{1}{\epsilon} \min \{ \langle Q, \sigma \rangle \mid 0 \leq Q \leq I \text{ and } \langle Q ,\rho\rangle \geq \epsilon\} \right) ~.$$

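From the definition, it is clear that $$D^{\epsilon}(\rho||\sigma)\geq 0$$: the operator $$Q = \epsilon I$$ satisfies the constraint and gives $$\langle Q, \sigma\rangle = \epsilon$$, so the minimum is at most $$\epsilon$$. This inequality is saturated if and only if $$\rho = \sigma$$, as shown below.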
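For general states the minimization in the definition is a semidefinite program. As a minimal sketch, assume that $$\rho$$ and $$\sigma$$ commute, so both can be represented by their lists of eigenvalues in a common eigenbasis. Under that assumption the optimal $$Q$$ can be taken diagonal, and the problem reduces to a fractional knapsack solved greedily (a Neyman-Pearson test). The function name `eps_relative_entropy` and the example states are illustrative and do not come from the source; logarithms are taken to base 2.

```python
import math

def eps_relative_entropy(rho, sigma, eps):
    """D^eps(rho||sigma) for commuting states given as eigenvalue lists.

    Assumption: rho and sigma are diagonal in the same basis, so the
    optimal test is diagonal with weights q_i in [0, 1].
    """
    # Accept outcomes in order of increasing likelihood ratio sigma_i / rho_i.
    order = sorted(range(len(rho)),
                   key=lambda i: sigma[i] / rho[i] if rho[i] > 0 else math.inf)
    need = eps   # remaining weight needed so that <Q, rho> >= eps
    beta = 0.0   # accumulated <Q, sigma>
    for i in order:
        if need <= 0 or rho[i] <= 0:
            break
        take = min(1.0, need / rho[i])  # fraction q_i of outcome i accepted
        beta += take * sigma[i]
        need -= take * rho[i]
    return math.inf if beta == 0 else -math.log2(beta / eps)

# Hypothetical example states (eigenvalue lists).
rho = [0.5, 0.5]
sigma = [0.25, 0.75]
d = eps_relative_entropy(rho, sigma, 0.75)
d_same = eps_relative_entropy(rho, rho, 0.75)
print(d, d_same)
```

For this example the optimal test is $$Q = \operatorname{diag}(1, 1/2)$$, which gives $$\langle Q, \sigma\rangle = 5/8$$ and $$D^{\epsilon}(\rho||\sigma) = \log_2(6/5) \approx 0.263$$, while identical states give zero.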

Relationship to the trace distance
Suppose the trace distance between two density operators $$\rho$$ and $$\sigma$$ is
 * $$||\rho - \sigma||_1 = \delta ~.$$

For $$0< \epsilon< 1$$, it holds that
 * a) $$\log \frac{\epsilon}{\epsilon - (1-\epsilon)\delta} \quad \leq \quad D^{\epsilon}(\rho||\sigma) \quad \leq \quad \log \frac{\epsilon}{\epsilon - \delta} ~.$$

In particular, this implies the following analogue of the Pinsker inequality


 * b) $$\frac{1-\epsilon}{\epsilon}||\rho-\sigma||_1 \quad \leq \quad D^{\epsilon}(\rho||\sigma) ~.$$

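Furthermore, inequality a) implies that for any $$\epsilon \in (0,1)$$, $$D^{\epsilon}(\rho||\sigma) = 0$$ if and only if $$\rho = \sigma$$, inheriting this property from the trace distance: if $$\delta = 0$$ the upper bound forces $$D^{\epsilon}(\rho||\sigma) = 0$$, and if $$D^{\epsilon}(\rho||\sigma) = 0$$ the lower bound forces $$\delta = 0$$. This result and its proof can be found in Dupuis et al.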
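As an illustrative check, consider the commuting states $$\rho = \operatorname{diag}(1/2, 1/2)$$ and $$\sigma = \operatorname{diag}(1/4, 3/4)$$ with $$\epsilon = 3/4$$. These states are chosen for this example only and are not taken from the source. Logarithms are to base 2, and $$\delta$$ is computed as $$\max_{0 \leq Q \leq I} \operatorname{Tr}(Q(\rho - \sigma))$$. The optimal test is $$Q = \operatorname{diag}(1, 1/2)$$, so that
 * $$\delta = \tfrac{1}{4}, \qquad \min \langle Q, \sigma \rangle = \tfrac{5}{8}, \qquad D^{\epsilon}(\rho||\sigma) = \log \tfrac{6}{5} \approx 0.263 ~.$$

Both inequalities can then be verified directly:
 * $$\log \tfrac{12}{11} \approx 0.126 \quad \leq \quad 0.263 \quad \leq \quad \log \tfrac{3}{2} \approx 0.585 ~, \qquad \tfrac{1-\epsilon}{\epsilon}\,\delta = \tfrac{1}{12} \approx 0.083 \quad \leq \quad 0.263 ~.$$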

Proof of inequality a)
Upper bound: With the normalization in which the trace distance takes values in $$[0,1]$$, it can be written as

 * $$ ||\rho - \sigma||_1 = \max_{0\leq Q \leq I} \operatorname{Tr}(Q(\rho - \sigma)) ~.$$

This maximum is achieved when $$Q$$ is the orthogonal projector onto the positive eigenspace of $$\rho - \sigma$$. For any POVM element $$Q$$ we have
 * $$\operatorname{Tr}(Q(\rho - \sigma)) \leq \delta$$

so that if $$\operatorname{Tr}(Q\rho) \geq \epsilon$$, we have
 * $$\operatorname{Tr}(Q\sigma) ~\geq~ \operatorname{Tr}(Q\rho) - \delta ~\geq~ \epsilon - \delta~.$$

From the definition of the $$\epsilon$$-relative entropy, we get
 * $$2^{- D^{\epsilon}(\rho||\sigma)}\geq \frac{\epsilon - \delta}{\epsilon} ~.$$

Lower bound: Let $$Q$$ be the orthogonal projection onto the positive eigenspace of $$\rho - \sigma$$, and let $$\bar Q$$ be the following convex combination of $$I$$ and $$Q$$:
 * $$\bar Q = (\epsilon - \mu)I + (1 - \epsilon + \mu)Q$$

where $$\mu = \frac{(1-\epsilon)\operatorname{Tr}(Q\rho)}{1 - \operatorname{Tr}(Q\rho)} ~.$$

This means
 * $$\mu = (1-\epsilon + \mu)\operatorname{Tr}(Q\rho)$$

and thus
 * $$ \operatorname{Tr}(\bar Q \rho) ~=~ (\epsilon - \mu) + (1-\epsilon + \mu)\operatorname{Tr}(Q\rho) ~=~ \epsilon ~.$$

Moreover,
 * $$\operatorname{Tr}(\bar Q \sigma) ~=~ \epsilon - \mu + (1-\epsilon + \mu)\operatorname{Tr}(Q\sigma) ~.$$

Using $$\mu = (1-\epsilon + \mu)\operatorname{Tr}(Q\rho)$$, our choice of $$Q$$, and finally the definition of $$\mu$$, we can re-write this as
 * $$\operatorname{Tr}(\bar Q \sigma) ~=~ \epsilon - (1 - \epsilon + \mu)\operatorname{Tr}(Q\rho) + (1 - \epsilon + \mu)\operatorname{Tr}(Q\sigma)$$
 * $$ ~=~ \epsilon - \frac{(1-\epsilon)\delta}{1-\operatorname{Tr} (Q\rho)} ~\leq~ \epsilon - (1-\epsilon)\delta ~.$$

Hence
 * $$D^{\epsilon}(\rho||\sigma) \geq \log \frac{\epsilon}{\epsilon - (1-\epsilon)\delta} ~.$$
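The construction of $$\bar Q$$ can be checked numerically. The following is a minimal sketch for commuting states represented as eigenvalue lists. The example states and $$\epsilon$$ are hypothetical, chosen so that $$\epsilon - \mu \geq 0$$ and $$\bar Q$$ is a valid POVM element. The variable names are illustrative.

```python
def dot(a, b):
    """Tr(A B) for diagonal operators given as lists of diagonal entries."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical commuting example (diagonal entries in a shared eigenbasis).
rho = [0.5, 0.5]
sigma = [0.25, 0.75]
eps = 0.75

# Q: projector onto the positive eigenspace of rho - sigma.
Q = [1.0 if r > s else 0.0 for r, s in zip(rho, sigma)]
tr_Q_rho = dot(Q, rho)
tr_Q_sigma = dot(Q, sigma)
delta = tr_Q_rho - tr_Q_sigma  # equals max_Q Tr(Q (rho - sigma))

# mu and Q_bar as defined in the proof of the lower bound.
mu = (1 - eps) * tr_Q_rho / (1 - tr_Q_rho)
Q_bar = [(eps - mu) + (1 - eps + mu) * q for q in Q]

tr_Qbar_rho = dot(Q_bar, rho)      # should equal eps
tr_Qbar_sigma = dot(Q_bar, sigma)  # should equal eps - (1-eps)*delta/(1-Tr(Q rho))
closed_form = eps - (1 - eps) * delta / (1 - tr_Q_rho)
upper = eps - (1 - eps) * delta    # bound used in the final step

print(Q_bar, tr_Qbar_rho, tr_Qbar_sigma, closed_form, upper)
```

In this example $$\mu = 1/4$$ and $$\bar Q = \operatorname{diag}(1, 1/2)$$, so $$\operatorname{Tr}(\bar Q\rho) = 3/4 = \epsilon$$ and $$\operatorname{Tr}(\bar Q\sigma) = 5/8$$, which is below the bound $$\epsilon - (1-\epsilon)\delta = 11/16$$.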

Proof of inequality b)
To derive this Pinsker-like inequality, observe that
 * $$\log \frac{\epsilon}{\epsilon - (1-\epsilon)\delta} ~=~ -\log\left( 1 - \frac{(1-\epsilon)\delta}{\epsilon} \right) ~\geq~ \delta \frac{1-\epsilon}{\epsilon} ~.$$
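The last step uses an elementary bound on the logarithm. As a short justification, assume base-2 logarithms (as in the proof of inequality a)) and write $$x = \frac{(1-\epsilon)\delta}{\epsilon}$$ with $$0 \leq x < 1$$:
 * $$-\log_2(1-x) ~=~ \frac{1}{\ln 2}\sum_{k\geq 1}\frac{x^k}{k} ~\geq~ \frac{x}{\ln 2} ~\geq~ x ~.$$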

Alternative proof of the Data Processing inequality
A fundamental property of von Neumann entropy is strong subadditivity. Let $$S(\sigma)$$ denote the von Neumann entropy of the quantum state $$\sigma$$, and let $$\rho_{ABC}$$ be a quantum state on the tensor product Hilbert space $$\mathcal{H}_A\otimes \mathcal{H}_B \otimes \mathcal{H}_C$$. Strong subadditivity states that
 * $$S(\rho_{ABC}) + S(\rho_B) \leq S(\rho_{AB}) + S(\rho_{BC})$$

where $$\rho_{AB},\rho_{BC},\rho_{B}$$ refer to the reduced density matrices on the spaces indicated by the subscripts. When re-written in terms of mutual information, this inequality has an intuitive interpretation; it states that the information content in a system cannot increase by the action of a local quantum operation on that system. In this form, it is better known as the data processing inequality, and is equivalent to the monotonicity of relative entropy under quantum operations:
 * $$S(\rho||\sigma) - S(\mathcal{E}(\rho)||\mathcal{E}(\sigma)) \geq 0$$

for every CPTP map $$\mathcal{E}$$, where $$S(\omega||\tau)$$ denotes the relative entropy of the quantum states $$\omega, \tau$$.

It is readily seen that $$\epsilon$$-relative entropy also obeys monotonicity under quantum operations:
 * $$D^{\epsilon}(\rho||\sigma) \geq D^{\epsilon}(\mathcal{E}(\rho)||\mathcal{E}(\sigma))$$,

for any CPTP map $$\mathcal{E}$$. To see this, suppose we have a POVM $$(R,I-R)$$ to distinguish between $$\mathcal{E}(\rho)$$ and $$\mathcal{E}(\sigma)$$ such that $$\langle R, \mathcal{E}(\rho)\rangle = \langle \mathcal{E}^{\dagger}(R), \rho \rangle \geq \epsilon$$. We construct a new POVM $$(\mathcal{E}^{\dagger}(R), I - \mathcal{E}^{\dagger}(R))$$ to distinguish between $$\rho$$ and $$\sigma$$. Since the adjoint of any CPTP map is positive and unital, this is a valid POVM, and by the previous identity it satisfies the constraint in the definition of $$D^{\epsilon}(\rho||\sigma)$$. Therefore $$\langle R, \mathcal{E}(\sigma)\rangle = \langle \mathcal{E}^{\dagger}(R), \sigma\rangle \geq \langle Q,\sigma\rangle$$, where $$(Q, I-Q)$$ is the POVM that achieves $$D^{\epsilon}(\rho||\sigma)$$. Minimizing the left-hand side over all admissible $$R$$ yields the claimed inequality. Not only is this interesting in itself, but it also gives us the following alternative method to prove the data processing inequality.
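The argument can be illustrated in the classical special case, where states are probability vectors and a CPTP map is a column-stochastic matrix. This is a sketch under those assumptions: the channel `W`, the example states, and all function names are hypothetical and not part of the source.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def optimal_test(rho, sigma, eps):
    """Greedy (Neyman-Pearson) test weights q_i in [0, 1] for diagonal states."""
    order = sorted(range(len(rho)),
                   key=lambda i: sigma[i] / rho[i] if rho[i] > 0 else math.inf)
    q, need = [0.0] * len(rho), eps
    for i in order:
        if need <= 0 or rho[i] <= 0:
            break
        q[i] = min(1.0, need / rho[i])
        need -= q[i] * rho[i]
    return q

def d_eps(rho, sigma, eps):
    beta = dot(optimal_test(rho, sigma, eps), sigma)
    return math.inf if beta == 0 else -math.log2(beta / eps)

def apply_channel(W, p):
    """E(p)_i = sum_j W[i][j] p_j for a column-stochastic matrix W."""
    return [dot(row, p) for row in W]

def apply_adjoint(W, r):
    """E^dagger(R)_j = sum_i W[i][j] R_i, the pulled-back test."""
    return [sum(W[i][j] * r[i] for i in range(len(W))) for j in range(len(W[0]))]

# Hypothetical example: a binary symmetric channel with flip probability 0.25.
W = [[0.75, 0.25], [0.25, 0.75]]
rho, sigma, eps = [0.9, 0.1], [0.2, 0.8], 0.5
E_rho, E_sigma = apply_channel(W, rho), apply_channel(W, sigma)

R = optimal_test(E_rho, E_sigma, eps)  # best test on the channel outputs
pulled = apply_adjoint(W, R)           # E^dagger(R), a test on the inputs
Q = optimal_test(rho, sigma, eps)      # best test on the inputs

d_before = d_eps(rho, sigma, eps)
d_after = d_eps(E_rho, E_sigma, eps)
print(d_before, d_after)
```

The pulled-back test `pulled` meets the same constraint on $$\rho$$ as `R` does on $$\mathcal{E}(\rho)$$, but it cannot beat the optimal test `Q` on the inputs, which is why `d_before` is at least `d_after`.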

By the quantum analogue of the Stein lemma,
 * $$\lim_{n\rightarrow\infty}\frac{1}{n}D^{\epsilon}(\rho^{\otimes n}||\sigma^{\otimes n})  = \lim_{n\rightarrow\infty}\frac{-1}{n}\log \left( \frac{1}{\epsilon} \min \operatorname{Tr}(\sigma^{\otimes n} Q) \right) $$
 * $$ = S(\rho||\sigma) - \lim_{n\rightarrow\infty}\frac{1}{n}\left( \log\frac{1}{\epsilon} \right) $$
 * $$ = S(\rho||\sigma) ~, $$

where the minimum is taken over $$0\leq Q\leq I$$ such that $$\operatorname{Tr}(Q\rho^{\otimes n})\geq \epsilon ~.$$

Applying the monotonicity of the $$\epsilon$$-relative entropy to the states $$\rho^{\otimes n}$$ and $$\sigma^{\otimes n}$$ with the CPTP map $$\mathcal{E}^{\otimes n}$$, we get
 * $$D^{\epsilon}(\rho^{\otimes n}||\sigma^{\otimes n}) ~\geq~ D^{\epsilon}(\mathcal{E}(\rho)^{\otimes n}||\mathcal{E}(\sigma)^{\otimes n}) ~.$$

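Dividing by $$n$$ on either side and taking the limit as $$n \rightarrow\infty$$, the Stein lemma applied to both sides gives $$S(\rho||\sigma) \geq S(\mathcal{E}(\rho)||\mathcal{E}(\sigma))$$, which is the desired result.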
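The limit in the Stein lemma can be explored numerically in the classical binary case. The sketch below assumes $$\rho = \operatorname{diag}(p, 1-p)$$ and $$\sigma = \operatorname{diag}(q, 1-q)$$, groups the $$2^n$$ outcomes of the $$n$$-fold product by the number $$k$$ of first-symbol occurrences, and runs the greedy test over these groups. The function names and parameter values are illustrative, not from the source, and convergence is only checked loosely.

```python
import math

def rate(p, q, eps, n):
    """(1/n) D^eps(rho^{(x)n} || sigma^{(x)n}) for binary diagonal states."""
    groups = []
    for k in range(n + 1):
        mult = math.comb(n, k)  # number of strings with k first symbols
        w_rho = mult * p**k * (1 - p)**(n - k)
        w_sigma = mult * q**k * (1 - q)**(n - k)
        groups.append((w_rho, w_sigma))
    # All strings in a group share the same likelihood ratio sigma/rho.
    groups.sort(key=lambda g: g[1] / g[0])
    need, beta = eps, 0.0
    for w_rho, w_sigma in groups:
        if need <= 0:
            break
        take = min(1.0, need / w_rho)
        beta += take * w_sigma
        need -= take * w_rho
    return -math.log2(beta / eps) / n

def relative_entropy(p, q):
    """Classical relative entropy (base 2) between (p, 1-p) and (q, 1-q)."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

p, q, eps = 0.8, 0.4, 0.5
target = relative_entropy(p, q)
r10 = rate(p, q, eps, 10)
r200 = rate(p, q, eps, 200)
print(target, r10, r200)
```

For these parameters the normalized $$\epsilon$$-relative entropy at $$n = 200$$ lies much closer to the relative entropy than at $$n = 10$$, consistent with the limit above. Larger $$n$$ would need logarithmic-domain arithmetic to avoid floating-point underflow.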