In probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f-divergence.

Measure theory
To define the Hellinger distance in terms of measure theory, let P and Q denote two probability measures that are absolutely continuous with respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity


 * $$H^2(P,Q) = \frac{1}{2}\displaystyle \int \left(\sqrt{\frac{dP}{d\lambda}} - \sqrt{\frac{dQ}{d\lambda}}\right)^2 d\lambda. $$

Here, dP/dλ and dQ/dλ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as


 * $$H^2(P,Q) = \frac{1}{2}\int \left(\sqrt{dP} - \sqrt{dQ}\right)^2. $$

Probability theory using Lebesgue measure
To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure, so that dP/dλ and dQ/dλ are simply probability density functions. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral


 * $$H^2(P,Q) =\frac{1}{2}\int \left(\sqrt{f(x)} - \sqrt{g(x)}\right)^2 \, dx = 1 - \int \sqrt{f(x) g(x)} \, dx,$$

where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain equals 1.
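As an illustrative sketch (the unit-variance Gaussian densities and the integration grid below are arbitrary choices, not part of the text), the second form can be checked numerically with a simple Riemann sum:

```python
import numpy as np

def hellinger_sq(f, g, grid):
    """Squared Hellinger distance via H^2 = 1 - integral of sqrt(f*g),
    approximated with a Riemann sum on a uniform grid."""
    dx = grid[1] - grid[0]
    return 1.0 - np.sum(np.sqrt(f(grid) * g(grid))) * dx

# Two unit-variance normal densities with means 0 and 1 (arbitrary example).
def normal_pdf(mu):
    return lambda x: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

grid = np.linspace(-12.0, 13.0, 500001)
h2 = hellinger_sq(normal_pdf(0.0), normal_pdf(1.0), grid)
```

For these two densities the integral $$\int \sqrt{f g}\,dx$$ can also be evaluated in closed form, so the Riemann sum serves purely as a sanity check on the identity.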

The Hellinger distance H(P, Q) satisfies the property (derivable from the Cauchy–Schwarz inequality)


 * $$0\le H(P,Q) \le 1.$$

Discrete distributions
For two discrete probability distributions $$P=(p_1, \ldots, p_k)$$ and $$Q=(q_1, \ldots, q_k)$$, their Hellinger distance is defined as



 * $$H(P, Q) = \frac{1}{\sqrt{2}} \; \sqrt{\sum_{i=1}^k (\sqrt{p_i} - \sqrt{q_i})^2},$$

which is directly related to the Euclidean norm of the difference of the square root vectors, i.e.

 * $$H(P, Q) = \frac{1}{\sqrt{2}} \; \bigl\|\sqrt{P} - \sqrt{Q} \bigr\|_2.$$

Also, $$1 - H^2(P,Q) = \sum_{i=1}^k \sqrt{p_i q_i}.$$
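The discrete formula translates directly into code. A short sketch in Python (the example vectors are arbitrary):

```python
import numpy as np

def hellinger(p, q):
    """Discrete Hellinger distance: ||sqrt(p) - sqrt(q)||_2 / sqrt(2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)

p = [0.5, 0.3, 0.2]   # arbitrary example distributions
q = [0.1, 0.4, 0.5]
d = hellinger(p, q)

# Identity from the text: 1 - H^2(P, Q) equals the sum of sqrt(p_i * q_i).
assert abs((1.0 - d**2) - sum((pi * qi) ** 0.5 for pi, qi in zip(p, q))) < 1e-12
```

Identical distributions give distance 0, and distributions with disjoint support give distance 1, matching the bounds stated above.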

Connection with the statistical distance
The Hellinger distance $$H(P,Q)$$ and the total variation distance (or statistical distance) $$\delta(P,Q)$$ are related as follows:



 * $$H^2(P,Q) \leq \delta(P,Q) \leq \sqrt{2}\,H(P,Q).$$

These inequalities follow immediately from the inequalities between the 1-norm and the 2-norm.
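These bounds can be spot-checked numerically. The sketch below draws random pairs of discrete distributions (Dirichlet samples and the dimension 6 are arbitrary choices) and verifies the sandwich inequality on each pair:

```python
import numpy as np

rng = np.random.default_rng(42)

def hellinger(p, q):
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)

def total_variation(p, q):
    # delta(P, Q) = (1/2) * sum_i |p_i - q_i|
    return 0.5 * np.sum(np.abs(p - q))

# Check H^2 <= delta <= sqrt(2) * H on random pairs of distributions.
for _ in range(1000):
    p = rng.dirichlet(np.ones(6))
    q = rng.dirichlet(np.ones(6))
    h = hellinger(p, q)
    tv = total_variation(p, q)
    assert h**2 <= tv + 1e-12
    assert tv <= np.sqrt(2.0) * h + 1e-12
```

A random search is of course no proof; it merely illustrates that both inequalities hold with room to spare on typical pairs.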

Properties
The Hellinger distance forms a bounded metric on the space of probability distributions over a given probability space.

The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa.

Sometimes the factor $$1/2$$ in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two.

The Hellinger distance is related to the Bhattacharyya coefficient $$BC(P,Q)$$, since it can be written as


 * $$H(P,Q) = \sqrt{1 - BC(P,Q)}.$$

Hellinger distances are used in the theory of sequential and asymptotic statistics.

Examples
The squared Hellinger distance between two normal distributions $$\scriptstyle P\,\sim\,\mathcal{N}(\mu_1,\sigma_1^2)$$ and $$\scriptstyle Q\,\sim\,\mathcal{N}(\mu_2,\sigma_2^2)$$ is:

 * $$H^2(P, Q) = 1 - \sqrt{\frac{2\sigma_1\sigma_2}{\sigma_1^2+\sigma_2^2}} \, e^{-\frac{1}{4}\frac{(\mu_1-\mu_2)^2}{\sigma_1^2+\sigma_2^2}}.$$
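The closed form can be cross-checked against direct numerical integration of $$1 - \int \sqrt{f g}\,dx$$; the parameter values below are arbitrary:

```python
import numpy as np

def h2_normal(mu1, s1, mu2, s2):
    """Closed-form squared Hellinger distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    return 1.0 - np.sqrt(2.0 * s1 * s2 / (s1**2 + s2**2)) * np.exp(
        -0.25 * (mu1 - mu2) ** 2 / (s1**2 + s2**2))

def normal_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

# Cross-check: H^2 = 1 - integral of sqrt(f*g), via a Riemann sum on a fine grid.
mu1, s1, mu2, s2 = 0.0, 1.0, 2.0, 1.5   # arbitrary example parameters
x = np.linspace(-30.0, 30.0, 600001)
dx = x[1] - x[0]
numeric = 1.0 - np.sum(np.sqrt(normal_pdf(x, mu1, s1) * normal_pdf(x, mu2, s2))) * dx
```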

The squared Hellinger distance between two multivariate normal distributions $$\scriptstyle P\,\sim\,\mathcal{N}(\mu_1,\Sigma_1)$$ and $$\scriptstyle Q\,\sim\,\mathcal{N}(\mu_2,\Sigma_2)$$ is:

 * $$H^2(P, Q) = 1 - \frac{\det(\Sigma_1)^{1/4} \det(\Sigma_2)^{1/4}}{\det\left(\frac{\Sigma_1 + \Sigma_2}{2}\right)^{1/2}} \exp\left\{-\frac{1}{8}(\mu_1 - \mu_2)^T \left(\frac{\Sigma_1 + \Sigma_2}{2}\right)^{-1} (\mu_1 - \mu_2)\right\}.$$

The squared Hellinger distance between two exponential distributions $$\scriptstyle P\,\sim \,\rm{Exp}(\alpha)$$ and $$\scriptstyle Q\,\sim\,\rm{Exp}(\beta)$$ is:

 * $$H^2(P, Q) = 1 - \frac{2 \sqrt{\alpha \beta}}{\alpha + \beta}.$$

The squared Hellinger distance between two Weibull distributions $$\scriptstyle P\,\sim \,\rm{W}(k,\alpha)$$ and $$\scriptstyle Q\,\sim\,\rm{W}(k,\beta)$$ (where $$ k $$ is a common shape parameter and $$ \alpha\,, \beta $$ are the scale parameters respectively):

 * $$H^2(P, Q) = 1 - \frac{2 (\alpha \beta)^{k/2}}{\alpha^k + \beta^k}.$$

The squared Hellinger distance between two Poisson distributions with rate parameters $$\alpha$$ and $$\beta$$, so that $$\scriptstyle P\,\sim \,\rm{Poisson}(\alpha)$$ and $$\scriptstyle Q\,\sim\,\rm{Poisson}(\beta)$$, is:

 * $$H^2(P,Q) = 1 - e^{-\frac{1}{2}(\sqrt{\alpha} - \sqrt{\beta})^2}.$$
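Because the Poisson case is discrete, the closed form can be checked directly against a truncated version of the sum $$1 - \sum_n \sqrt{p_n q_n}$$; the rate values below are arbitrary:

```python
import math

def h2_poisson(alpha, beta):
    """Closed-form squared Hellinger distance between Poisson(alpha) and Poisson(beta)."""
    return 1.0 - math.exp(-0.5 * (math.sqrt(alpha) - math.sqrt(beta)) ** 2)

def poisson_pmf(lam, n):
    return math.exp(-lam) * lam**n / math.factorial(n)

# Direct check: 1 - sum_n sqrt(pmf_alpha(n) * pmf_beta(n)).
# Truncating at 100 terms is ample for these small rates (arbitrary example values).
alpha, beta = 2.0, 5.0
bc = sum(math.sqrt(poisson_pmf(alpha, n) * poisson_pmf(beta, n)) for n in range(100))
direct = 1.0 - bc
```

The agreement is exact up to the (negligible) truncation error, since the sum $$\sum_n \sqrt{p_n q_n}$$ telescopes analytically to $$e^{-\frac{1}{2}(\sqrt{\alpha}-\sqrt{\beta})^2}$$.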

The squared Hellinger distance between two Beta distributions $$\scriptstyle P\,\sim\,\text{Beta}(a_1,b_1)$$ and $$\scriptstyle Q\,\sim\,\text{Beta}(a_2, b_2)$$ is:

 * $$H^2(P,Q) = 1 - \frac{B\left(\frac{a_1+a_2}{2},\frac{b_1+b_2}{2}\right)}{\sqrt{B(a_1,b_1)\,B(a_2,b_2)}},$$

where $$B$$ is the Beta function.