Hartley function

The Hartley function is a measure of uncertainty, introduced by Ralph Hartley in 1928. If an element is picked uniformly at random from a finite set A, the information revealed once the outcome is known is given by the Hartley function
 * $$ H_0(A) := \log_b \vert A \vert ,$$

where $|A|$ denotes the cardinality of A and b is the base of the logarithm.

If the base of the logarithm is 2, then the unit of uncertainty is the shannon (more commonly known as the bit). If it is the natural logarithm, then the unit is the nat. Hartley used a base-ten logarithm, and with this base, the unit of information is called the hartley (also known as the ban or dit) in his honor. The Hartley function itself is also known as the Hartley entropy or max-entropy.
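The choice of base only rescales the value, which a short computation makes concrete. The following is a minimal Python sketch; the function name `hartley` and its parameters are illustrative choices, not taken from any library.

```python
import math

def hartley(n: int, base: float = 2.0) -> float:
    """Hartley function log_base(n) of a finite set with n elements.

    `n` is the cardinality |A|; `base` selects the unit
    (2 -> shannons/bits, e -> nats, 10 -> hartleys).
    """
    if n < 1:
        raise ValueError("the set must be non-empty")
    return math.log(n) / math.log(base)

# Uncertainty of one roll of a fair six-sided die, in three units.
die_shannons = hartley(6, 2)
die_nats = hartley(6, math.e)
die_hartleys = hartley(6, 10)
print(die_shannons, die_nats, die_hartleys)
```

The three values describe the same uncertainty and differ only by the constant factors that convert between logarithm bases.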

Hartley function, Shannon entropy, and Rényi entropy
The Hartley function coincides with the Shannon entropy (as well as with the Rényi entropies of all orders) in the case of a uniform probability distribution. It is the special case of the Rényi entropy of order 0, since:
 * $$H_0(X) = \frac 1 {1-0} \log \sum_{i=1}^{|\mathcal{X}|} p_i^0 = \log |\mathcal{X}|.$$

But it can also be viewed as a primitive construction, since, as emphasized by Kolmogorov and Rényi, the Hartley function can be defined without introducing any notions of probability (see Uncertainty and information by George J. Klir, p. 423).
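The relationship to the Shannon and Rényi entropies described in this section can be checked numerically. The sketch below is illustrative only: `renyi_entropy` is a hypothetical helper, it treats order 1 as the Shannon limit, and it assumes that zero-probability outcomes are dropped, so that the order-0 value is the logarithm of the number of outcomes with positive probability.

```python
import math

def renyi_entropy(p, alpha, base=2.0):
    """Rényi entropy of order alpha >= 0 for a probability vector p."""
    support = [q for q in p if q > 0]  # assumption: ignore impossible outcomes
    if alpha == 1:
        # Order 1 is defined as a limit, which equals the Shannon entropy.
        return -sum(q * math.log(q) for q in support) / math.log(base)
    total = sum(q ** alpha for q in support)
    return math.log(total) / ((1 - alpha) * math.log(base))

uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]

for alpha in (0, 1, 2):
    print(alpha, renyi_entropy(uniform, alpha), renyi_entropy(skewed, alpha))
```

For the uniform distribution every order gives the same value, log2(4) = 2 bits. For the skewed distribution only order 0 still returns 2 bits, because it depends on the size of the support and not on the probabilities themselves.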

Characterization of the Hartley function
The Hartley function only depends on the number of elements in a set, and hence can be viewed as a function on natural numbers. Rényi showed that the Hartley function in base 2 is the only function mapping natural numbers to real numbers that satisfies


 * 1) $$H(mn) = H(m)+H(n)$$ (additivity)
 * 2) $$H(m) \leq H(m+1)$$ (monotonicity)
 * 3) $$H(2)=1$$ (normalization)

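Condition 1 says that the uncertainty of the Cartesian product of two finite sets A and B is the sum of the uncertainties of A and B. Condition 2 says that a larger set has at least as much uncertainty as a smaller one. Condition 3 fixes the unit: a set with two elements has exactly one unit (one bit) of uncertainty.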
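The three conditions can be spot-checked on a finite range of integers. The sketch below uses a hypothetical helper, `satisfies_conditions`, with an arbitrary range limit and floating-point tolerance; passing such a check is not a proof, but it shows how other candidate functions fail at least one condition.

```python
import math

def satisfies_conditions(H, limit=50, tol=1e-9):
    """Check additivity, monotonicity and normalization for 1 <= m, n < limit."""
    for m in range(1, limit):
        # Condition 2: H(m) <= H(m + 1)
        if H(m) > H(m + 1) + tol:
            return False
        for n in range(1, limit):
            # Condition 1: H(mn) = H(m) + H(n)
            if abs(H(m * n) - (H(m) + H(n))) > tol:
                return False
    # Condition 3: H(2) = 1
    return abs(H(2) - 1.0) <= tol

print(satisfies_conditions(math.log2))            # base-2 logarithm
print(satisfies_conditions(math.log))             # natural log: H(2) != 1
print(satisfies_conditions(lambda n: n - 1.0))    # monotone, but not additive
```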

Derivation of the Hartley function
We want to show that the Hartley function, $\log_2(n)$, is the only function mapping natural numbers to real numbers that satisfies
 * 1) $$H(mn) = H(m)+H(n)\,$$ (additivity)
 * 2) $$H(m) \leq H(m+1)\,$$ (monotonicity)
 * 3) $$H(2)=1\,$$ (normalization)

Let f be a function on positive integers that satisfies the above three properties. Applying the additive property repeatedly shows that for any positive integers n and k,
 * $$f(n^k) = kf(n).\,$$

Let a, b, and t be any positive integers with a ≥ 2. There is a unique non-negative integer s determined by
 * $$a^s \leq b^t < a^{s+1}. \qquad(1)$$

Therefore,
 * $$s \log_2 a\leq t \log_2 b \leq (s+1) \log_2 a \, $$

and
 * $$\frac{s}{t} \leq \frac{\log_2 b}{\log_2 a} \leq \frac{s+1}{t}.$$
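The bracketing of the logarithm ratio by s/t and (s+1)/t can be illustrated with exact integer arithmetic. The following sketch is illustrative only: `sandwich_exponent` is a hypothetical helper that returns the largest s with a^s ≤ b^t, assuming a ≥ 2.

```python
import math

def sandwich_exponent(a: int, b: int, t: int) -> int:
    """Largest non-negative integer s with a**s <= b**t (requires a >= 2)."""
    if a < 2 or b < 1 or t < 1:
        raise ValueError("need a >= 2, b >= 1 and t >= 1")
    target = b ** t          # exact, since Python integers are unbounded
    s, power = 0, a          # invariant: power == a ** (s + 1)
    while power <= target:
        s += 1
        power *= a
    return s

a, b = 2, 3
ratio = math.log2(b) / math.log2(a)
for t in (1, 10, 100, 1000):
    s = sandwich_exponent(a, b, t)
    # s/t and (s+1)/t enclose the ratio and are exactly 1/t apart.
    print(t, s / t, ratio, (s + 1) / t)
```

As t grows, the two bounds close in on the ratio of the logarithms, which is the mechanism the rest of the derivation relies on.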

On the other hand, applying monotonicity to (1),
 * $$f(a^s) \leq f(b^t) \leq f(a^{s+1}). \, $$

Using the identity $f(n^k) = kf(n)$, one gets
 * $$s f(a) \leq t f(b) \leq (s+1) f(a),\,$$

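and, since monotonicity and normalization give f(a) ≥ f(2) = 1 > 0 whenever a ≥ 2, dividing by t f(a) yields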
 * $$\frac{s}{t} \leq \frac{f(b)}{f(a)} \leq \frac{s+1}{t}.$$

Hence,
 * $$\left\vert \frac{f(b)}{f(a)} - \frac{\log_2(b)}{\log_2(a)} \right\vert \leq \frac{1}{t}.$$

Since t can be arbitrarily large, the difference on the left-hand side of the above inequality must be zero, so
 * $$\frac{f(b)}{f(a)} = \frac{\log_2(b)}{\log_2(a)}.$$

So,
 * $$f(a) = \mu \log_2(a)\,$$

for some constant μ, which must be equal to 1 by the normalization property.