Van Houtum distribution

In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after Prof. Geert-Jan van Houtum. It is characterized by assigning equal probability to all values in a finite set of possible values, except possibly the smallest and largest elements of that set. Because it is uniform except possibly at its boundaries, the Van Houtum distribution generalizes the discrete uniform distribution and is sometimes called quasi-uniform.

It is often the case that the only available information about a discrete random variable is its first two moments. The Van Houtum distribution can be used to fit a distribution with finite support to these moments.

A simple example of the Van Houtum distribution arises when throwing a loaded die that has been tampered with to land on a 6 twice as often as on a 1. The sample space consists of the values 1, 2, 3, 4, 5 and 6. Each time the die is thrown, the probability of throwing a 2, 3, 4 or 5 is 1/6 each, the probability of throwing a 1 is 1/9, and the probability of throwing a 6 is 2/9.

Probability mass function
A random variable $$U$$ has a Van Houtum $$(a, b, p_a, p_b)$$ distribution if its probability mass function is


 * $$\Pr(U=u) = \begin{cases} p_a & \text{if } u=a \\[8pt] p_b & \text{if } u=b \\[8pt] \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\[8pt] 0 & \text{otherwise} \end{cases}$$
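
As an illustration, the probability mass function can be sketched in Python. The function name `van_houtum_pmf` is hypothetical, and the sketch assumes integer support with $$a < b$$; it uses exact fractions so that the loaded-die example from the introduction can be checked without rounding.

```python
from fractions import Fraction


def van_houtum_pmf(u, a, b, p_a, p_b):
    """Pr(U = u) for a Van Houtum (a, b, p_a, p_b) distribution (assumes a < b)."""
    if u == a:
        return p_a
    if u == b:
        return p_b
    if a < u < b:
        # The remaining mass is spread evenly over the b - a - 1 interior points.
        return (1 - p_a - p_b) / (b - a - 1)
    return 0


# Loaded die from the introduction: a = 1, b = 6, p_a = 1/9, p_b = 2/9.
die = [van_houtum_pmf(u, 1, 6, Fraction(1, 9), Fraction(2, 9)) for u in range(1, 7)]
```

Here `die` holds the probabilities of the faces 1 through 6; the four interior faces each receive (1 - 1/9 - 2/9)/4 = 1/6, and the six probabilities sum to 1.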

Fitting procedure
Suppose a random variable $$X$$ has mean $$\mu$$ and squared coefficient of variation $$c^2$$. Let $$U$$ be a Van Houtum distributed random variable. Then the first two moments of $$U$$ match the first two moments of $$X$$ if $$a$$, $$b$$, $$p_a$$ and $$p_b$$ are chosen such that:



$$\begin{align} a &= \left\lceil \mu - \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rceil \\[8pt] b &= \left\lfloor \mu + \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rfloor \\[8pt] p_b &= \frac{(c^2+1)\mu^2-A-(a^2-A)(2\mu-a-b)/(a-b)}{a^2+b^2-2A} \\[8pt] p_a &= \frac{2\mu-a-b}{a-b}+p_b \\[12pt] \text{where } A & = \frac{2a^2+a+2ab-b+2b^2}{6}. \end{align}$$
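
The formulas above can be transcribed into a short Python sketch. The function name `fit_van_houtum` is hypothetical. The sketch does not check that the resulting $$p_a$$ and $$p_b$$ are valid probabilities, and it does not handle the degenerate case $$a = b$$, where the formulas divide by zero. It accepts exact fractions as well as floats.

```python
import math
from fractions import Fraction


def fit_van_houtum(mu, c2):
    """Return (a, b, p_a, p_b) from the mean mu and squared coefficient of variation c2."""
    # Half of the outer ceiling term, (1/2) * ceil(sqrt(1 + 12 c^2 mu^2)).
    half_width = Fraction(math.ceil(math.sqrt(1 + 12 * c2 * mu**2)), 2)
    a = math.ceil(mu - half_width)
    b = math.floor(mu + half_width)
    # A as defined in the formulas above.
    A = Fraction(2 * a * a + a + 2 * a * b - b + 2 * b * b, 6)
    # Shared term (2 mu - a - b) / (a - b), which equals p_a - p_b.
    d = (2 * mu - a - b) / (a - b)
    p_b = ((c2 + 1) * mu**2 - A - (a * a - A) * d) / (a * a + b * b - 2 * A)
    p_a = d + p_b
    return a, b, p_a, p_b


# The loaded die from the introduction has mean 34/9 and variance 230/81,
# so its squared coefficient of variation is (230/81) / (34/9)^2 = 230/1156.
print(fit_van_houtum(Fraction(34, 9), Fraction(230, 1156)))
# prints (1, 6, Fraction(1, 9), Fraction(2, 9))
```

Fitting to the moments of the loaded die recovers its original parameters, which gives a quick consistency check of the formulas.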

There does not exist a Van Houtum distribution for every combination of $$\mu$$ and $$c^2$$. By using the fact that for any real mean $$\mu$$ the discrete distribution on the integers that has minimal variance is concentrated on the integers $$\lfloor \mu \rfloor$$ and $$\lceil \mu \rceil$$, it is easy to verify that a Van Houtum distribution (or indeed any discrete distribution on the integers) can only be fitted on the first two moments if


 * $$c^2\mu^2 \geq (\mu-\lfloor \mu \rfloor)(1+\lfloor \mu \rfloor-\mu)^2+(\mu-\lfloor \mu \rfloor)^2(1+\lfloor \mu \rfloor-\mu).$$
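
The necessary condition can be checked numerically with a sketch like the following. The function names are hypothetical, and the minimal variance is computed directly from the two-point distribution on $$\lfloor \mu \rfloor$$ and $$\lfloor \mu \rfloor + 1$$ described above. Passing this check does not by itself guarantee that the fitted $$p_a$$ and $$p_b$$ are valid probabilities.

```python
import math
from fractions import Fraction


def min_integer_variance(mu):
    """Variance of the two-point distribution on floor(mu) and floor(mu) + 1 with mean mu."""
    f = mu - math.floor(mu)  # fractional part of the mean
    # Mass 1 - f sits on floor(mu) (deviation -f); mass f sits on floor(mu) + 1 (deviation 1 - f).
    return (1 - f) * f**2 + f * (1 - f) ** 2


def moments_feasible(mu, c2):
    """True if the variance c2 * mu^2 is attainable by some distribution on the integers."""
    return c2 * mu**2 >= min_integer_variance(mu)


# With mean 1/4 the smallest attainable variance on the integers is 3/16.
smallest = min_integer_variance(Fraction(1, 4))
```

For an integer mean the bound is zero, so any non-negative variance passes the check.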