Beta negative binomial distribution

In probability theory, a beta negative binomial distribution is the probability distribution of a discrete random variable $$X$$ equal to the number of failures needed to get $$r$$ successes in a sequence of independent Bernoulli trials. The probability $$p$$ of success on each trial stays constant within any given experiment but varies across different experiments following a beta distribution. Thus the distribution is a compound probability distribution.

This distribution has also been called the inverse Markov-Pólya distribution and the generalized Waring distribution, and is often abbreviated as the BNB distribution. A shifted form of the distribution has been called the beta-Pascal distribution.

If the parameters of the beta distribution are $$\alpha$$ and $$\beta$$, and if

$$X \mid p \sim \mathrm{NB}(r,p),$$ where

$$p \sim \textrm{B}(\alpha,\beta),$$ then the marginal distribution of $$X$$ (i.e. the posterior predictive distribution) is a beta negative binomial distribution:

$$X \sim \mathrm{BNB}(r,\alpha,\beta).$$

In the above, $$\mathrm{NB}(r,p)$$ is the negative binomial distribution and $$\textrm{B}(\alpha,\beta)$$ is the beta distribution.
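
The compound construction can be illustrated with a short simulation. This is a minimal sketch assuming a positive integer $$r$$; the function name `sample_bnb` and the parameter values are illustrative choices, not part of any library.

```python
import random

def sample_bnb(r, alpha, beta, rng):
    """Draw one BNB(r, alpha, beta) variate by compounding (r a positive integer)."""
    # Step 1: the success probability of this experiment is beta distributed.
    p = rng.betavariate(alpha, beta)
    # Step 2: run Bernoulli(p) trials and count failures until r successes occur.
    failures = 0
    successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_bnb(3, 6.0, 2.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```

Each call draws a fresh $$p$$, so $$p$$ is constant within one experiment but varies between experiments, exactly as in the description above.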

Definition and derivation
Denoting by $$f_{X|p}(k|r,q)$$ and $$f_{p}(q|\alpha,\beta)$$ the PMF of the negative binomial distribution and the density of the beta distribution respectively, we obtain the PMF $$f(k|\alpha,\beta,r)$$ of the BNB distribution by marginalization:
 * $$f(k|\alpha,\beta,r) = \int_0^1 f_{X|p}(k|r,q) \cdot f_{p}(q|\alpha,\beta) \mathrm{d} q = \int_0^1 \binom{k+r-1}{k} (1-q)^k q^r \cdot \frac{q^{\alpha-1}(1-q)^{\beta-1}} {\Beta(\alpha,\beta)} \mathrm{d} q =  \frac{1}{\Beta(\alpha,\beta)} \binom{k+r-1}{k} \int_0^1 q^{\alpha+r-1}(1-q)^{\beta+k-1} \mathrm{d} q$$

Noting that the integral evaluates to:
 * $$ \int_0^1 q^{\alpha+r-1}(1-q)^{\beta+k-1} \mathrm{d} q = \frac{\Gamma(\alpha+r)\Gamma(\beta+k)}{\Gamma(\alpha+\beta+k+r)}$$

we can arrive at the following formulas by relatively simple manipulations.

If $$r$$ is an integer, then the PMF can be written in terms of the beta function:
 * $$f(k|\alpha,\beta,r)=\binom{r+k-1}k\frac{\Beta(\alpha+r,\beta+k)}{\Beta(\alpha,\beta)}$$.

More generally, the PMF can be written
 * $$f(k|\alpha,\beta,r)=\frac{\Gamma(r+k)}{k!\;\Gamma(r)}\frac{\Beta(\alpha+r,\beta+k)}{\Beta(\alpha,\beta)}$$

or
 * $$f(k|\alpha,\beta,r)=\frac{\Beta(r+k,\alpha+\beta)}{\Beta(r,\alpha)}\frac{\Gamma(k+\beta)}{k!\;\Gamma(\beta)}$$.

PMF expressed with Gamma
Using the properties of the Beta function, the PMF with integer $$r$$ can be rewritten as:
 * $$f(k|\alpha,\beta,r)=\binom{r+k-1}k\frac{\Gamma(\alpha+r)\Gamma(\beta+k)\Gamma(\alpha+\beta)}{\Gamma(\alpha+r+\beta+k)\Gamma(\alpha)\Gamma(\beta)}$$.

More generally, the PMF can be written as
 * $$f(k|\alpha,\beta,r)=\frac{\Gamma(r+k)}{k!\;\Gamma(r)}\frac{\Gamma(\alpha+r)\Gamma(\beta+k)\Gamma(\alpha+\beta)}{\Gamma(\alpha+r+\beta+k)\Gamma(\alpha)\Gamma(\beta)}$$.
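
The Gamma-function form lends itself to numerical evaluation on the log scale, which avoids overflow for large arguments. The following is a minimal sketch using only the Python standard library; the name `bnb_pmf` and the parameter values are assumptions made for illustration.

```python
from math import lgamma, exp

def bnb_pmf(k, r, alpha, beta):
    """PMF of BNB(r, alpha, beta) at integer k >= 0, via log-gamma functions."""
    log_f = (
        lgamma(r + k) - lgamma(k + 1) - lgamma(r)            # Gamma(r+k) / (k! Gamma(r))
        + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
        - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta)
    )
    return exp(log_f)

# Sanity check: the probabilities should sum to (approximately) one.
total = sum(bnb_pmf(k, 2.5, 4.0, 3.0) for k in range(20000))
```

Because only `lgamma` is used, the same function covers non-integer $$r$$.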

PMF expressed with the rising Pochhammer symbol
The PMF is often also presented in terms of the rising Pochhammer symbol $$x^{(n)}=\Gamma(x+n)/\Gamma(x)$$ for integer $$r$$:
 * $$f(k|\alpha,\beta,r)=\frac{r^{(k)}\alpha^{(r)}\beta^{(k)}}{k!(\alpha+\beta)^{(r+k)}}$$

Factorial moments
The $$k$$-th factorial moment of a beta negative binomial random variable $$X$$ is defined for $$k < \alpha$$ and in this case is equal to

 * $$\operatorname{E}\bigl[(X)_k\bigr] = \frac{\Gamma(r+k)}{\Gamma(r)}\frac{\Gamma(\beta+k)}{\Gamma(\beta)}\frac{\Gamma(\alpha-k)}{\Gamma(\alpha)}.$$
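
As an illustration, the first two factorial moments give the mean and variance. This is a minimal sketch with a hypothetical helper `bnb_factorial_moment` and illustrative parameter values; it uses the identity $$\operatorname{Var}[X] = \operatorname{E}[X(X-1)] + \operatorname{E}[X] - \operatorname{E}[X]^2$$.

```python
from math import lgamma, exp

def bnb_factorial_moment(k, r, alpha, beta):
    """E[(X)_k] for X ~ BNB(r, alpha, beta); only defined for k < alpha."""
    if k >= alpha:
        raise ValueError("the k-th factorial moment exists only for k < alpha")
    return exp(
        lgamma(r + k) - lgamma(r)
        + lgamma(beta + k) - lgamma(beta)
        + lgamma(alpha - k) - lgamma(alpha)
    )

r, alpha, beta = 3.0, 6.0, 2.0
m1 = bnb_factorial_moment(1, r, alpha, beta)   # E[X] = r*beta/(alpha-1)
m2 = bnb_factorial_moment(2, r, alpha, beta)   # E[X(X-1)]
variance = m2 + m1 - m1 ** 2                   # needs alpha > 2
```

For these values the mean is $$3\cdot 2/5 = 1.2$$ and the variance is $$3.6 + 1.2 - 1.44 = 3.36$$.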

Non-identifiable
The beta negative binomial is non-identifiable, which can be seen by swapping $$r$$ and $$\beta$$ in the density above (or in the characteristic function) and noting that it is unchanged. Estimation therefore requires that a constraint be placed on $$r$$, $$\beta$$ or both.
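
The symmetry can be checked numerically. The sketch below evaluates the Gamma-function form of the PMF for one parameter set and for the same set with $$r$$ and $$\beta$$ exchanged; the helper name `bnb_pmf` and the chosen values are illustrative assumptions.

```python
from math import lgamma, exp

def bnb_pmf(k, r, alpha, beta):
    """PMF of BNB(r, alpha, beta) at integer k >= 0 (Gamma-function form)."""
    return exp(
        lgamma(r + k) - lgamma(k + 1) - lgamma(r)
        + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
        - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta)
    )

original = [bnb_pmf(k, 2.0, 3.5, 5.0) for k in range(10)]   # r = 2, beta = 5
swapped = [bnb_pmf(k, 5.0, 3.5, 2.0) for k in range(10)]    # r = 5, beta = 2
max_diff = max(abs(a - b) for a, b in zip(original, swapped))
```

The two lists agree up to floating-point rounding, so data alone cannot distinguish $$(r,\beta) = (2,5)$$ from $$(r,\beta) = (5,2)$$.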

Relation to other distributions
The beta negative binomial distribution contains the beta geometric distribution as a special case when either $$r=1$$ or $$\beta=1$$. It can therefore approximate the geometric distribution arbitrarily well. It also approximates the negative binomial distribution arbitrarily well when $$\alpha$$ and $$\beta$$ are both large, because the beta distribution of $$p$$ then concentrates around its mean $$\alpha/(\alpha+\beta)$$. It can therefore approximate the Poisson distribution arbitrarily well for large $$\alpha$$, $$\beta$$ and $$r$$.

Heavy tailed
By Stirling's approximation to the beta function, it can be easily shown that for large $$k$$
 * $$f(k|\alpha,\beta,r) \sim \frac{\Gamma(\alpha+r)}{\Gamma(r)\Beta(\alpha,\beta)}\frac{k^{r-1}}{(\beta+k)^{r+\alpha}}$$

which decays like $$k^{-(\alpha+1)}$$ and therefore implies that the beta negative binomial distribution is heavy tailed and that moments of order greater than or equal to $$\alpha$$ do not exist.
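
The quality of the tail approximation can be inspected numerically. This is a minimal sketch that compares the exact log-PMF with the logarithm of the asymptotic expression; the function names and parameter values are illustrative assumptions.

```python
from math import lgamma, exp, log

def bnb_log_pmf(k, r, alpha, beta):
    """Exact log-PMF of BNB(r, alpha, beta) at integer k >= 0."""
    return (
        lgamma(r + k) - lgamma(k + 1) - lgamma(r)
        + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
        - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta)
    )

def bnb_log_tail_approx(k, r, alpha, beta):
    """Log of Gamma(alpha+r) / (Gamma(r) B(alpha,beta)) * k^(r-1) / (beta+k)^(r+alpha)."""
    log_beta_fn = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    return (
        lgamma(alpha + r) - lgamma(r) - log_beta_fn
        + (r - 1) * log(k) - (r + alpha) * log(beta + k)
    )

r, alpha, beta = 2.0, 3.0, 1.5
ks = (10, 100, 1000, 10000)
# Ratio exact / approximate; it should approach 1 as k grows.
ratios = [exp(bnb_log_pmf(k, r, alpha, beta) - bnb_log_tail_approx(k, r, alpha, beta))
          for k in ks]
```

Working with logarithms keeps both quantities representable even when the probabilities themselves are extremely small.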

Beta geometric distribution
The beta geometric distribution is an important special case of the beta negative binomial distribution occurring for $$r=1$$. In this case the PMF simplifies to


 * $$f(k|\alpha,\beta)=\frac{\mathrm{B}(\alpha+1,\beta+k)} {\mathrm{B}(\alpha,\beta)}$$.

This distribution is used in some Buy Till you Die (BTYD) models.

Further, when $$\beta=1$$ the beta geometric reduces to the Yule–Simon distribution. However, it is more common to define the Yule–Simon distribution in terms of a shifted version of the beta geometric. In particular, if $$X \sim \mathrm{BG}(\alpha,1)$$ then $$X+1 \sim \mathrm{YS}(\alpha)$$.

Beta negative binomial as a Pólya urn model
When the three parameters $$r$$, $$\alpha$$ and $$\beta$$ are positive integers, the beta negative binomial can also be motivated by an urn model, specifically a basic Pólya urn model. Consider an urn initially containing $$\alpha$$ red balls (the stopping color) and $$\beta$$ blue balls. At each step, a ball is drawn at random from the urn and replaced, along with one additional ball of the same color. The process is repeated until $$r$$ red balls have been drawn. The number $$X$$ of blue balls drawn is then distributed according to $$\mathrm{BNB}(r, \alpha, \beta)$$. Note that at the end of the experiment the urn always contains the fixed number $$r+\alpha$$ of red balls and the random number $$X+\beta$$ of blue balls.

By the non-identifiability property, $$X$$ can be equivalently generated with the urn initially containing $$\alpha$$ red balls (the stopping color) and $$r$$ blue balls and stopping when $$\beta$$ red balls are observed.
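
The urn scheme is straightforward to simulate. The sketch below is a minimal illustration with a hypothetical function `polya_urn_bnb`; it runs both the original urn and the urn with the roles of $$r$$ and $$\beta$$ exchanged, using illustrative integer parameters.

```python
import random

def polya_urn_bnb(r, alpha, beta, rng):
    """Count blue draws until r red draws, starting with alpha red and beta blue balls."""
    red, blue = alpha, beta
    red_drawn = 0
    blue_drawn = 0
    while red_drawn < r:
        if rng.random() < red / (red + blue):
            red += 1          # the red ball is returned with one extra red ball
            red_drawn += 1
        else:
            blue += 1         # the blue ball is returned with one extra blue ball
            blue_drawn += 1
    return blue_drawn

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
# Urn for BNB(r=3, alpha=6, beta=2).
mean_original = sum(polya_urn_bnb(3, 6, 2, rng) for _ in range(n)) / n
# Same distribution with r and beta exchanged: 6 red, 3 blue, stop after 2 red draws.
mean_swapped = sum(polya_urn_bnb(2, 6, 3, rng) for _ in range(n)) / n
```

Both sample means should be close to $$r\beta/(\alpha-1) = 1.2$$, the value given by the first factorial moment.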