Bennett's inequality

In probability theory, Bennett's inequality provides an upper bound on the probability that the sum of independent random variables deviates from its expected value by more than any specified amount. Bennett's inequality was proved by George Bennett of the University of New South Wales in 1962.

Statement
Let $X_1, \dots, X_n$ be independent random variables with finite variance. Further assume $|X_i - \operatorname{E}X_i| \leq a$ almost surely for all $i$, and define $$ S_n = \sum_{i = 1}^n \left[X_i - \operatorname{E}(X_i)\right] $$ and $$ \sigma^2 = \sum_{i=1}^n \operatorname{E}\left[(X_i-\operatorname{E} X_i)^2\right].$$ Then for any $t \geq 0$,


 * $$\Pr\left( S_n > t \right) \leq \exp\left( - \frac{\sigma^2}{a^2} h\left(\frac{at}{\sigma^2} \right)\right),$$

where $h(u) = (1 + u)\log(1 + u) - u$ and $\log$ denotes the natural logarithm.
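The bound is straightforward to evaluate numerically. The following is a minimal sketch in Python; the function names `bennett_h` and `bennett_bound` are illustrative choices, not part of any library, and the inputs `t`, `sigma2` and `a` correspond to $t$, $\sigma^2$ and $a$ in the statement above.

```python
import math


def bennett_h(u: float) -> float:
    """h(u) = (1 + u) log(1 + u) - u for u >= 0."""
    # log1p(u) computes log(1 + u) accurately even when u is very small.
    return (1.0 + u) * math.log1p(u) - u


def bennett_bound(t: float, sigma2: float, a: float) -> float:
    """Right-hand side of Bennett's inequality, an upper bound on Pr(S_n > t).

    t      -- deviation, t >= 0
    sigma2 -- sum of the variances of the X_i, assumed > 0
    a      -- almost sure bound on |X_i - E X_i|, assumed > 0
    """
    if t < 0 or sigma2 <= 0 or a <= 0:
        raise ValueError("need t >= 0, sigma2 > 0 and a > 0")
    return math.exp(-(sigma2 / a**2) * bennett_h(a * t / sigma2))


if __name__ == "__main__":
    # Illustrative values only: total variance 2, summands bounded by 1.
    print(bennett_bound(t=5.0, sigma2=2.0, a=1.0))
```

At $t = 0$ the bound equals 1, and it decreases as $t$ grows.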

Generalizations and comparisons to other bounds
For a martingale version of Bennett's inequality see Freedman (1975); for an improvement of that version see Fan, Grama and Liu (2012).

Hoeffding's inequality assumes only that the summands are bounded almost surely, while Bennett's inequality also uses their variances and offers some improvement when these are small compared to the almost sure bounds. However, Hoeffding's inequality entails sub-Gaussian tails, whereas in general Bennett's inequality has Poissonian tails.

Bennett's inequality is most similar to the Bernstein inequalities, the first of which also gives concentration in terms of the variance and almost sure bound on the individual terms. Bennett's inequality is stronger than this bound, but more complicated to compute.
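One standard way to see the relationship is through the elementary inequality $h(u) \geq \frac{u^2}{2(1 + u/3)}$, valid for all $u \geq 0$. Substituting $u = at/\sigma^2$ into the exponent of Bennett's bound gives

$$\frac{\sigma^2}{a^2}\, h\left(\frac{at}{\sigma^2}\right) \geq \frac{\sigma^2}{a^2} \cdot \frac{(at/\sigma^2)^2}{2\left(1 + \frac{at}{3\sigma^2}\right)} = \frac{t^2}{2\left(\sigma^2 + at/3\right)},$$

so that

$$\Pr\left( S_n > t \right) \leq \exp\left( - \frac{\sigma^2}{a^2} h\left(\frac{at}{\sigma^2} \right)\right) \leq \exp\left( - \frac{t^2}{2\left(\sigma^2 + at/3\right)} \right),$$

where the right-hand side is a bound of Bernstein type.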

In both inequalities, unlike some other inequalities or limit theorems, there is no requirement that the component variables have identical or similar distributions.

Example
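Suppose that the $X_i$ are independent Bernoulli random variables with $\Pr(X_i = 1) = p$. Taking $a = 1$ and using $\sigma^2 = np(1-p) \leq np$ (the right-hand side of Bennett's inequality is nondecreasing in $\sigma^2$, so an upper bound on the variance may be substituted), Bennett's inequality says that: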


 * $$\Pr\left( \sum_{i = 1}^n X_i > pn + t \right) \leq \exp\left( - np \, h\left(\frac{t}{np}\right)\right).$$

For $t \geq 10 np$, $h\left(\frac{t}{np}\right) \geq \frac{t}{2np} \log \frac{t}{np}$, so


 * $$\Pr\left( \sum_{i = 1}^n X_i > pn + t \right) \leq \left(\frac{t}{np}\right)^{-t/2} \quad \text{for } t \geq 10 np.$$
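The lower bound on $h$ used in this step can be checked directly. Writing $u = t/np$ and using $(1+u)\log(1+u) \geq u \log u$ for $u \geq 1$,

$$h(u) = (1+u)\log(1+u) - u \geq u \log u - u = \frac{u}{2}\log u + \frac{u}{2}\left(\log u - 2\right) \geq \frac{u}{2} \log u \quad \text{whenever } \log u \geq 2.$$

The condition $\log u \geq 2$ holds for $u \geq e^2 \approx 7.39$, so the threshold $t \geq 10np$ is a convenient rather than a sharp choice. Substituting into the exponent then gives

$$\exp\left(-np \cdot \frac{t}{2np} \log \frac{t}{np}\right) = \exp\left(-\frac{t}{2} \log \frac{t}{np}\right) = \left(\frac{t}{np}\right)^{-t/2}.$$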

By contrast, Hoeffding's inequality gives a bound of $\exp(-2 t^2/n)$ and the first Bernstein inequality gives a bound of $\exp\left(-\frac{t^2}{2np + 2t/3}\right)$. For $t \gg np$, Hoeffding's inequality gives $\exp(-\Theta(t^2/n))$, Bernstein gives $\exp(-\Theta(t))$, and Bennett gives $\exp\left(-\Theta\left(t \log \frac{t}{np}\right)\right)$.
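The three bounds for this example can be compared numerically. The sketch below is illustrative only: the function names and the parameter values ($n = 10{,}000$, $p = 0.001$, so $np = 10$) are assumptions chosen to put $t$ in the regime $t \geq 10np$.

```python
import math


def bennett(n: int, p: float, t: float) -> float:
    """Bennett bound exp(-np * h(t / np)) for the Bernoulli example."""
    mean = n * p
    u = t / mean
    return math.exp(-mean * ((1.0 + u) * math.log1p(u) - u))


def bernstein(n: int, p: float, t: float) -> float:
    """Bernstein bound exp(-t^2 / (2np + 2t/3))."""
    return math.exp(-t**2 / (2.0 * n * p + 2.0 * t / 3.0))


def hoeffding(n: int, t: float) -> float:
    """Hoeffding bound exp(-2 t^2 / n) for summands taking values in [0, 1]."""
    return math.exp(-2.0 * t**2 / n)


if __name__ == "__main__":
    n, p = 10_000, 0.001  # assumed illustrative values, np = 10
    for t in (100, 150, 200):
        print(
            f"t={t:4d}  Hoeffding={hoeffding(n, t):.3e}  "
            f"Bernstein={bernstein(n, p, t):.3e}  "
            f"Bennett={bennett(n, p, t):.3e}"
        )
```

With these values the bounds are ordered Bennett, then Bernstein, then Hoeffding, from smallest to largest, in line with the asymptotic rates above.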