Talk:Chernoff bounds

=Base 2 vs natural Logarithms=

Is there a reason base-2 logarithms are being used? Although relative entropy has applications in information theory, these theorems were not proved in that context and apply well beyond it. Using natural logarithms would also clean things up by removing some pesky subscripts and constants. --Steve Kroon 14:31, 8 December 2006 (UTC)

History
The entire historical part of this article is completely erroneous. Most of the inequalities mentioned were proved by Bernstein (in the 1920s) and Cramér (in the 1930s). Sodin 02:32, 9 August 2007 (UTC)

Second (Relative) Chernoff Bound
I think that the second bound stated in the proof section to be "obtainable using a similar proof strategy" should be

$$ \Pr \left[ X < (1-\delta)\mu\right] < \left(\frac{\operatorname{e}^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^\mu $$

for $$0\leq\delta<1$$ instead of the weaker

$$ \Pr[X < (1-\delta)\mu] < \exp(-\mu\delta^2/2)=\operatorname{e}^{-\frac12\delta^2\cdot\mu} $$.

This can be proven by applying Markov's inequality to $$\operatorname{e}^{-tX}$$ (instead of $$\operatorname{e}^{tX}$$, as done for the first bound) and substituting $$t=-\ln (1-\delta)$$. The weaker bound currently given in the article then follows from basic calculus by taking logarithms and comparing derivatives.

Also, this second bound should probably be stated in the theorem itself. --Björn —The preceding unsigned comment was added by Special:Contributions/ (talk)
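A quick numerical sanity check of the claim above that the exponential bound is the weaker of the two. This is only a sketch: the grid of $$\delta$$ values and the helper names `log_tight` and `log_weak` are illustrative choices, not anything taken from the article. It compares the logarithms of both bounds per unit of $$\mu$$ on $$0\leq\delta<1$$.

```python
import math

def log_tight(delta):
    """ln of e^{-delta} / (1-delta)^(1-delta): exponent of the tighter bound per unit mu."""
    return -delta - (1.0 - delta) * math.log1p(-delta)

def log_weak(delta):
    """ln of exp(-delta^2 / 2): exponent of the weaker bound per unit mu."""
    return -delta * delta / 2.0

# Illustrative grid on [0, 1); delta = 1 is excluded because ln(1 - delta) diverges there.
deltas = [k / 1000.0 for k in range(0, 1000)]
gaps = [log_weak(d) - log_tight(d) for d in deltas]

# A non-negative gap means the tighter bound is no larger than the weaker one.
tighter_everywhere = all(g >= -1e-15 for g in gaps)
print("tighter bound <= weaker bound on the whole grid:", tighter_everywhere)
```

A grid check is no substitute for the calculus argument, but it is a cheap way to catch a sign error before editing the article.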

Response
I agree, we probably ought to state the second bound. I think you are correct in this; I was just lazy when I wrote the page originally and thought a slightly simpler bound might be nice. --John (Jduchi 21:41, 5 October 2007 (UTC))

First Theorem
I am not quite sure why the first theorem is stated here. I'm not an expert on probability theory, so I cannot tell whether it's valid or not. But surely, it cannot be derived from the Chernoff bounds: applying the Chernoff bounds to the left side of the second bound, one obtains

$$ \operatorname{Pr}\left[X\leq(1-\varepsilon/p)\mu\right]\leq\left(\left(\frac{p}{p-\varepsilon}\right)^{p-\varepsilon}\operatorname{e}^{-\varepsilon}\right)^n $$

Therefore, I would expect (if the theorem were a simple consequence of the Chernoff bounds) that

$$ \operatorname{e}^{-\varepsilon}\leq\left(\frac{1-p}{1-p+\varepsilon}\right)^{1-p+\varepsilon} \iff -\varepsilon\leq(1-p+\varepsilon)\ln\left(\frac{1-p}{1-p+\varepsilon}\right). $$

Now, both sides are equal at $$\varepsilon=0$$. Comparing derivatives in $$\varepsilon$$, one should then have

$$ -1 \leq \ln\left(\frac{1-p}{1-p+\varepsilon}\right)+(1-p+\varepsilon)\left(0-\frac1{1-p+\varepsilon}\right) \iff \ln(1-p+\varepsilon)\leq\ln(1-p) $$

which is not true for $$\varepsilon>0$$, as $$\ln$$ is strictly increasing. Therefore, I think that either the theorem is cited wrongly or it doesn't really belong here. Any suggestions? --Bjoern —The preceding unsigned comment was added by Special:Contributions/ (talk)
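The failure of the expected inequality can also be checked numerically. The sketch below evaluates the logarithmic form, right side minus left side, on a grid; the values of $$p$$, the range $$0<\varepsilon<p$$, and the name `log_gap` are assumptions made only for illustration.

```python
import math

def log_gap(p, eps):
    """Right side minus left side of  -eps <= (1-p+eps) * ln((1-p)/(1-p+eps))."""
    a = 1.0 - p
    return (a + eps) * math.log(a / (a + eps)) + eps

# Illustrative values of p; eps runs over 99 points strictly between 0 and p.
ps = [0.1, 0.3, 0.5, 0.7, 0.9]
violations = []
for p in ps:
    for k in range(1, 100):
        eps = p * k / 100.0
        if log_gap(p, eps) < 0.0:
            # A negative gap means the expected inequality fails at (p, eps).
            violations.append((p, eps))

total_points = len(ps) * 99
print(f"inequality fails at {len(violations)} of {total_points} grid points")
```

This agrees with the derivative argument: the gap is zero at $$\varepsilon=0$$ and its derivative $$\ln\left(\frac{1-p}{1-p+\varepsilon}\right)$$ is negative for $$\varepsilon>0$$.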

Response to above
These theorems are actually slightly different and are not applications of one another, which I hope clears up some of the confusion. I have modified the main page to reflect those differences. One theorem deals with the absolute error of the mean of the random variables; the other deals with the relative error. So they are derived using the same strategy (i.e. exponentiating and using Markov's inequality), but they are different bounds and are used in different contexts. -- John (Jduchi 21:41, 5 October 2007 (UTC))

Proof and statement
In the statement of the theorem on absolute error we claim $$ X_i \in [0,1] $$. But in the proof we use $$ X_i \in \{0,1\} $$ when we say "Now, knowing that $$\Pr[X_i = 1] = p$$, $$\Pr[X_i = 0] = (1-p)$$". --gala.martin ( what? ) 22:06, 8 October 2007 (UTC)
 * You're right. I've changed it. Thanks for the catch! Jduchi 06:37, 23 October 2007 (UTC)

Bounds on delta
Is the last formulation of the theorem: $$\Pr[X < (1-\delta)\mu] < \exp(-\mu\delta^2/2)$$ still valid for $$\delta > 0$$ (in particular, $$\delta \geq 1$$)? It would be quite helpful if these bounds were mentioned explicitly, even if they're the same as in the other formulation of the theorem. —Preceding unsigned comment added by 69.202.72.56 (talk) 05:38, 10 February 2008 (UTC)
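One way to probe this question numerically, sketched under the assumption that $$X$$ is a sum of independent 0/1 variables (so $$X \geq 0$$): for $$\delta \geq 1$$ the threshold $$(1-\delta)\mu$$ is at most zero, so the event $$X < (1-\delta)\mu$$ is empty and the bound holds trivially. The binomial parameters and the helper `binom_lower_tail` below are illustrative choices, not taken from the article.

```python
import math

def binom_lower_tail(n, p, threshold):
    """Exact Pr[X < threshold] for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(n + 1)
        if k < threshold
    )

# Illustrative parameters: X ~ Binomial(50, 0.3), so mu = 15.
n, p = 50, 0.3
mu = n * p
deltas = [0.1, 0.25, 0.5, 0.75, 0.9, 1.0, 1.5, 2.0]

rows = []
for delta in deltas:
    tail = binom_lower_tail(n, p, (1.0 - delta) * mu)
    bound = math.exp(-mu * delta**2 / 2.0)
    rows.append((delta, tail, bound))
    print(f"delta={delta:4.2f}  exact tail={tail:.3e}  bound={bound:.3e}")

bound_holds = all(tail < bound for _, tail, bound in rows)
print("bound holds at every delta tried:", bound_holds)
```

This only checks one distribution, so it illustrates the $$\delta \geq 1$$ case rather than proving anything; the article should still state the range of $$\delta$$ explicitly.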