Talk:Hoeffding's inequality

Mistake
I think the line


 * $$n\geq \frac{\log(2/\alpha)}{2\varepsilon^2}$$

in the "Confidence intervals" section doesn't follow from the previous line. If you want the minimum number of samples $$n$$ such that you can be at least $$\textstyle 100(1-\alpha)$$% confident that $$\overline{X}$$ is within $$\varepsilon$$ of $$p$$, then I think you actually want


 * $$\operatorname{P}(\ \overline{X} \notin [p-\varepsilon, p+\varepsilon]) \leq 2e^{-2\varepsilon^2n} \leq \alpha$$

which rearranges to give the above inequality for $$n$$.

Someone else's comment: No. What is needed is the first (smallest) $$n$$ that violates the inequality. As $$n$$ gets bigger, the probability $$\alpha$$ (of making a Type I error) must get smaller, so the confidence level becomes larger. The ambiguity is in the words "solving the above for n gives us", when in fact what is needed is the opposite of the solution to that inequality for the line mentioned. I will try to edit. — Preceding unsigned comment added by 188.254.161.197 (talk) 18:08, 8 August 2023 (UTC)
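The algebra in this thread is easy to sanity-check numerically; a minimal sketch in Python (the helper name min_samples and the values of epsilon and alpha are my own choices, not from the article):

```python
import math

def min_samples(eps, alpha):
    """Smallest integer n with 2*exp(-2*eps**2*n) <= alpha,
    i.e. n >= log(2/alpha) / (2*eps**2)."""
    return math.ceil(math.log(2 / alpha) / (2 * eps ** 2))

eps, alpha = 0.05, 0.05
n = min_samples(eps, alpha)
# At this n the two-sided Hoeffding tail is at most alpha ...
assert 2 * math.exp(-2 * eps ** 2 * n) <= alpha
# ... and one sample fewer would not suffice.
assert 2 * math.exp(-2 * eps ** 2 * (n - 1)) > alpha
```

So the direction of the inequality for $$n$$ matters exactly as described: any smaller $$n$$ fails the tail bound.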

Mistake
The article refers to Theorem 2 of Hoeffding's paper for the two inequalities:
 * $$\begin{align} \operatorname{P} \left(\overline X - \mathrm{E}\left [\overline X \right] \geq t \right) &\leq \exp \left(-\frac{2nt^2}{\sum_{i=1}^n (b_i - a_i)^2} \right) \\ \operatorname{P} \left(\left |\overline X - \mathrm{E}\left [\overline X \right] \right | \geq t \right) &\leq 2\exp \left(-\frac{2nt^2}{\sum_{i=1}^n(b_i - a_i)^2} \right) \end{align}$$

I'm pretty sure that the correct versions have $$n^2$$ in the exponential. I've edited the article accordingly.
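For what it's worth, the $$n^2$$ can be traced back from the sum form of the inequality; a sketch:

```latex
% Sum form: for S = X_1 + \dots + X_n,
%   \Pr(S - \mathrm{E}[S] \geq s) \leq \exp\!\left(-\frac{2s^2}{\sum_{i=1}^n (b_i - a_i)^2}\right).
% Substituting S = n\overline{X} and s = nt gives the mean form:
\Pr\left(\overline{X} - \mathrm{E}\left[\overline{X}\right] \geq t\right)
  = \Pr\left(S - \mathrm{E}[S] \geq nt\right)
  \leq \exp\!\left(-\frac{2n^2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right).
```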

[no title]
As stated, the inequality is not true (this is easy to see for n=1). There must be some other condition. coco


 * I noticed that you added the condition $$t>0$$, which is indeed required. Is there another condition we're missing? --MarkSweep 21:17, 21 August 2005 (UTC)


 * I suggest that $$X_i \in [0, 1] $$194.25.15.170 (talk) 13:45, 16 March 2015 (UTC)

Almost ten years later, the stated inequality is still obviously wrong, e.g. with odd $$n$$ and a fair 0-1 coin. Then the probability that the (integer) sum deviates from the (half-integer) expected value by at least $$1/2$$ is clearly 1, whereas the right-hand side with $$t=1/2$$ is strictly smaller.--79.245.218.172 (talk) 05:58, 17 May 2024 (UTC)


 * 79.245.218.172, could you elaborate? I've just had a look and I cannot see why the inequality would be "obviously" wrong. In particular, with your example the right-hand side is $$2 \exp(-\tfrac{1}{2n}),$$ which is greater than 1. Malparti (talk) 11:22, 17 May 2024 (UTC)
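The exact tail in this example can be computed directly, and it confirms that the bound holds (vacuously, since its right-hand side exceeds 1); a small check, with arbitrary odd values of n:

```python
import math
from fractions import Fraction

# Fair 0-1 coin, odd n: the sum S is an integer while E[S] = n/2 is a
# half-integer, so |S - E[S]| >= 1/2 with probability exactly 1.
for n in (1, 3, 5, 101):
    # Exact two-sided tail via the binomial distribution.
    tail = sum(Fraction(math.comb(n, k), 2 ** n)
               for k in range(n + 1)
               if abs(Fraction(k) - Fraction(n, 2)) >= Fraction(1, 2))
    assert tail == 1
    # Hoeffding bound for the sum with t = 1/2 and each b_i - a_i = 1:
    bound = 2 * math.exp(-2 * 0.5 ** 2 / n)  # = 2*exp(-1/(2n))
    assert bound > 1  # the bound is trivially satisfied
```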

Hoeffding's theorem 2
I just reverted an edit which (re)introduced a mistake in the presentation of the inequality. As stated here, the inequality involves the probability


 * $$\Pr(S - \mathrm{E}[S] \geq nt). \qquad\textbf{correct}$$

Note that S is the sum of n independent random variables. This probability could also be written as


 * $$\Pr(S/n - \mathrm{E}[S/n] \geq t), \qquad\textbf{correct}$$

which is how it appears in Hoeffding's paper (Theorem 2, p. 16, using slightly different notation). In other words, Hoeffding's formulation is in terms of the mean of n independent RVs, whereas the formulation used here is in terms of their sum. A recent edit changed this to


 * $$\Pr(S - \mathrm{E}[S] \geq t), \qquad\textbf{wrong}$$

which is incorrect. --MarkSweep (call me collect) 18:50, 24 November 2005 (UTC)

Why do we need that the X_i's have finite first and second moments? It is not stated in the Hoeffding paper and, after all, I think it follows from the fact that X_i lies in [a_i, b_i], i.e. a bounded interval.

The article has no mistakes: the condition t>0 is not necessary, and the last comment is obvious. Nevertheless, a Hoeffding-type inequality means an inequality that uses the transform f(t)=exp(h(t-x)) to derive bounds on tail probabilities for sums of independent r.v.'s and martingales. Therefore, the written inequality is only one of the inequalities Hoeffding introduced, and, from a statistical point of view, it is not the most important result, as it does not control the variance of the variables. I would suggest adding the other inequality and explaining what is meant by "Hoeffding inequality", as it is not one thing.

I added the restriction $$ t > 0 $$ to the statement of the first inequality in the General case section mostly because that restriction is in the 1963 Hoeffding article. (I think that the restriction is necessary also.)  -- Hein Aug 2017  — Preceding unsigned comment added by Irchans (talk • contribs) 14:32, 3 August 2017 (UTC)

Special case of Bernstein's inequality
Can someone point out which Bernstein inequality Hoeffding's inequality is a special case of?--Steve Kroon 14:10, 22 May 2007 (UTC)

The question of which inequality is a special case of the other is inconsistent on Wikipedia: the article on Hoeffding's inequality says it is more general than Bernstein's inequality, while the article on Bernstein's inequality says "special cases of Bernstein ineq. is ... Hoeffding ineq". Someone, please fix this? -- 194.126.99.204 (talk) 13:11, 14 June 2012 (UTC)

Bernstein's is typically a stronger inequality, but the two are generally incomparable. For sums of (unbiased) Rademachers, Hoeffding correctly asserts the behavior \exp(-c*x^2), while Bernstein incorrectly gives \exp(-c*x) for large x. On the other hand, for the sum of N [-1,1]-valued variables with very small variance, Bernstein gives the much tighter \exp(-c*x^2/Variance) for small x, over Hoeffding's \exp(-c*x^2/N). I rephrased it. --TheZuza777 (talk) 00:22, 30 September 2018 (UTC)
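This comparison can be illustrated numerically; a rough sketch, assuming one common form of Bernstein's bound, \exp(-t^2/(2\sigma^2 + 2Mt/3)) (constants differ between statements of the inequality, so the crossover points below are illustrative only, and the helper names are mine):

```python
import math

def hoeffding(t, n, c=1.0):
    """Hoeffding for X_i in [-c, c]: P(S - E[S] >= t) <= exp(-2t^2 / (n*(2c)^2))."""
    return math.exp(-2 * t ** 2 / (n * (2 * c) ** 2))

def bernstein(t, var_sum, M=1.0):
    """One common Bernstein form: P(S - E[S] >= t) <= exp(-t^2 / (2*var_sum + 2*M*t/3))."""
    return math.exp(-t ** 2 / (2 * var_sum + 2 * M * t / 3))

n, t = 1000, 10.0
# Small-variance case: each Var(X_i) = 0.01, so var_sum = 10 << n;
# Bernstein is much tighter than Hoeffding at moderate t.
assert bernstein(t, var_sum=10.0) < hoeffding(t, n)
# Rademacher case: Var(X_i) = 1, var_sum = n; for large t Hoeffding's
# exp(-c*t^2) beats Bernstein's effectively exp(-c*t) behavior.
t_big = 800.0
assert hoeffding(t_big, n) < bernstein(t_big, var_sum=float(n))
```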

Hoeffding's Inequality
The existing version is unreadable, and the article ignores the dependence of these results on Bennett's inequality. —Preceding unsigned comment added by 122.106.62.116 (talk) 06:13, 27 February 2009 (UTC)

Variable Bounds
The bounds for variables X_i seem misleading


 * $$\Pr(X_i-\mathrm{E}[X_i] \in [a_i, b_i]) = 1. \!$$

Older revisions of this article (pre 2009), as well as the original paper state the bounds as


 * $$ a_i \le X_i \le b_i. \!$$

or


 * $$\Pr(X_i \in [a_i, b_i]) = 1. \!$$

--18.111.112.208 (talk) 18:00, 11 December 2010 (UTC)
 * It does not make any difference (I mean the first and the third version above, the second one is marginally stronger). What is misleading about it?—Emil J. 13:52, 13 December 2010 (UTC)


 * I think that the third version deals with centered versions of variable? 194.25.15.170 (talk) 13:43, 16 March 2015 (UTC) Sergey

Problem with bounds
To apply Hoeffding's lemma to the zero-mean variables $$ Y_i=X_i-\mathrm{E}[X_i] $$, we must have $$ Y_i \in [a_i, b_i] $$ almost surely.

But $$ \Pr(X_i-\mathrm{E}[X_i] \in [a_i, b_i]) = 1$$ and $$ \Pr(X_i \in [a_i, b_i]) = 1 $$ are not equivalent.
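A sketch of the fix usually made here: centering shifts the interval but preserves its width, so Hoeffding's lemma still applies with the same $$b_i - a_i$$:

```latex
% If a_i \leq X_i \leq b_i almost surely, then for Y_i = X_i - \mathrm{E}[X_i]:
a_i - \mathrm{E}[X_i] \;\leq\; Y_i \;\leq\; b_i - \mathrm{E}[X_i] \quad \text{a.s.}
% The shifted interval has the same length:
\left(b_i - \mathrm{E}[X_i]\right) - \left(a_i - \mathrm{E}[X_i]\right) = b_i - a_i.
```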

Central limit theorem
Can someone elaborate on the relationship to the CLT? As I understand this, these two theorems are closely related. Thanks. --138.246.2.177 (talk) 10:04, 3 December 2012 (UTC)

What is the inequality?
I'm having a hard time finding the thing called "Hoeffding's Inequality". Should that be at the top somewhere? 24.10.157.3 (talk) 00:06, 23 December 2021 (UTC)


 * I've moved things around, it should be more clear now. Fawly (talk) 05:54, 23 December 2021 (UTC)