Wikipedia:Reference desk/Archives/Mathematics/2011 July 24

= July 24 =

Continuity Correction
Hi everyone, I am writing about an issue in Normal Distribution theory. It states that the Normal Distribution can be used as an alternative to the Binomial Distribution when the number of trials n is large (n>50) and the mean np>5. In that case a discrete random variable of the Binomial Distribution can be approximated by a continuous random variable of the Normal Distribution through Continuity Correction:

If P(X = A) (discrete) then P(A – 0.5 < X < A + 0.5) (continuous)

If P(X > A) (discrete) then P(X > A + 0.5) (continuous)

If P(X ≤ A) (discrete) then P(X < A + 0.5) (continuous)

If P(X < A) (discrete) then P(X < A – 0.5) (continuous)

If P(X ≥ A) (discrete) then P(X > A – 0.5) (continuous)

My question is: in the 5 statements above, what principle has been used to build such statements? Is it possible to prove them? Or are they solely consequences of practical problems? Thanks in advance. Torment273 (talk) 11:18, 24 July 2011 (UTC)
 * The first thing to realize is that all these statements are really the same. First let's improve the notation and let X be the binomial variable and Y be a normal variable approximating it. Then the first statement becomes $$\mathrm{Pr}(X=A)\approx\mathrm{Pr}(A-0.5<Y<A+0.5)$$. The others follow from it; for example, $$\mathrm{Pr}(X>A) = \mathrm{Pr}(X=A+1)+\mathrm{Pr}(X=A+2)+\mathrm{Pr}(X=A+3)+\cdots \approx$$
 * $$\approx\mathrm{Pr}(A+0.5<Y<A+1.5)+\mathrm{Pr}(A+1.5<Y<A+2.5)+\mathrm{Pr}(A+2.5<Y<A+3.5)+\cdots=\mathrm{Pr}(Y>A+0.5)$$.
 * Now, the rule $$\mathrm{Pr}(X=A)\approx\mathrm{Pr}(A-0.5<Y<A+0.5)$$ is a result of how we choose Y, and we choose it this way because we want Y to be as close to X as possible. Since A is an integer, for Y in the range $$A-0.5<Y<A+0.5$$ the closest integer is A. -- Meni Rosenfeld (talk) 11:41, 24 July 2011 (UTC)


 * Amazing. Thanks for the quick reply. Have you worked it out on your own? Torment273 (talk) 13:24, 24 July 2011 (UTC)
 * Yes. Actually to me it's clear why these rules are valid and how to show they're all equivalent. Hopefully with experience it will become as obvious to you as well. It's good that you're asking these questions instead of taking things for granted. -- Meni Rosenfeld (talk) 14:25, 24 July 2011 (UTC)
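
For anyone who wants to check the rules in this thread numerically, here is a minimal Python sketch. The values n = 100, p = 0.5 and A = 55 are illustrative assumptions, not taken from the question, and the helper names `binom_pmf` and `normal_cdf` are made up for this example. It compares the exact binomial probabilities with the continuity-corrected normal ones, using only the standard library.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """Exact Pr(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(y, mu, sigma):
    """Pr(Y < y) for Y ~ Normal(mu, sigma^2)."""
    return 0.5 * (1.0 + erf((y - mu) / (sigma * sqrt(2.0))))

# Illustrative values (assumptions for this sketch).
n, p, A = 100, 0.5, 55
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Exact binomial probabilities, summed term by term.
exact = {
    "X = A":  binom_pmf(A, n, p),
    "X > A":  sum(binom_pmf(k, n, p) for k in range(A + 1, n + 1)),
    "X <= A": sum(binom_pmf(k, n, p) for k in range(0, A + 1)),
    "X < A":  sum(binom_pmf(k, n, p) for k in range(0, A)),
    "X >= A": sum(binom_pmf(k, n, p) for k in range(A, n + 1)),
}

# Continuity-corrected normal approximations (the five rules).
approx = {
    "X = A":  normal_cdf(A + 0.5, mu, sigma) - normal_cdf(A - 0.5, mu, sigma),
    "X > A":  1.0 - normal_cdf(A + 0.5, mu, sigma),
    "X <= A": normal_cdf(A + 0.5, mu, sigma),
    "X < A":  normal_cdf(A - 0.5, mu, sigma),
    "X >= A": 1.0 - normal_cdf(A - 0.5, mu, sigma),
}

# For comparison: Pr(X > A) approximated without the 0.5 shift.
uncorrected_gt = 1.0 - normal_cdf(A, mu, sigma)

for event in exact:
    print(f"Pr({event}): exact={exact[event]:.4f}  corrected={approx[event]:.4f}")
print(f"Pr(X > A) without correction: {uncorrected_gt:.4f}")
```

With these particular values the corrected figures should land much closer to the exact ones than the uncorrected figure does. How close they get in general depends on n and p.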

To make this into five separate statements is to make a simple thing extremely complicated. Just remember that for integer-valued variables,
 * X < 6

and
 * X ≤ 5

are the same thing and the continuity correction uses the value half-way between them. And similarly
 * X > 8

and
 * X ≥ 9

are the same thing. Michael Hardy (talk) 04:24, 25 July 2011 (UTC)
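
The "half-way" rule can be written as one small function instead of five memorised cases. This is only a sketch: the function name `corrected_cutoff` and its string-based interface are hypothetical, chosen for illustration, and it assumes the bound is an integer.

```python
def corrected_cutoff(op, a):
    """Map an inequality on an integer-valued X to the cutoff used for
    the approximating continuous variable Y.

    Returns (direction, cutoff): ("<", c) means Pr(Y < c),
    (">", c) means Pr(Y > c).
    """
    if op == "<":    # X < a is the same event as X <= a - 1
        return ("<", a - 0.5)
    if op == "<=":   # X <= a is the same event as X < a + 1
        return ("<", a + 0.5)
    if op == ">":    # X > a is the same event as X >= a + 1
        return (">", a + 0.5)
    if op == ">=":   # X >= a is the same event as X > a - 1
        return (">", a - 0.5)
    raise ValueError(f"unsupported operator: {op!r}")

# The two ways of writing each event give the same cutoff.
print(corrected_cutoff("<", 6), corrected_cutoff("<=", 5))
print(corrected_cutoff(">", 8), corrected_cutoff(">=", 9))
```

The point of the design is that the strict and non-strict forms of the same event always map to a single cutoff, the value half-way between the two integers.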