Rule of three (statistics)



In statistical analysis, the rule of three states that if a certain event did not occur in a sample with $n$ subjects, the interval from 0 to 3/$n$ is a 95% confidence interval for the rate of occurrences in the population. When $n$ is greater than 30, this is a good approximation of the results of more sensitive tests. For example, a pain-relief drug is tested on 1500 human subjects, and no adverse event is recorded. From the rule of three, it can be concluded with 95% confidence that fewer than 1 person in 500 (or 3/1500) will experience an adverse event. By symmetry, if every one of the $n$ trials is a success, the 95% confidence interval for the success rate is [1−3/$n$, 1].

The rule is useful in the interpretation of clinical trials generally, particularly in phases II and III, where duration or statistical power is often limited. The rule of three applies well beyond medical research, to any trial repeated $n$ times. If 300 parachutes are randomly tested and all open successfully, then it can be concluded with 95% confidence that fewer than 1 in 100 parachutes with the same characteristics (3/300) will fail.
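The two worked examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `rule_of_three_interval` and its `all_successes` flag are illustrative choices, not part of any standard library, and the clamping to [0, 1] is an added safeguard for very small $n$.

```python
from typing import Tuple


def rule_of_three_interval(n: int, all_successes: bool = False) -> Tuple[float, float]:
    """Approximate 95% confidence interval from the rule of three.

    With zero events in n trials the interval is [0, 3/n].
    With n successes in n trials it is, by symmetry, [1 - 3/n, 1].
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    width = min(3.0 / n, 1.0)  # a rate cannot exceed 1 (only matters when n < 3)
    if all_successes:
        return (1.0 - width, 1.0)
    return (0.0, width)


# Drug trial: 1500 subjects, no adverse events.
print(rule_of_three_interval(1500))

# Parachutes: 300 tested, none failed.
print(rule_of_three_interval(300))

# Same parachute data, expressed as an interval for the success rate.
print(rule_of_three_interval(300, all_successes=True))
```

The upper limits are 3/1500 (1 in 500) and 3/300 (1 in 100), matching the figures quoted in the text.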

Derivation
A 95% confidence interval is sought for the probability $p$ of an event occurring for any randomly selected single individual in a population, given that it has not been observed to occur in $n$ Bernoulli trials. Denoting the number of events by $X$, the interval consists of the values of the parameter $p$ of a binomial distribution that are not rejected at the 5% level, that is, those for which $\Pr(X = 0) \geq 0.05$. The rule can then be derived either from the Poisson approximation to the binomial distribution, or from the formula $(1-p)^n$ for the probability of zero events in the binomial distribution. In the latter case, the edge of the confidence interval is given by $\Pr(X = 0) = 0.05$, hence $(1-p)^n = 0.05$, so $n \ln(1-p) = \ln 0.05 \approx -2.996$. Rounding the latter to $-3$ and using the approximation, for $p$ close to 0, that $\ln(1-p) \approx -p$ (Taylor's formula), we obtain the interval's boundary $3/n$.
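To see how much each approximation step in the derivation costs, the three quantities can be compared numerically. The sketch below is illustrative only and its function names are made up for this example: `exact_bound` solves the binomial zero-event equation for $p$ directly, `log_bound` applies the first-order Taylor approximation of the logarithm, and `rule_of_three` additionally rounds 2.996 up to 3.

```python
import math

ALPHA = 0.05  # 1 minus the 95% confidence level


def exact_bound(n: int) -> float:
    """Solve (1 - p)**n = ALPHA for p, with no approximation."""
    return 1.0 - ALPHA ** (1.0 / n)


def log_bound(n: int) -> float:
    """Use ln(1 - p) ≈ -p, giving p ≈ -ln(ALPHA) / n."""
    return -math.log(ALPHA) / n


def rule_of_three(n: int) -> float:
    """Round -ln(0.05) ≈ 2.996 up to 3."""
    return 3.0 / n


for n in (10, 30, 100, 1500):
    print(
        f"n={n:5d}  exact={exact_bound(n):.6f}  "
        f"log approx={log_bound(n):.6f}  3/n={rule_of_three(n):.6f}"
    )
```

Both approximations push the bound slightly upward, so 3/$n$ is a little conservative, and the gap shrinks as $n$ grows.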

By a similar argument, the numerator values of 3.51, 4.61, and 5.3 may be used for the 97%, 99%, and 99.5% confidence intervals, respectively, and in general the upper end of the confidence interval can be given as $-\ln(\alpha)/n$, where $1-\alpha$ is the desired confidence level.
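The general formula is easy to tabulate. The following sketch assumes a hypothetical helper `zero_event_upper_bound` (not a library function) that takes the number of event-free trials and the desired confidence level, and it prints the numerator for each level mentioned above.

```python
import math


def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Upper end of the confidence interval, -ln(alpha) / n, after n event-free trials."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return -math.log(alpha) / n


# Setting n = 1 returns the numerator itself.
for confidence in (0.95, 0.97, 0.99, 0.995):
    numerator = zero_event_upper_bound(1, confidence)
    print(f"{confidence:.1%} confidence: numerator ≈ {numerator:.2f}")
```

For example, `zero_event_upper_bound(1500, 0.99)` gives the 99% upper bound for the drug trial with 1500 subjects, roughly 4.61/1500.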

Extension
The Vysochanskij–Petunin inequality shows that the rule of three holds for unimodal distributions with finite variance beyond just the binomial distribution, in the sense that at least 95% of the probability lies within three standard deviations of the mean, and it gives a way to change the factor 3 if a different confidence is desired. Chebyshev's inequality removes the assumption of unimodality at the price of a higher multiplier (about 4.5 for 95% confidence). Cantelli's inequality is the one-tailed version of Chebyshev's inequality.
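The multipliers implied by these inequalities can be computed by setting each tail bound equal to $\alpha$ and solving for the number of standard deviations $\lambda$. The sketch below uses the standard forms of the bounds: $4/(9\lambda^2)$ for Vysochanskij–Petunin (valid for $\lambda > \sqrt{8/3}$), $1/\lambda^2$ for Chebyshev, and $1/(1+\lambda^2)$ for Cantelli. The function names are illustrative only, and the multipliers refer to standard deviations from the mean.

```python
import math


def vysochanskij_petunin_multiplier(alpha: float) -> float:
    """Solve 4 / (9 * lam**2) = alpha for lam (unimodal, finite variance)."""
    lam = 2.0 / (3.0 * math.sqrt(alpha))
    if lam <= math.sqrt(8.0 / 3.0):
        # The inequality in this form only holds for lam > sqrt(8/3).
        raise ValueError("alpha too large for the Vysochanskij-Petunin bound")
    return lam


def chebyshev_multiplier(alpha: float) -> float:
    """Solve 1 / lam**2 = alpha for lam (any distribution with finite variance)."""
    return 1.0 / math.sqrt(alpha)


def cantelli_multiplier(alpha: float) -> float:
    """Solve 1 / (1 + lam**2) = alpha for lam (one-tailed bound)."""
    return math.sqrt(1.0 / alpha - 1.0)


alpha = 0.05
print(f"Vysochanskij-Petunin: {vysochanskij_petunin_multiplier(alpha):.3f}")
print(f"Chebyshev:            {chebyshev_multiplier(alpha):.3f}")
print(f"Cantelli (one-tailed): {cantelli_multiplier(alpha):.3f}")
```

At 95% confidence the Vysochanskij–Petunin multiplier comes out just under 3, while the Chebyshev multiplier is close to 4.5, in line with the figures given above.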