User:Talgalili/sandbox/Newman–Keuls method

The Newman–Keuls or Student–Newman–Keuls (SNK) method is a stepwise multiple comparisons procedure used to identify sample means that are significantly different from each other. It was named after Student (1927), D. Newman, and M. Keuls. This procedure is often used as a post-hoc test whenever a significant difference between three or more sample means has been revealed by an analysis of variance (ANOVA). The Newman–Keuls method is similar to Tukey's range test, as both procedures use Studentized range statistics and both are designed to test all pairwise comparisons. Compared to Tukey's range test, the Newman–Keuls method is more powerful (meaning it is expected to declare more differences between groups significant) but less conservative (meaning it is also expected to make more type I errors, wrongly declaring a difference between two groups significant).

Historical perspective
The method was originally introduced by Newman in 1939 and further developed by Keuls in 1952, before Tukey presented the concept of different types of multiple error rates (1952a, 1952b, 1953). The method was popular during the 1950s and 1960s, but when control of the familywise error rate (FWER) became accepted as an appropriate criterion in multiple comparison testing, SNK, which does not control the FWER (except for the special case of exactly three groups), became somewhat forgotten. In 1995, Benjamini and Hochberg presented a new, more liberal and more powerful criterion for those types of problems: false discovery rate (FDR) control. In 2006, Shaffer showed (by extensive simulation) that the Newman–Keuls procedure controls the FDR under some constraints.

Required assumptions
The assumptions of the Newman–Keuls test are essentially the same as for an independent-groups t-test: normality, homogeneity of variance, and independent observations. The test is quite robust to violations of normality. Violating homogeneity of variance can be more problematic than in the two-sample case, since the MSE is based on data from all groups. The assumption of independence of observations is important and should not be violated.

Procedure
The Newman–Keuls method employs a stepwise approach when comparing sample means. Prior to any mean comparison, all sample means are rank-ordered in ascending or descending order, thereby producing an ordered range of sample means. A comparison is then made between the largest and smallest sample means within the largest range. Assuming that the largest range is four means, a significant difference between the largest and smallest means as revealed by the Newman–Keuls method would result in a rejection of the null hypothesis. The next largest comparison of two sample means would then be made within a smaller range of three means. As long as a significant difference is found between the two extreme sample means within a given range, this stepwise comparison of sample means continues until a final comparison is made within the smallest range of just two means. If there is no significant difference between the two extreme sample means of a range, then all the null hypotheses within that range are retained and no further comparisons within smaller ranges are necessary.

To determine if there is a significant difference between two means, the Newman–Keuls method uses a test statistic called the "Studentized range", which is abbreviated "q" and is identical to the one used in Tukey's range test. The q value is calculated by dividing the pairwise difference between means by a standard error based upon the average variance of the two samples being considered:

$$ q=\frac{\overline{X_{A}}-\overline{X_{B}}}{S_{AB}}$$

where $${S_{AB}}$$ is calculated as follows:


 * 1) When the sample sizes are equal: $$n_{A}=n_{B}=n;\quad S_{AB}=\sqrt{\frac{MSE}{n}}$$
 * 2) When the sample sizes are not equal: $$n_{A}\neq n_{B};\quad S_{AB}=\sqrt{\frac{MSE}{2}\left(\frac{1}{n_{A}}+\frac{1}{n_{B}}\right)}$$

In both cases, the MSE (mean squared error) is taken from the ANOVA conducted in the first stage of the analysis.
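The two standard-error formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `standard_error` and `q_statistic` are hypothetical, and the absolute value is taken as an assumption so that the order of the two means does not matter (equivalent to subtracting the smaller mean from the larger one).

```python
import math


def standard_error(mse: float, n_a: int, n_b: int) -> float:
    """Standard error S_AB for comparing groups A and B.

    Uses the unequal-sample-size formula; when n_a == n_b == n it
    reduces to sqrt(MSE / n), the equal-sample-size case.
    """
    return math.sqrt(mse / 2.0 * (1.0 / n_a + 1.0 / n_b))


def q_statistic(mean_a: float, mean_b: float,
                mse: float, n_a: int, n_b: int) -> float:
    """Studentized range statistic q for one pair of sample means.

    Assumption: the absolute difference is used, so the result is
    non-negative regardless of argument order.
    """
    return abs(mean_a - mean_b) / standard_error(mse, n_a, n_b)
```

Because the general formula collapses to the equal-size one, a single function covers both cases; `mse` is expected to come from the ANOVA, as stated above.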

The computed q value is then compared to a q critical value taken from a q distribution table. If the computed q value is equal to or greater than the q critical value, then the null hypothesis (H0: μA = μB) can be rejected.
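The stepwise logic described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: group sizes are equal, and the critical values are supplied by the caller. The function name `snk_equal_n` and the `q_crit` callable (expected to return the critical Studentized range value for a range spanning `p` means with `df` error degrees of freedom) are hypothetical names introduced for illustration, not part of any library.

```python
import math
from typing import Callable, Dict, List, Tuple


def snk_equal_n(means: Dict[str, float], n: int, mse: float, df: int,
                q_crit: Callable[[int, int], float]
                ) -> Dict[Tuple[str, str], bool]:
    """Newman-Keuls sketch for equal group sizes.

    Returns a dict mapping (lower_group, higher_group) to True when the
    null hypothesis of equal means is rejected, False when it is retained.
    """
    order = sorted(means, key=means.get)   # group labels, ascending by mean
    k = len(order)
    se = math.sqrt(mse / n)                # equal-n standard error
    retained: List[Tuple[int, int]] = []   # index ranges found non-significant
    rejected: Dict[Tuple[str, str], bool] = {}

    # Start with the widest range (p = k means) and shrink down to p = 2.
    for p in range(k, 1, -1):
        for lo in range(0, k - p + 1):
            hi = lo + p - 1
            pair = (order[lo], order[hi])
            # A range nested inside a retained range is retained untested.
            if any(a <= lo and hi <= b for a, b in retained):
                rejected[pair] = False
                continue
            q = (means[order[hi]] - means[order[lo]]) / se
            if q >= q_crit(p, df):
                rejected[pair] = True
            else:
                rejected[pair] = False
                retained.append((lo, hi))
    return rejected
```

In practice `q_crit` would look up a Studentized range table. Assuming SciPy 1.7 or later is installed, `lambda p, df: scipy.stats.studentized_range.ppf(0.95, p, df)` is one way to obtain the critical values for α = 0.05.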

A simple example
Assume we used an ANOVA to look for differences in the means of five groups of equal size ($$n_{i}=11$$). We found a significant difference and now want to determine which of the groups differ significantly from each other ($$\alpha=0.05$$). We'll use the estimated means and MSE from the ANOVA:

$$MSE=53;\quad\alpha=0.05;\quad n_{i}=11;\quad i=1,2,\ldots,5;\quad df=50$$
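As a check on these values, and applying the equal-sample-size formula for $$S_{AB}$$ from the Procedure section, the error degrees of freedom and the standard error shared by every comparison work out as follows (here N, the total number of observations, and k, the number of groups, are symbols introduced only for this illustration):

$$df=N-k=5\cdot 11-5=50;\quad S_{AB}=\sqrt{\frac{MSE}{n}}=\sqrt{\frac{53}{11}}\approx 2.195$$

Each computed q value in this example is therefore the difference between the two means being compared, divided by approximately 2.195.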

The chart below shows the order in which the null hypotheses are tested. The critical q-values, which appear on the right side of the chart, change between steps (due to the decrease in the number of groups). Whenever the computed q-value is higher than the critical value, we reject the tested hypothesis and continue to the next step.



Main advantages

 * 1) The main advantage of the SNK method is its relatively high power.
 * 2) Since the SNK method is a stepwise procedure, its computational effort is substantially lower than Tukey's.

Main flaws

 * 1) Due to its sequential nature, the Newman–Keuls procedure cannot produce a $$100(1-\alpha)\%$$ confidence interval for each mean difference, nor exact multiplicity-adjusted p-values.
 * 2) Results are somewhat difficult to interpret, since it is difficult to articulate which null hypotheses were tested.
 * 3) For more conservative purposes, SNK will be considered worse than Tukey's procedure as it does not control FWER, and it only controls FDR under certain conditions.