Talk:Tukey's range test

combine pages
Good idea to combine this page (more readable) with the similar highlighted at top. I'm not a biometrician so I'll stuff it up if I attempt such editing. Cheers --Jppigott (talk) 02:31, 21 April 2009 (UTC)

Need an example
Several of the stats pages (e.g. f-test) have very helpful examples - maybe this one could also have an example? Maybe continue the example provided on the f-test page? Thanks! Mike —Preceding unsigned comment added by 129.215.197.194 (talk) 10:05, 11 February 2010 (UTC)


 * I echo this suggestion. —DIV (137.111.13.4 (talk) 00:32, 18 June 2015 (UTC))

Inconsistent notation
The notation in the equations given is incomplete and inconsistent. Lower-case n is first used as the number of groups/means to be compared, but in the formula for the confidence interval lower-case r is used instead. Upper-case N is used in the CI formula without reference. Lower-case n is used in a different sense from the one above; I assume that here it is the (common) sample size of the groups.

It might also be useful to unpack "the standard deviation of the entire design" and "the degrees of freedom for the whole design", for people who did not arrive at this page from somewhere that defines those things. DMTate (talk) 14:36, 3 June 2011 (UTC)

assumptions?
The test statistic section says: "This gives rise to the normality assumption of Tukey's test." But normality is not listed in the "Assumptions" section. Should it be? — Preceding unsigned comment added by 129.137.189.234 (talk) 21:38, 4 April 2012 (UTC)


 * I came here to suggest the same thing.94.219.218.221 (talk) 16:10, 24 September 2012 (UTC)
 * According to, there are five assumptions:
 * Each sample was obtained using random sampling
 * Each sample consists of independent observations within and among the other samples
 * Each population is normally distributed
 * Each population has the same variance
 * At least one pair of populations has different means (as decided by e.g. one-way ANOVA)
 * 94.219.218.221 (talk) 16:15, 24 September 2012 (UTC)

Confidence limits
Please introduce $$\bar{y}_{i\bullet}$$, $$\bar{y}_{j\bullet}$$ and $$q_{\alpha;r;N-r}$$ before use.

For a user without strong mathematical knowledge (like me), it is not evident why the notation for the means changes from $$Y$$ to $$\bar{y}$$, and the group indices from $$A$$ and $$B$$ to $$i\bullet$$ and $$j\bullet$$.

Further, the subscript of $$q$$ is confusing (because it is long), and an explanation would be helpful!

Thanks a lot! — Preceding unsigned comment added by 77.12.234.71 (talk) 12:41, 31 October 2012 (UTC)
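
 * For anyone else stuck on this notation: one common way of spelling out the interval (this is the standard equal-sample-size form, not necessarily the article's exact wording) assumes $$r$$ groups of common sample size $$n$$, $$N = rn$$ total observations, and the error mean square $$MSE$$ from the ANOVA:
 * $$\bar{y}_{i\bullet} - \bar{y}_{j\bullet} \pm q_{\alpha;r;N-r}\sqrt{MSE/n}$$
 * Here $$\bar{y}_{i\bullet}$$ is the sample mean of group $$i$$ (the bullet marks averaging over the within-group observation index), and $$q_{\alpha;r;N-r}$$ is the upper-$$\alpha$$ quantile of the studentized range distribution for $$r$$ groups and $$N-r$$ error degrees of freedom — hence the long three-part subscript.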

Should the section "Order of comparisons" be removed or edited?
Here is the section - what do you think? (in the article it was removed, but maybe it should be edited and then added again?)

Order of comparisons
If there are a set of means (A, B, C, D), which can be ranked in the order A > B > C > D, not all possible comparisons need be tested using Tukey's test. To avoid redundancy, one starts by comparing the largest mean (A) with the smallest mean (D). If the q_s value for the comparison of means A and D is less than the q value from the distribution, the null hypothesis is not rejected, and the means are said to have no statistically significant difference between them. Since there is no difference between the two means that have the largest difference, comparing any two means that have a smaller difference is assured to yield the same conclusion (if sample sizes are identical). As a result, no other comparisons need to be made.

Overall, it is important when employing Tukey's test to always start by comparing the largest mean to the smallest mean, and then the largest mean with the next smallest, etc., until the largest mean has been compared to all other means (or until no difference is found). After this, compare the second largest mean with the smallest mean, and then the next smallest, and so on. Once again, if two means are found to have no statistically significant difference, do not compare any of the means between them.

Tal Galili (talk) 14:19, 20 October 2014 (UTC)
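
 * For what it's worth, the step-down ordering described in the quoted section can be sketched in a few lines of Python. This is only an illustration of the comparison order, not the article's method in full: the function name is made up, and `q_crit` and `se` are stand-ins for the studentized-range critical value and the standard error (sqrt of MSE/n), which in practice come from the ANOVA.

```python
def tukey_stepdown_order(means, q_crit, se):
    """Return the pairs actually tested, following the quoted order:
    largest vs. smallest mean first, then largest vs. next smallest,
    skipping any pair nested inside a pair already found non-significant."""
    labels = sorted(means, key=means.get, reverse=True)  # largest mean first
    idx = {g: k for k, g in enumerate(labels)}
    not_sig = []  # pairs (larger-mean group, smaller-mean group) found non-significant
    tested = []
    for i, a in enumerate(labels):
        for j in range(len(labels) - 1, i, -1):  # smallest partner first
            b = labels[j]
            # redundant if nested between a known non-significant pair
            if any(idx[x] <= i and idx[y] >= j for x, y in not_sig):
                continue
            tested.append((a, b))
            q_s = (means[a] - means[b]) / se  # studentized range statistic
            if q_s < q_crit:
                not_sig.append((a, b))
    return tested

# With a small spread relative to the critical value, only the extreme
# pair A vs. D is tested; every other pair is nested inside it:
pairs = tukey_stepdown_order({"A": 10.0, "B": 9.0, "C": 8.5, "D": 8.0},
                             q_crit=3.0, se=1.0)
# -> [('A', 'D')]
```

 * With a lower critical value (e.g. `q_crit=1.5` on the same data), A vs. D is significant, so the walk continues through (A, C), (A, B), and (B, D) before the remaining pairs become redundant.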

Additions to the article
I have now finished introducing major additions to the article (see the following diff: https://en.wikipedia.org/w/index.php?title=Tukey%27s_range_test&diff=630377644&oldid=606375816 ). These are based on the work done by students in the Tel-Aviv University course "multiple comparisons". You may see a relevant revision history here: https://en.wikipedia.org/w/index.php?title=User:Talgalili/sandbox/Tukey%27s_range_test&action=history

Tal Galili (talk) 14:28, 20 October 2014 (UTC)

"Honest significant difference" or "honestly significant difference"?
Question in the headline. Sigma^2 (talk) 15:21, 7 October 2015 (UTC)

Whether this is a post-hoc test
The following text is marked as "citation needed": "A common mistaken belief is that Tukey's HSD should only be used following a significant ANOVA. The ANOVA is not necessary because the Tukey test controls the Type I error rate on its own." I am not a statistician, but I googled "Tukey HSD without anova" and found this essay by David M. Lane: https://davidmlane.com/hyperstat/essays/tukey_test.html. David M. Lane does exist, although I found no link from his official home page to his private homepage with the article. I cannot judge the validity of the article, but it lists two sources. Could anyone confirm that this is legit?

2A01:C22:D5DE:6700:38A0:9849:C7D:A24C (talk) 12:30, 4 July 2023 (UTC)


 * In my experience, this is not a mainstream perception, e.g. Sheskin, D. J. (2011). Handbook of Parametric and Nonparametric Statistical Procedures (5th ed.). Boca Raton: Chapman & Hall / CRC page 895. Typically, an ANOVA is run as a single test to analyze an experiment where the outcomes are not known a priori. If significance is found, that only tells the researcher that at least one of the groups or combinations is significantly different, but it doesn't tell which one. The next logical step is to figure out which ones are different from which other ones.
 * Sheskin describes Tukey as one of a number of unplanned comparisons that are done once significance is detected via ANOVA, aka post-hoc.
 * As Lane says, you could decide to use Tukey a priori, but you would limit your options if you did. If ANOVA indicates a difference, perhaps you don't have to contrast every possible combination to answer the research question, and could instead perform targeted contrasts using Bonferroni to control alpha inflation and preserve power. Once you do Tukey though, you have just contrasted every possible combination, which may be far too conservative to answer the research question and could easily miss significant differences.
 * I would say Lane's article is not authoritative, is not representative of common practice and seriously underestimates the problem of committing to test all possible combinations, especially with multiple levels and factors. Sheskin's book, on the other hand, is the definitive resource of statistics procedures. I will "be bold" and delete that line. SteveOuellette (talk) 21:07, 3 November 2023 (UTC)