Tukey's test of additivity

In statistics, Tukey's test of additivity, named for John Tukey, is an approach used in two-way ANOVA (regression analysis involving two qualitative factors) to assess whether the factor variables (categorical variables) are additively related to the expected value of the response variable. It can be applied when there are no replicated values in the data set, a situation in which it is impossible to directly estimate a fully general non-additive regression structure and still have information left to estimate the error variance. The test statistic proposed by Tukey has one degree of freedom under the null hypothesis, hence this is often called "Tukey's one-degree-of-freedom test."

Introduction
The most common setting for Tukey's test of additivity is a two-way factorial analysis of variance (ANOVA) with one observation per cell. The response variable Yij is observed in a table of cells with the rows indexed by i = 1,..., m and the columns indexed by j = 1,..., n. The rows and columns typically correspond to various types and levels of treatment that are applied in combination.

The additive model states that the expected response can be expressed as E Yij = μ + αi + βj, where the αi and βj are unknown constants. The unknown model parameters are usually estimated as



$$ \widehat{\mu} = \bar{Y}_{\cdot\cdot} $$



$$ \widehat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} $$



$$ \widehat{\beta}_j = \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot} $$

where Ȳi• is the mean of the ith row of the data table, Ȳ•j is the mean of the jth column of the data table, and Ȳ•• is the overall mean of the data table.
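As a minimal sketch of these estimators, the following Python snippet computes the overall mean and the row and column effects for a small table. The function name `additive_estimates` and the 3 × 4 data table are made up for illustration; they are not part of Tukey's presentation.

```python
from statistics import fmean


def additive_estimates(table):
    """Return (mu_hat, alpha_hat, beta_hat) for an m x n table
    with exactly one observation per cell."""
    m, n = len(table), len(table[0])
    # Overall mean of all m*n observations.
    grand_mean = fmean(y for row in table for y in row)
    # Row effects: row mean minus overall mean.
    alpha_hat = [fmean(row) - grand_mean for row in table]
    # Column effects: column mean minus overall mean.
    beta_hat = [
        fmean(table[i][j] for i in range(m)) - grand_mean for j in range(n)
    ]
    return grand_mean, alpha_hat, beta_hat


# Hypothetical data, chosen only so the arithmetic is easy to follow.
table = [
    [1.0, 2.0, 3.0, 6.0],
    [2.0, 4.0, 5.0, 9.0],
    [3.0, 3.0, 7.0, 15.0],
]
mu_hat, alpha_hat, beta_hat = additive_estimates(table)
print(mu_hat, alpha_hat, beta_hat)
```

By construction the estimated row effects sum to zero, as do the estimated column effects.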

The additive model can be generalized to allow for arbitrary interaction effects by setting E Yij = μ + αi + βj + γij. However, after fitting the natural estimator of γij,



$$ \widehat{\gamma}_{ij} = Y_{ij} - (\widehat{\mu} + \widehat{\alpha}_i + \widehat{\beta}_j), $$

the fitted values



$$ \widehat{Y}_{ij} = \widehat{\mu} + \widehat{\alpha}_i + \widehat{\beta}_j + \widehat{\gamma}_{ij} \equiv Y_{ij} $$

fit the data exactly. Thus there are no remaining degrees of freedom to estimate the variance σ², and no hypothesis tests about the γij can be performed.
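The exact fit can be checked numerically. The sketch below uses a made-up 3 × 4 table (an assumption for illustration only) and shows that the residuals of the fully general interaction model are all zero, and that the number of free parameters equals the number of observations.

```python
from statistics import fmean

# Hypothetical data with one observation per cell.
table = [
    [1.0, 2.0, 3.0, 6.0],
    [2.0, 4.0, 5.0, 9.0],
    [3.0, 3.0, 7.0, 15.0],
]
m, n = len(table), len(table[0])

mu = fmean(y for row in table for y in row)
alpha = [fmean(row) - mu for row in table]
beta = [fmean(table[i][j] for i in range(m)) - mu for j in range(n)]

# Natural estimator of the interaction: whatever the additive fit misses.
gamma = [
    [table[i][j] - (mu + alpha[i] + beta[j]) for j in range(n)]
    for i in range(m)
]

# Fitted values of the saturated model reproduce the data.
fitted = [
    [mu + alpha[i] + beta[j] + gamma[i][j] for j in range(n)]
    for i in range(m)
]
max_residual = max(
    abs(table[i][j] - fitted[i][j]) for i in range(m) for j in range(n)
)

# Free parameters: 1 (mean) + (m-1) + (n-1) + (m-1)(n-1) = m*n.
free_params = 1 + (m - 1) + (n - 1) + (m - 1) * (n - 1)
error_df = m * n - free_params

print(max_residual, error_df)
```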

Tukey therefore proposed a more constrained interaction model of the form



$$ \operatorname{E} Y_{ij} = \mu + \alpha_i + \beta_j + \lambda\alpha_i\beta_j $$

By testing the null hypothesis that λ = 0, we are able to detect some departures from additivity based only on the single parameter λ.
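To illustrate the shape of this constrained model, the following sketch builds the table of expected values for given parameters. The helper `tukey_cell_means` and all parameter values are hypothetical. With λ = 0 the table is purely additive; with λ ≠ 0 the departure from additivity in cell (i, j) is λαiβj, so the whole m × n interaction pattern is governed by one number.

```python
def tukey_cell_means(mu, alpha, beta, lam):
    """Expected values mu + alpha_i + beta_j + lam * alpha_i * beta_j."""
    return [[mu + a + b + lam * a * b for b in beta] for a in alpha]


# Made-up parameter values; the effects are chosen to sum to zero.
mu = 5.0
alpha = [-2.0, 0.0, 2.0]
beta = [-3.0, -2.0, 0.0, 5.0]

additive = tukey_cell_means(mu, alpha, beta, lam=0.0)
curved = tukey_cell_means(mu, alpha, beta, lam=0.5)

# Cell-by-cell departure from additivity: lam * alpha_i * beta_j.
interaction = [
    [c - a for c, a in zip(curved_row, additive_row)]
    for curved_row, additive_row in zip(curved, additive)
]
for row in interaction:
    print(row)
```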

Method
To carry out Tukey's test, set



$$ SS_A \equiv n \sum_{i} (\bar{Y}_{i \cdot}-\bar{Y}_{\cdot\cdot})^2 $$



$$ SS_B \equiv m \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 $$



$$ SS_{AB} \equiv \frac{\left(\sum_{ij} Y_{ij}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})\right)^2}{\sum_{i} (\bar{Y}_{i \cdot}-\bar{Y}_{\cdot\cdot})^2 \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2} $$



$$ SS_T \equiv \sum_{ij} (Y_{i j} - \bar{Y}_{\cdot\cdot})^2 $$



$$ SS_E \equiv SS_T - SS_A - SS_B - SS_{AB} $$

Then use the following test statistic



$$ \frac{SS_{AB}/1}{MS_E}. $$

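Under the null hypothesis λ = 0, the test statistic has an F distribution with 1 and q degrees of freedom, where q = mn − (m + n) is the degrees of freedom for estimating the error variance and MS_E = SS_E / q is the corresponding mean square error. Large values of the statistic are evidence against additivity.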
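The whole procedure can be sketched in a few lines of Python. The function name `tukey_additivity`, the returned dictionary keys, and the example table are assumptions made for illustration. The sketch stops at the F statistic and its degrees of freedom; a p-value would be the upper tail of the F(1, q) distribution, which could be obtained from a statistics library (for example `scipy.stats.f.sf`, assuming SciPy is available).

```python
from statistics import fmean


def tukey_additivity(table):
    """Tukey's one-degree-of-freedom test for an m x n table
    with one observation per cell. Returns the sums of squares,
    the error degrees of freedom q, and the F statistic."""
    m, n = len(table), len(table[0])
    grand_mean = fmean(y for row in table for y in row)
    # Deviations of row and column means from the overall mean.
    a = [fmean(row) - grand_mean for row in table]
    b = [fmean(table[i][j] for i in range(m)) - grand_mean for j in range(n)]

    sum_a2 = sum(x * x for x in a)
    sum_b2 = sum(x * x for x in b)

    ss_a = n * sum_a2
    ss_b = m * sum_b2
    ss_t = sum((y - grand_mean) ** 2 for row in table for y in row)

    # Numerator of SS_AB: squared sum of Y_ij * a_i * b_j over all cells.
    cross = sum(
        table[i][j] * a[i] * b[j] for i in range(m) for j in range(n)
    )
    ss_ab = cross ** 2 / (sum_a2 * sum_b2)

    ss_e = ss_t - ss_a - ss_b - ss_ab
    q = m * n - (m + n)           # error degrees of freedom
    ms_e = ss_e / q
    f_stat = (ss_ab / 1) / ms_e   # compare with F(1, q)
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
        "ss_t": ss_t, "ss_e": ss_e, "q": q, "f": f_stat,
    }


# Hypothetical data, for illustration only.
table = [
    [1.0, 2.0, 3.0, 6.0],
    [2.0, 4.0, 5.0, 9.0],
    [3.0, 3.0, 7.0, 15.0],
]
result = tukey_additivity(table)
print(result)
```

The sketch assumes that the row means are not all equal and the column means are not all equal (otherwise the denominator of SS_AB is zero), and that q is positive, which requires mn > m + n.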