Principle of marginality

In statistics, the principle of marginality is the fact that the average (or main) effects of variables in an analysis are marginal to their interaction effect: the main effect of one explanatory variable captures the effect of that variable averaged over all values of a second explanatory variable whose value influences the first variable's effect. The principle implies that, in general, it is wrong to test, estimate, or interpret main effects of explanatory variables when the variables interact or, similarly, to model interaction effects while deleting the main effects that are marginal to them. While such models are interpretable, they lack applicability, because they ignore the dependence of one variable's effect on another variable's value.

Nelder and Venables have argued strongly for the importance of this principle in regression analysis.

Regression form
If two independent continuous variables, say x and z, both influence a dependent variable y, and if the extent of the effect of each independent variable depends on the level of the other independent variable, then the regression equation can be written as:


 * $$y_i=a+bx_i+cz_i+d(x_iz_i)+ e_i,$$

where i indexes observations, a is the intercept term, b, c, and d are effect size parameters to be estimated, and $$e_i$$ is the error term.

If this is the correct model, then the omission of any of the right-side terms would be incorrect, resulting in misleading interpretation of the regression results.

With this model, the effect of x upon y is given by the partial derivative of y with respect to x; this is $$b+dz_i$$, which depends on the specific value $$z_i$$ at which the partial derivative is evaluated. The main effect of x, that is, the effect averaged over all values of z, is therefore meaningless, because it depends on the design of the experiment (specifically, on the relative frequencies of the various values of z) and not just on the underlying relationships. Hence:


 * In the case of interaction, it is wrong to try to test, estimate, or interpret a "main effect" coefficient b or c, omitting the interaction term.
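
The dependence of the "main effect" on the design can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the parameter values, the grids of x and z levels, and the helper `additive_x_coefficient` are all hypothetical and chosen only for illustration. It generates noise-free responses from the full model, fits a model without the interaction term, and compares the fitted coefficient on x across two designs that differ only in the values of z sampled.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
a, b, c, d = 1.0, 2.0, -1.0, 0.5

def additive_x_coefficient(x_levels, z_levels):
    """Fit y ~ 1 + x + z (interaction omitted) on a full factorial grid
    and return the fitted coefficient on x."""
    x, z = np.meshgrid(np.asarray(x_levels, dtype=float),
                       np.asarray(z_levels, dtype=float))
    x, z = x.ravel(), z.ravel()
    # Noise-free response generated from the full model with interaction.
    y = a + b * x + c * z + d * x * z
    design = np.column_stack([np.ones_like(x), x, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

x_levels = [0, 1, 2, 3]
coef_low_z = additive_x_coefficient(x_levels, [0, 1, 2])   # mean of z is 1
coef_high_z = additive_x_coefficient(x_levels, [4, 5, 6])  # mean of z is 5

# Compare each fitted coefficient with b + d * mean(z) for that design.
print(coef_low_z, b + d * 1.0)
print(coef_high_z, b + d * 5.0)
```

On these balanced grids, the fitted coefficient on x equals b + d times the mean of z (up to floating-point error), so the two designs give different "main effects" (about 2.5 and 4.5) even though the underlying relationship is the same.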

In addition:


 * In the case of interaction, it is wrong to omit the main-effect terms with coefficients b or c, because doing so gives incorrect estimates of the interaction.
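
The effect of dropping the main-effect terms can be sketched in the same way. The example below is an illustration under assumed values, not a general proof: the parameters, the grid of x and z levels, and the helper `fit` are hypothetical. It fits the full model and a model containing only an intercept and the product term to the same noise-free data, and compares the two estimates of d.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
a, b, c, d = 1.0, 2.0, -1.0, 0.5

# Full factorial grid: x in {0, 1, 2, 3}, z in {0, 1, 2}.
x, z = np.meshgrid(np.arange(4.0), np.arange(3.0))
x, z = x.ravel(), z.ravel()

# Noise-free response generated from the full model with interaction.
y = a + b * x + c * z + d * x * z

def fit(columns):
    """Ordinary least squares fit of y on the given design columns."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

ones = np.ones_like(x)
d_full = fit([ones, x, z, x * z])[-1]   # main effects included
d_reduced = fit([ones, x * z])[-1]      # main effects omitted

print("true d:", d)
print("estimate with main effects:", d_full)
print("estimate without main effects:", d_reduced)
```

With the main effects included, the estimate recovers d. Without them, the product term absorbs part of the contribution of x and z, and the estimate moves away from d even though the data contain no noise.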