User:Makingstats/sandbox

From Wikipedia, the free encyclopedia

Covariance is a measure of how much two variables change together and how strong the relationship is between them.[1] Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression. ANCOVA evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CV). Therefore, when performing ANCOVA, we are adjusting the DV means to what they would be if all groups were equal on the CV.[2]


Uses of ANCOVA

Increase Power

ANCOVA can be used to increase statistical power[3] (the ability to find a significant difference between groups when one exists) by reducing the within-group error variance. In order to understand this, it is necessary to understand the test used to evaluate differences between groups, the F-test. The F-test is computed by dividing the explained variance between groups (e.g., gender difference) by the unexplained variance within the groups. Thus,

                F = \frac{MS_{between}}{MS_{within}}

If this value is larger than a critical value, we conclude that there is a significant difference between groups. Unexplained variance includes error variance (e.g., individual differences), as well as the influence of other factors. Therefore, the influence of CVs is grouped in the denominator. When we control for the effect of CVs on the DV, we remove it from the denominator, making F larger and thereby increasing the power to find a significant effect if one exists. The image below provides a graphical display of this:

Figure: Partitioning variance
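
The F ratio described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a one-way design with no covariate; the function name f_ratio and the scores are made up for the example and do not come from the cited sources.

```python
import numpy as np

def f_ratio(groups):
    """Return MS_between / MS_within for a list of 1-D samples."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k = len(groups)                # number of groups
    n_total = all_values.size      # total number of observations

    # Explained variation: group means around the grand mean
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Unexplained variation: scores around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three groups
scores = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
print(f_ratio(scores))
```

Anything that shrinks ms_within while leaving ms_between unchanged makes the ratio larger, which is how a covariate can increase power.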







Adjusting Preexisting Differences

Another use of ANCOVA is to adjust for preexisting differences in nonequivalent (intact) groups. This controversial application aims at correcting for initial group differences (prior to group assignment) that exist on the DV among several intact groups. In this situation, participants cannot be made equal through random assignment, so CVs are used to adjust scores and make participants more similar than they would be without the CV. However, even with the use of covariates, there are no statistical techniques that can equate unequal groups. Furthermore, the CV may be so intimately related to the IV that removing the variance on the DV associated with the CV would remove considerable variance on the DV, rendering the results meaningless.[4]


Assumptions of ANCOVA

There are four assumptions that underlie the use of ANCOVA and affect interpretation of the results:

Assumption 1: Randomness and Independent Sampling

Observations must be randomly sampled from the population and independent from each other. If this assumption is violated, the test will produce inaccurate results.

Assumption 2: Normality

There must be a normal distribution of the DV in the population. If the distribution is nonnormal (e.g., skewed or kurtotic) and sample sizes are small, p-values may be invalid.[5]
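
One way to screen for departures from normality is sketched below. The Shapiro-Wilk test is our choice for the illustration and is not prescribed by the sources cited here; the group names and scores are hypothetical, and the check is run on the DV within each group.

```python
from scipy import stats

# Hypothetical DV scores for two levels of the IV
dv_by_group = {
    "control": [12.1, 13.4, 11.8, 12.9, 13.0, 12.4, 11.5, 13.7],
    "treatment": [14.2, 15.1, 13.8, 14.9, 15.6, 14.0, 13.5, 15.3],
}

normality = {}
for name, values in dv_by_group.items():
    # shapiro returns the W statistic and a p-value;
    # a small p-value suggests the sample is not normally distributed
    statistic, p_value = stats.shapiro(values)
    normality[name] = (statistic, p_value)
    print(f"{name}: W = {statistic:.3f}, p = {p_value:.3f}")
```

With small samples such a test has little power, so a nonsignificant result should not be read as proof of normality.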

Assumption 3: Homogeneity of Variances

The variances of the DV must be equal for all levels of the IV and the CV.

Assumption 4: Homogeneity of Regression Slopes

The slope of the line predicting the DV from the CV must be equal for each level of the IV. That is, the CV must not have differential effects on the DV at different levels of the IV. This assumption is violated when there is a significant interaction between the IV and the CV.
If this assumption is violated, ANCOVA should not be performed.[6] If the correlations of the covariates with the DV are very different in different cells of the design, gross misinterpretations of results may occur. In ANCOVA, we basically perform a regression analysis within each cell to partition out the variance component due to the CV. The homogeneity of slopes assumption implies that we perform this regression analysis subject to the constraint that all regression equations (slopes) across the cells of the design are the same. If this is not the case, serious biases may occur.
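
A quick descriptive check of this assumption is to fit a separate regression line in each cell and compare the slopes. The sketch below assumes one CV and two groups; the group labels and data are invented for the illustration, and this comparison is informal rather than a significance test.

```python
import numpy as np

# Hypothetical covariate and DV values for two levels of the IV
cv = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 5]}
dv = {"a": [2.1, 3.9, 6.2, 7.8, 10.1], "b": [5.0, 7.1, 8.9, 11.2, 12.8]}

# np.polyfit with degree 1 returns [slope, intercept]
slopes = {g: np.polyfit(cv[g], dv[g], 1)[0] for g in cv}

for g, slope in slopes.items():
    print(f"group {g}: slope of DV on CV = {slope:.2f}")
```

Here the two slopes are nearly identical, which is the pattern the assumption requires; clearly different slopes would point to a CV by IV interaction.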


Conducting an ANCOVA

Test Multicollinearity

If a CV is highly related to another CV (at a correlation of .5 or more), then it will not adjust the DV over and above the other CV. One or the other should be removed since they are statistically redundant.
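
The redundancy check can be sketched as follows, assuming two covariates and using the .5 cutoff stated above. The variable names and values are hypothetical.

```python
import numpy as np

# Hypothetical scores on two candidate covariates
cv1 = [1, 2, 3, 4, 5, 6]
cv2 = [2, 1, 4, 3, 6, 5]

# Pearson correlation between the two covariates
r = np.corrcoef(cv1, cv2)[0, 1]

# Flag the pair as redundant at the .5 cutoff used in the text
redundant = abs(r) >= 0.5
print(f"r = {r:.3f}, redundant = {redundant}")
```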

Test the Homogeneity of Variance Assumption

This assumption is tested with Levene's test of equality of error variances. It matters most after adjustments have been made, but if the assumption holds before adjustment, it is likely to hold afterwards.
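
A minimal sketch of Levene's test using SciPy is given below. The data are hypothetical, the test is applied to unadjusted DV scores, and center="mean" is chosen to match the classic form of the test (SciPy's default centers on the median instead).

```python
from scipy import stats

# Hypothetical DV scores at three levels of the IV
dv_by_group = {
    "low": [10.2, 11.5, 9.8, 10.9, 11.1, 10.4],
    "medium": [12.0, 13.6, 11.2, 12.9, 13.1, 12.3],
    "high": [14.1, 15.0, 13.2, 14.8, 15.4, 13.9],
}

# A significant result indicates that the group variances differ
levene_stat, levene_p = stats.levene(*dv_by_group.values(), center="mean")
equal_variances = levene_p > 0.05

print(f"W = {levene_stat:.3f}, p = {levene_p:.3f}")
```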

Test the Homogeneity of Regression Slopes Assumption

To see if the CV significantly interacts with the IV, run an ANCOVA model including both the IV and the CVxIV interaction term. If the CVxIV interaction is significant, ANCOVA should not be performed. Instead, Green & Salkind[5] suggest assessing group differences on the DV at particular levels of the CV. Also consider using a moderated regression analysis, treating the CV and its interaction as another IV. Alternatively, one could use mediation analyses to determine if the CV accounts for the IV’s effect on the DV.
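
The interaction test can be sketched as a comparison of two nested linear models, one with and one without the CVxIV term. This is an illustration under stated assumptions: a two-level IV coded 0/1, a single CV, made-up data, and a hypothetical helper rss; statistical packages produce the same F through their own ANCOVA routines.

```python
import numpy as np
from scipy import stats

def rss(design, y):
    """Residual sum of squares from an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

# Hypothetical data: IV coded 0/1, one covariate, one DV
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
cv = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5], dtype=float)
dv = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 5.0, 7.1, 8.9, 11.2, 12.8])

ones = np.ones_like(cv)
reduced = np.column_stack([ones, group, cv])            # IV + CV
full = np.column_stack([ones, group, cv, group * cv])   # adds CVxIV

rss_reduced = rss(reduced, dv)
rss_full = rss(full, dv)

# F test for the extra interaction column
df_num = full.shape[1] - reduced.shape[1]
df_den = dv.size - full.shape[1]
f_interaction = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
p_interaction = stats.f.sf(f_interaction, df_num, df_den)

print(f"F({df_num}, {df_den}) = {f_interaction:.3f}, p = {p_interaction:.3f}")
```

A nonsignificant p_interaction is consistent with homogeneous slopes, in which case the interaction term is dropped before the main analysis.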

Run ANCOVA Analysis

If the CVxIV interaction is not significant, rerun the ANCOVA without the CVxIV interaction term. This analysis uses the adjusted means and the adjusted MS_error. The adjusted means are the group means after controlling for the influence of the CV on the DV.
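
The adjustment can be sketched for the single-covariate case. The code below assumes the standard formula, adjusted mean = group DV mean minus b_w times (group CV mean minus grand CV mean), where b_w is the pooled within-group slope; the data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: group "b" starts out higher on the covariate
cv = {"a": np.array([1.0, 2, 3, 4, 5]), "b": np.array([3.0, 4, 5, 6, 7])}
dv = {"a": np.array([2.1, 3.9, 6.0, 7.9, 10.1]),
      "b": np.array([6.9, 9.1, 11.0, 13.1, 14.9])}

# Pooled within-group regression slope of the DV on the CV
sxy = sum(((cv[g] - cv[g].mean()) * (dv[g] - dv[g].mean())).sum() for g in cv)
sxx = sum(((cv[g] - cv[g].mean()) ** 2).sum() for g in cv)
b_within = sxy / sxx

# Shift each group mean to where it would be at the grand CV mean
grand_cv_mean = np.concatenate(list(cv.values())).mean()
adjusted_means = {
    g: dv[g].mean() - b_within * (cv[g].mean() - grand_cv_mean) for g in cv
}

# Adjusted error: residuals around the common-slope lines,
# with one extra degree of freedom spent on the covariate
ss_error = sum(
    (((dv[g] - dv[g].mean()) - b_within * (cv[g] - cv[g].mean())) ** 2).sum()
    for g in cv
)
n_total = sum(v.size for v in dv.values())
ms_error = ss_error / (n_total - len(cv) - 1)

print(adjusted_means, ms_error)
```

In this example the raw group means differ by 5 points, but the adjusted means differ by only 1, because most of the raw difference reflects the groups' different standing on the CV.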

Follow-up Analyses

If there was a significant main effect, it means that there is a significant difference between the levels of one IV, ignoring all other factors.[1] To find exactly which levels are significantly different from one another, one can use the same follow-up tests as for the ANOVA. If there are two or more IVs, there may be a significant interaction, which means that the effect of one IV on the DV changes depending on the level of another factor. One can investigate the simple main effects using the same methods as in a factorial ANOVA.


References

  1. ^ a b Howell, D. C. (2009). Statistical methods for psychology (7th ed.). Belmont: Cengage Wadsworth.
  2. ^ Keppel, G. (1991). Design and analysis: A researcher's handbook (3rd ed.). Englewood Cliffs: Prentice-Hall, Inc.
  3. ^ Tabachnick, B. G., & Fidell, L. S. (2007). Using Multivariate Statistics (5th ed.). Boston: Pearson Education, Inc.
  4. ^ Miller, G. A., & Chapman, J. P. (2001). Misunderstanding Analysis of Covariance. Journal of Abnormal Psychology, 110(1), 40–48.
  5. ^ a b Green, S. B., & Salkind, N. J. (2011). Using SPSS for Windows and Macintosh: Analyzing and Understanding Data (6th ed.). Upper Saddle River, NJ: Prentice Hall.
  6. ^ Engqvist, L. (2005). The mistreatment of covariate interaction terms in linear model analyses of behavioural and evolutionary ecology studies. Animal Behaviour, 70, 967–971.