Park test

In econometrics, the Park test is a test for heteroscedasticity. The test is based on the method proposed by Rolla Edward Park for estimating linear regression parameters in the presence of heteroscedastic error terms.

Background
In regression analysis, heteroscedasticity refers to unequal variances of the random error terms $$\epsilon_i$$, such that


 * $$\operatorname{Var}(\epsilon_i)=E(\epsilon_i^2)-E(\epsilon_i)^2=E(\epsilon_i^2)=\sigma_i^2$$.

It is assumed that $$\operatorname{E}(\epsilon_i)=0$$. The above variance varies with $$i$$, that is, with the $$i^{th}$$ trial in an experiment or the $$i^{th}$$ case or observation in a dataset. Equivalently, heteroscedasticity refers to unequal conditional variances in the response variables $$Y_i$$, such that


 * $$\operatorname{Var}(Y_i|X_i)=\sigma_i^2$$,

again a value that depends on $$i$$ – or, more specifically, a value that is conditional on the values of one or more of the regressors $$X$$. Homoscedasticity, one of the basic Gauss–Markov assumptions of ordinary least squares linear regression modeling, refers to equal variance in the random error terms regardless of the trial or observation, such that


 * $$\operatorname{Var}(\epsilon_i)=\sigma^2$$, a constant.

Test description
Park, noting a standard recommendation to assume that the error term variance is proportional to the square of the regressor, suggested instead that analysts 'assume a structure for the variance of the error term' and proposed one such structure:


 * $$\operatorname{ln}(\sigma_{\epsilon i}^2)=\operatorname{ln}(\sigma^2)+\gamma\operatorname{ln}(X_i)+v_i$$

in which the error terms $$v_i$$ are considered well behaved.

This relationship is used as the basis for this test.
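
As an illustration, Park's structure can be read as $$\ln(\sigma_{\epsilon i}^2)=\ln(\sigma^2)+\gamma\ln(X_i)+v_i$$, which is equivalent to the multiplicative form $$\sigma_{\epsilon i}^2=\sigma^2X_i^{\gamma}e^{v_i}$$. The following minimal Python sketch evaluates that form for a few regressor values. The function name `park_variance` and the numerical values are hypothetical and chosen only for illustration; the sketch assumes $$X_i>0$$ and $$\sigma^2>0$$ so that the logarithms exist.

```python
import math


def park_variance(x, sigma2, gamma, v=0.0):
    """Error variance implied by ln(var) = ln(sigma2) + gamma*ln(x) + v.

    Hypothetical helper for illustration; requires x > 0 and sigma2 > 0.
    """
    return math.exp(math.log(sigma2) + gamma * math.log(x) + v)


# gamma = 0: the variance does not depend on x (homoscedasticity).
flat = [park_variance(x, sigma2=2.0, gamma=0.0) for x in (1.0, 3.0, 9.0)]

# gamma = 2: the variance is proportional to the square of x.
squared = [park_variance(x, sigma2=2.0, gamma=2.0) for x in (1.0, 3.0, 9.0)]

print(flat)
print(squared)
```

Under this reading, $$\gamma=0$$ corresponds to homoscedasticity, and $$\gamma=2$$ recovers the case in which the variance is proportional to the square of the regressor.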

The modeler first runs the unadjusted regression


 * $$Y_i=\beta_0+\beta_1X_{i1}+...+\beta_{p-1}X_{i,p-1}+\epsilon_i$$

which contains p − 1 regressors, and then squares each of the residuals ($$\hat{\epsilon}_i$$), which serve as estimators of the $$\epsilon_i$$, and takes the natural logarithm of the result. The squared residuals $$\hat{\epsilon}_i^2$$ in turn estimate $$\sigma_{\epsilon i}^2$$.

If, in a regression of $$\ln(\hat{\epsilon}_i^2)$$ on the natural logarithm of one or more of the regressors $$X_i$$, one or more of the estimated coefficients $$\hat\gamma_i$$ is statistically significantly different from zero, this reveals a connection between the residuals and the regressors. We then reject the null hypothesis of homoscedasticity and conclude that heteroscedasticity is present.
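
The procedure can be sketched in a few lines of Python for the simplest case of a single, strictly positive regressor. This is a minimal illustration rather than a reference implementation: the helper names `ols_simple` and `park_test`, the simulated data, and the coefficient values are all assumptions made for the example. The returned t-statistic for $$\hat\gamma$$ would ordinarily be compared with a t distribution on n − 2 degrees of freedom, which relies on the $$v_i$$ being well behaved.

```python
import math
import random


def ols_simple(x, y):
    """Least-squares fit of y = a + b*x.

    Returns (a, b, standard error of b, residuals).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    return a, b, math.sqrt(s2 / sxx), resid


def park_test(x, y):
    """Return (gamma_hat, t_statistic) for one positive regressor x."""
    # Step 1: run the unadjusted regression and keep its residuals.
    _, _, _, resid = ols_simple(x, y)
    # Step 2: regress ln(residual^2) on ln(x).
    log_x = [math.log(xi) for xi in x]
    log_e2 = [math.log(e * e) for e in resid]
    _, gamma_hat, se_gamma, _ = ols_simple(log_x, log_e2)
    return gamma_hat, gamma_hat / se_gamma


# Simulated data (illustrative values only).
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]

# Error standard deviation proportional to x, so the variance grows
# with x squared (gamma = 2 in Park's structure).
y_het = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

# Constant error variance (gamma = 0).
y_hom = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

gamma_het, t_het = park_test(x, y_het)
gamma_hom, t_hom = park_test(x, y_hom)
print("heteroscedastic data:", gamma_het, t_het)
print("homoscedastic data:  ", gamma_hom, t_hom)
```

For the heteroscedastic series the estimate of $$\gamma$$ should lie near 2 with a large t-statistic, whereas for the homoscedastic series it should lie near 0. With several regressors the second-stage regression would include the logarithm of each regressor of interest, which calls for a multiple-regression routine in place of `ols_simple`.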