Information matrix test

In econometrics, the information matrix test is used to determine whether a regression model is misspecified. The test was developed by Halbert White, who observed that in a correctly specified model and under standard regularity assumptions, the Fisher information matrix can be expressed in either of two ways: as the outer product of the gradient, or as a function of the Hessian matrix of the log-likelihood function.

Consider a linear model $$\mathbf{y} = \mathbf{X} \mathbf{\beta} + \mathbf{u}$$, where the errors $$\mathbf{u}$$ are assumed to be distributed $$\mathrm{N}(0, \sigma^2 \mathbf{I})$$. If the parameters $$\mathbf{\beta}$$ and $$\sigma^2$$ are stacked in the vector $$\mathbf{\theta}^{\mathsf{T}} = \begin{bmatrix} \mathbf{\beta}^{\mathsf{T}} & \sigma^2 \end{bmatrix}$$, the resulting log-likelihood function is


 * $$\ell (\mathbf{\theta}) = - \frac{n}{2} \log \sigma^2 - \frac{1}{2 \sigma^2} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)^{\mathsf{T}} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)$$
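As a concrete check, the log-likelihood above can be coded directly. This is a minimal sketch: the function name and the convention of stacking $$\mathbf{\beta}$$ before $$\sigma^2$$ are illustrative choices, and the additive constant $$-\tfrac{n}{2} \log 2\pi$$ is dropped, exactly as in the formula above.

```python
import numpy as np

def log_likelihood(theta, y, X):
    """Gaussian log-likelihood for y = X beta + u, u ~ N(0, sigma^2 I).

    theta stacks the slope coefficients followed by the error variance,
    mirroring theta^T = [beta^T, sigma^2]. The additive constant
    -(n/2) log(2*pi) is omitted, as in the formula above.
    """
    beta, sigma2 = theta[:-1], theta[-1]
    resid = y - X @ beta                    # y - X beta
    n = len(y)
    # -(n/2) log sigma^2 - (1 / 2 sigma^2) (y - X beta)^T (y - X beta)
    return -0.5 * n * np.log(sigma2) - (resid @ resid) / (2.0 * sigma2)
```

With $$\sigma^2 = 1$$ and zero residuals, both terms vanish and the function returns zero, which is a quick sanity check on the implementation.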

The information matrix can then be expressed as


 * $$\mathbf{I} (\mathbf{\theta}) = \operatorname{E} \left[ \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right) \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right]$$

that is, as the expected value of the outer product of the gradient, or score. Second, it can be written as the negative of the expected value of the Hessian matrix of the log-likelihood function


 * $$\mathbf{I} (\mathbf{\theta}) = - \operatorname{E} \left[ \frac{\partial^2 \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}}} \right]$$
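For the linear model above, a standard calculation (using $$\operatorname{E}[\mathbf{u}] = 0$$ and $$\operatorname{E}[\mathbf{u} \mathbf{u}^{\mathsf{T}}] = \sigma^2 \mathbf{I}$$) shows that both expressions reduce to the same block-diagonal matrix

 * $$\mathbf{I} (\mathbf{\theta}) = \begin{bmatrix} \frac{1}{\sigma^2} \mathbf{X}^{\mathsf{T}} \mathbf{X} & \mathbf{0} \\ \mathbf{0}^{\mathsf{T}} & \frac{n}{2 \sigma^4} \end{bmatrix}$$

which illustrates the equivalence in a case where both the score and the Hessian are available in closed form.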

If the model is correctly specified, both expressions are equal, an identity known as the information matrix equality; the sample analogues of the Hessian and of the outer product of the score should therefore approximately cancel when summed. Combining the two forms yields


 * $$\mathbf{\Delta}(\mathbf{\theta}) = \sum_{i=1}^n \left[ \frac{\partial^2 \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}} } + \frac{\partial \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} } \left( \frac{\partial \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right]$$

where $$\ell_i (\mathbf{\theta})$$ is the contribution of the $$i$$-th observation to the log-likelihood and $$\mathbf{\Delta} (\mathbf{\theta})$$ is an $$(r \times r)$$ random matrix, with $$r$$ the number of parameters. White showed that the elements of $$n^{-1/2} \mathbf{\Delta} ( \mathbf{\hat{\theta}} )$$, where $$\mathbf{\hat{\theta}}$$ is the maximum likelihood estimator, are asymptotically normally distributed with zero means when the model is correctly specified. In small samples, however, the test generally performs poorly.
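The construction of $$\mathbf{\Delta}(\mathbf{\hat{\theta}})$$ can be sketched numerically for the Gaussian linear model above. The simulation sizes, seed, and true coefficients below are illustrative choices, not part of White's presentation; the per-observation score and Hessian are the standard closed-form expressions for this model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)   # correctly specified DGP

# MLE: OLS slopes and sigma^2_hat = RSS / n
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - X @ beta_hat
s2 = u @ u / n

r = k + 1                        # parameters: k slopes plus sigma^2
Delta = np.zeros((r, r))
for i in range(n):
    xi, ui = X[i], u[i]
    # per-observation score d l_i / d theta for the Gaussian linear model
    score = np.append(xi * ui / s2, -0.5 / s2 + ui**2 / (2 * s2**2))
    # per-observation Hessian d^2 l_i / d theta d theta^T
    H = np.zeros((r, r))
    H[:k, :k] = -np.outer(xi, xi) / s2
    H[:k, k] = H[k, :k] = -xi * ui / s2**2
    H[k, k] = 0.5 / s2**2 - ui**2 / s2**3
    Delta += H + np.outer(score, score)

# Under correct specification the elements of Delta / n converge to zero.
print(np.abs(Delta / n).max())
```

Each summand is a symmetric $$r \times r$$ matrix, so $$\mathbf{\Delta}$$ is symmetric by construction; under the correctly specified data-generating process its scaled elements are close to zero, in line with White's asymptotic result.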