Newey–West estimator

A Newey–West estimator is used in statistics and econometrics to provide an estimate of the covariance matrix of the parameters of a regression-type model where the standard assumptions of regression analysis do not apply. It was devised by Whitney K. Newey and Kenneth D. West in 1987, although there are a number of later variants. The estimator is used to try to overcome autocorrelation (also called serial correlation) and heteroskedasticity in the error terms of the model, often for regressions applied to time series data. The abbreviation "HAC," sometimes used for the estimator, stands for "heteroskedasticity and autocorrelation consistent." There are a number of HAC estimators described in the literature, and "HAC estimator" does not refer uniquely to Newey–West. One version of the estimator, Newey–West Bartlett, requires the user to specify a bandwidth and uses the Bartlett kernel, familiar from kernel density estimation.

Regression models estimated with time series data often exhibit autocorrelation; that is, the error terms are correlated over time. The heteroskedasticity- and autocorrelation-consistent estimator of the error covariance is constructed from a term $$X^{\operatorname{T}}\Sigma X$$, where $$X$$ is the design matrix for the regression problem and $$\Sigma$$ is the covariance matrix of the residuals. The least squares estimator $$b$$ is a consistent estimator of $$\beta$$, which implies that the least squares residuals $$e_i$$ are "point-wise" consistent estimators of their population counterparts $$E_i$$. The general approach, then, is to use $$X$$ and $$e$$ to devise an estimator of $$X^{\operatorname{T}}\Sigma X$$. The estimator assumes that as the time between two error terms increases, the correlation between them decreases. It can thus be used to improve inference for ordinary least squares (OLS) estimates when the residuals are heteroskedastic and/or autocorrelated.


 * $$ X^{\operatorname{T}}\Sigma X=\frac{1}{T} \sum^T_{t=1} e_t^2 x_t x^{\operatorname{T}}_t + \frac{1}{T} \sum^L_{\ell=1} \sum^T_{t=\ell+1} w_\ell e_t e_{t-\ell}(x_t x^{\operatorname{T}}_{t-\ell} + x_{t-\ell} x^{\operatorname{T}}_t) $$


 * $$w_\ell=1 - \frac\ell {L+1}$$

where $$T$$ is the sample size, $$e_t$$ is the $$t^\text{th}$$ residual, $$x_t$$ is the $$t^\text{th}$$ row of the design matrix, and $$ w_\ell $$ is the Bartlett kernel, which can be thought of as a weight that decreases with increasing separation between samples. Disturbances that are farther apart from each other are given lower weight, while those with equal subscripts are given a weight of 1. This ensures that the second term converges (in some appropriate sense) to a finite matrix. This weighting scheme also ensures that the resulting covariance matrix is positive semi-definite. Setting L = 0 reduces the Newey–West estimator to the Huber–White standard errors. L specifies the "maximum lag considered for the control of autocorrelation"; a common choice for L is $$T^{1/4}$$.
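The two formulas above can be sketched directly in NumPy. This is a minimal illustration, not a production implementation; the function name and the sandwich form $$(X^{\operatorname{T}}X)^{-1}\,(T\,S)\,(X^{\operatorname{T}}X)^{-1}$$ for the coefficient covariance are assumptions added for the example.

```python
import numpy as np

def newey_west_cov(X, e, L):
    """HAC covariance of OLS coefficients using the Bartlett weights above.

    X : (T, k) design matrix, e : (T,) OLS residuals, L : maximum lag.
    """
    T, k = X.shape
    # Lag-0 (heteroskedasticity-only) term: (1/T) * sum_t e_t^2 x_t x_t'
    S = (X * (e ** 2)[:, None]).T @ X / T
    # Bartlett-weighted cross terms for lags 1..L
    for ell in range(1, L + 1):
        w = 1.0 - ell / (L + 1)          # Bartlett kernel weight w_l
        Gamma = np.zeros((k, k))
        for t in range(ell, T):
            Gamma += e[t] * e[t - ell] * np.outer(X[t], X[t - ell])
        S += w * (Gamma + Gamma.T) / T   # both x_t x_{t-l}' and its transpose
    # Sandwich form: Var(b) = (X'X)^-1 (T * S) (X'X)^-1
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ (T * S) @ XtX_inv
```

With `L = 0` the loop is skipped and the result collapses to the Huber–White (HC0) sandwich estimator, as noted above.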

Software implementations
In Julia, the CovarianceMatrices.jl package supports several types of heteroskedasticity and autocorrelation consistent covariance matrix estimation including Newey–West, White, and Arellano.

In R, the packages sandwich and plm include functions for the Newey–West estimator.

In Stata, the command newey produces Newey–West standard errors for coefficients estimated by OLS regression.

In MATLAB, the command hac in the Econometrics Toolbox produces the Newey–West estimator (among others).

In Python, the statsmodels module includes functions for computing the covariance matrix using the Newey–West estimator.

In Gretl, the option --robust to several estimation commands (such as ols) in the context of a time-series dataset produces Newey–West standard errors.

In SAS, Newey–West corrected standard errors can be obtained in PROC AUTOREG and PROC MODEL.