Generated regressor

In least squares estimation problems, one or more regressors specified in the model are sometimes not observable. One way to circumvent this issue is to estimate, or generate, those regressors from observable data. This generated regressor method is also applicable to unobserved instrumental variables. Under some regularity conditions, consistency and asymptotic normality of the least squares estimator are preserved, but the asymptotic variance generally has a different form.

Suppose the model of interest is the following:


 * $$y_{i}=g(x_{1i},x_{2i},\beta)+u_{i}$$

where g is a conditional mean function whose form is known up to a finite-dimensional parameter β. Here $$x_{2i}$$ is not observable, but we know that $$x_{2i}=h(w_{i},\gamma)$$ for some function h known up to the parameter $$\gamma$$, and a random sample $$\{y_{i},x_{1i},w_{i}\}_{i=1}^{n}$$ is available. Suppose we have a consistent estimator $$\hat\gamma$$ of $$\gamma$$ that uses the observations $$w_{i}$$. Then β can be estimated by (nonlinear) least squares using $$\hat{x}_{2i}=h(w_{i},\hat\gamma)$$ in place of $$x_{2i}$$. Examples of this setup include Anderson et al. (1976) and Barro (1977).
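The two-step procedure described above can be sketched in a small simulation. This is only an illustration under assumptions that are not part of the setup above: g and h are both taken to be linear, and $$\gamma$$ is estimated from a hypothetical auxiliary variable `m` that equals $$h(w_{i},\gamma)$$ plus noise. The function name `two_step_generated_regressor` and all numerical values are made up for the example.

```python
import numpy as np

def two_step_generated_regressor(y, x1, w, m):
    """Two-step least squares with a generated regressor (linear sketch).

    Assumption for illustration: h(w, gamma) = w @ gamma, and gamma is
    estimated by OLS of a hypothetical auxiliary variable m on w.
    """
    # Step 1: consistent estimate of gamma from the auxiliary regression.
    gamma_hat, *_ = np.linalg.lstsq(w, m, rcond=None)
    # Generate the unobserved regressor from observables.
    x2_hat = w @ gamma_hat
    # Step 2: least squares of y on (1, x1, x2_hat).
    X = np.column_stack([np.ones_like(y), x1, x2_hat])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return gamma_hat, beta_hat

# Simulated data; every value below is an arbitrary choice.
rng = np.random.default_rng(0)
n = 5000
w = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
gamma = np.array([0.5, 1.0, -1.0])
x2 = w @ gamma                           # unobserved regressor
m = x2 + rng.normal(size=n)              # observed noisy auxiliary variable
x1 = 0.3 * w[:, 1] + rng.normal(size=n)  # observed regressor
beta = np.array([1.0, 2.0, -0.5])        # intercept, coef on x1, coef on x2
y = beta[0] + beta[1] * x1 + beta[2] * x2 + rng.normal(size=n)

gamma_hat, beta_hat = two_step_generated_regressor(y, x1, w, m)
print("gamma_hat:", gamma_hat)
print("beta_hat: ", beta_hat)
```

The point estimates should land close to the values used in the simulation. The standard errors reported by an ordinary least squares routine in the second step would not, in general, account for the sampling error in $$\hat\gamma$$.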

This problem falls into the framework of two-step M-estimators, so consistency and asymptotic normality of the estimator can be verified using the general theory of two-step M-estimation. As in the general two-step M-estimator problem, the asymptotic variance of a generated regressor estimator usually differs from that of the estimator with all regressors observed. Yet in some special cases the asymptotic variances of the two estimators are identical. To give one such example, consider the setting in which the regression function is linear in parameters and the unobserved regressor is a scalar. Denoting the coefficient of the unobserved regressor by $$\delta$$, if $$\delta=0$$ and $$E[\nabla_{\gamma} h(W,\gamma) U]=0$$, then the asymptotic variance is the same whether or not the regressor is observed.
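A first-order expansion suggests why these two conditions are enough in the linear case. The following is a sketch rather than a full proof. It assumes that h is differentiable in $$\gamma$$ and that $$\sqrt{n}(\hat\gamma-\gamma)$$ is bounded in probability. The stacked row vectors $$x_{i}=(x_{1i},x_{2i})$$ and $$\hat{x}_{i}=(x_{1i},\hat{x}_{2i})$$, the parameter vector $$\theta=(\beta_{1}',\delta)'$$ and the unit vector $$e_{2}$$ selecting the position of $$x_{2i}$$ are notation introduced here for illustration. Writing the model in terms of the generated regressor gives

 * $$y_{i}=\hat{x}_{i}\theta+u_{i}-\delta(\hat{x}_{2i}-x_{2i}),\qquad \hat{x}_{2i}-x_{2i}\approx\nabla_{\gamma}h(w_{i},\gamma)(\hat\gamma-\gamma)$$

so that the scaled score of the second-step least squares problem is

 * $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{x}_{i}'\left[u_{i}-\delta(\hat{x}_{2i}-x_{2i})\right]=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}x_{i}'u_{i}+e_{2}\left[\frac{1}{n}\sum_{i=1}^{n}u_{i}\nabla_{\gamma}h(w_{i},\gamma)\right]\sqrt{n}(\hat\gamma-\gamma)-\delta\left[\frac{1}{n}\sum_{i=1}^{n}x_{i}'\nabla_{\gamma}h(w_{i},\gamma)\right]\sqrt{n}(\hat\gamma-\gamma)+o_{p}(1)$$

The second term is asymptotically negligible when $$E[\nabla_{\gamma} h(W,\gamma) U]=0$$, and the third term vanishes when $$\delta=0$$. What remains is the score of the regression with $$x_{2i}$$ observed, which is why the first-step estimation error does not affect the limiting distribution in this case.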

With minor modifications to the model, the above formulation also applies to instrumental variable (IV) estimation. Suppose the model of interest is linear in parameters, the error term is correlated with some of the regressors, and the model specifies instrumental variables that are not observable but have the representation $$z_{i}=h(w_{i},\gamma)$$. If a consistent estimator $$\hat\gamma$$ of $$\gamma$$ is available, the parameter of interest can be estimated by IV using $$\hat z_{i}= h(w_{i},\hat\gamma)$$ as instruments. As in the case above, consistency and asymptotic normality follow under mild conditions, and the asymptotic variance generally has a different form from that of the IV estimator with observed instruments. Yet there are cases in which the two estimators have the same asymptotic variance. One such case occurs if $$E[\nabla_{\gamma} h(W,\gamma)]=0$$.[4] In this special case, inference on the estimated parameter can be conducted with the usual IV standard error estimator.
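The mechanics of IV estimation with a generated instrument can be sketched in the same way. The example assumes a just-identified linear model with a single endogenous regressor, a linear h, and a hypothetical auxiliary variable `m` from which $$\gamma$$ is estimated. None of these choices comes from the text above, and the function name `iv_with_generated_instrument` is illustrative.

```python
import numpy as np

def iv_with_generated_instrument(y, x, w, m):
    """Just-identified IV using z_hat = w @ gamma_hat as the instrument.

    Assumption for illustration: h(w, gamma) = w @ gamma, and gamma is
    estimated by OLS of a hypothetical auxiliary variable m on w.
    """
    # Step 1: estimate gamma and generate the instrument.
    gamma_hat, *_ = np.linalg.lstsq(w, m, rcond=None)
    z_hat = w @ gamma_hat
    # Step 2: IV estimator (Z'X)^{-1} Z'y with an intercept.
    Z = np.column_stack([np.ones_like(y), z_hat])
    X = np.column_stack([np.ones_like(y), x])
    beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
    # OLS on the same regressors, for comparison only.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_iv, beta_ols

# Simulated data; every value below is an arbitrary choice.
rng = np.random.default_rng(1)
n = 20000
w = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
gamma = np.array([0.0, 1.0, 1.0])
z = w @ gamma                        # unobserved instrument
m = z + rng.normal(size=n)           # observed noisy auxiliary variable
u = rng.normal(size=n)               # structural error, independent of w
e = 0.8 * u + rng.normal(size=n)     # makes x correlated with u
x = 0.8 * z + e                      # endogenous regressor
beta = np.array([1.0, 2.0])          # intercept and slope
y = beta[0] + beta[1] * x + u

beta_iv, beta_ols = iv_with_generated_instrument(y, x, w, m)
print("IV: ", beta_iv)
print("OLS:", beta_ols)
```

In this design the OLS slope is biased upward because the regressor is correlated with the error, while the IV slope based on the generated instrument should be close to the value used in the simulation.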