Truncated normal hurdle model

In econometrics, the truncated normal hurdle model is a variant of the Tobit model and was first proposed by Cragg in 1971.

The standard Tobit model can be written as $$y=(x\beta+u) 1[x\beta+u>0]$$, where $$u\mid x\sim N(0,\sigma^2)$$. This construction implicitly imposes two assumptions on the partial effects:


 * 1) Since $$\partial P[y>0]/\partial x_j=\varphi (x\beta/\sigma) \beta_j/\sigma$$ and $$\partial \operatorname E[y\mid x,y >0]/\partial x_j=\beta_j\{1-\theta(x\beta/\sigma)\}$$, where $$\theta(z)=\lambda(z)[z+\lambda(z)]$$ lies strictly between 0 and 1 and $$\lambda(z)=\varphi(z)/\Phi(z)$$ is the inverse Mills ratio, the partial effects of $$x_j$$ on the probability $$P[y>0]$$ and on the conditional expectation $$\operatorname E [y\mid x,y >0]$$ have the same sign.
 * 2) The relative effects of $$x_h$$ and $$x_j$$ on $$P[y>0]$$ and $$\operatorname E [y\mid x,y>0]$$ are identical, i.e.:
 * $$\frac{\partial P[y>0]/\partial x_h}{\partial P[y>0]/ \partial x_j}=\frac{\partial \operatorname E[y\mid x,y>0]/ \partial x_h}{ \partial \operatorname E[y\mid x,y>0]/ \partial x_j} = \frac{\beta_h}{\beta_j}$$
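The two properties can be checked numerically. The following is a minimal sketch, assuming $$\theta(z)=\lambda(z)[z+\lambda(z)]$$ with $$\lambda(z)=\varphi(z)/\Phi(z)$$ (the inverse Mills ratio); the function name and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import norm


def tobit_partial_effects(x, beta, sigma):
    """Partial effects of each regressor in a standard Tobit model.

    Returns the effects on P[y > 0] and on E[y | x, y > 0] as two arrays.
    """
    z = x @ beta / sigma
    lam = norm.pdf(z) / norm.cdf(z)   # inverse Mills ratio
    theta = lam * (z + lam)           # assumed definition of theta, in (0, 1)
    d_prob = norm.pdf(z) * beta / sigma
    d_cond_mean = beta * (1.0 - theta)
    return d_prob, d_cond_mean


# Hypothetical values, for illustration only.
x = np.array([1.0, 0.5, -1.2])
beta = np.array([0.3, 1.0, -0.4])
sigma = 2.0

d_prob, d_cond_mean = tobit_partial_effects(x, beta, sigma)

# Relative effects of regressor 1 versus regressor 2 on each quantity.
ratio_prob = d_prob[1] / d_prob[2]
ratio_cond_mean = d_cond_mean[1] / d_cond_mean[2]
```

Both ratios equal `beta[1] / beta[2]`, and each pair of partial effects shares the sign of the corresponding coefficient, whatever values are chosen for `x` and `sigma`.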

However, these two implicit assumptions are too strong and are inconsistent with many contexts in economics. For instance, when a firm decides whether to invest in building a factory, the construction cost might be more influential than the product price; once the factory has been built, however, the product price is likely to matter more for revenue. Hence, implicit assumption (2) does not match this context. The essence of the issue is that the standard Tobit model implicitly imposes a very strong link between the participation decision ($$y=0$$ or $$y>0$$) and the amount decision (the magnitude of $$y$$ when $$y>0$$). If a corner solution model is represented in the general form $$y=s \cdot w$$, where $$s$$ is the participation decision and $$w$$ is the amount decision, the standard Tobit model assumes:


 * $$s=1[x\beta +u>0];$$


 * $$w=x\beta+u.$$

To make the model compatible with more contexts, a natural improvement is to assume:


 * $$s=1[x \gamma +u>0], \text{ where } u \sim N (0,1);$$

 * $$w=x\beta + e,$$ where, conditional on $$x$$, the error term $$e$$ follows a truncated normal distribution with density $$\varphi (e/\sigma) / [\sigma\,\Phi (x\beta/\sigma)]$$ on $$e>-x\beta$$, so that $$w>0$$;

 * $$s$$ and $$w$$ are independent conditional on $$x$$.
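The data-generating process described by these assumptions can be sketched as a small simulation. This is an illustration only: the function name `simulate_hurdle` and all parameter values are hypothetical, and the amount variable is drawn with `scipy.stats.truncnorm`, which takes its truncation bounds in standardized units.

```python
import numpy as np
from scipy.stats import norm, truncnorm


def simulate_hurdle(X, gamma, beta, sigma, rng):
    """Draw y = s * w from the truncated normal hurdle model."""
    n = X.shape[0]
    # Participation decision: s = 1[x*gamma + u > 0] with u ~ N(0, 1).
    s = (X @ gamma + rng.standard_normal(n) > 0).astype(float)
    # Amount decision: w = x*beta + e, truncated from below so that w > 0.
    mean = X @ beta
    lower = (0.0 - mean) / sigma  # standardized lower bound for truncnorm
    w = truncnorm.rvs(lower, np.inf, loc=mean, scale=sigma,
                      size=n, random_state=rng)
    # s and w are drawn independently given X, as the model assumes.
    return s * w


# Hypothetical design and parameters, for illustration only.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
gamma = np.array([0.2, 0.8])
beta = np.array([1.0, 0.5])
sigma = 1.0

y = simulate_hurdle(X, gamma, beta, sigma, rng)
share_positive = (y > 0).mean()
```

Because $$\gamma$$ and $$\beta$$ enter separately, a regressor can raise the probability of participation while lowering the amount, which the standard Tobit model rules out.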

This specification is called the truncated normal hurdle model and was proposed by Cragg (1971). By introducing the separate parameter vector $$\gamma$$ and detaching the amount decision from the participation decision, the model can fit more contexts. Under this setup, the density of $$y$$ given $$x$$ can be written as:


 * $$f (y\mid x)= [1-\Phi (x\gamma)]^{1[y = 0]} \cdot \left[\frac{\Phi (x\gamma)}{\Phi (x\beta/\sigma)} \left. \varphi \left(\frac{y-x\beta}{\sigma}\right) \right/ \sigma\right] ^{1[y>0]}$$

From this density representation, it is clear that the model reduces to the standard Tobit model when $$\gamma= \beta/\sigma.$$ This also shows that the truncated normal hurdle model is more general than the standard Tobit model.
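To make the reduction explicit, substitute $$\gamma=\beta/\sigma$$ into the density above. The ratio $$\Phi (x\gamma)/\Phi (x\beta/\sigma)$$ then equals 1, which leaves:

 * $$f (y\mid x)= [1-\Phi (x\beta/\sigma)]^{1[y = 0]} \cdot \left[\frac{1}{\sigma} \varphi \left(\frac{y-x\beta}{\sigma}\right)\right] ^{1[y>0]}$$

This is the density of the standard Tobit model: the probability of a zero outcome is $$1-\Phi (x\beta/\sigma)$$, and positive outcomes follow the untruncated normal density.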

The truncated normal hurdle model is usually estimated by maximum likelihood estimation (MLE). The log-likelihood function can be written as:



$$\begin{align} \ell(\beta,\gamma,\sigma) = {} & \sum_{i=1}^N \Big\{ 1[y_i = 0] \log [1-\Phi (x_i \gamma) ] +1 [y_i>0] \log [\Phi (x_i \gamma)] \\[5pt] & {} + 1[y_i>0] \left[ -\log \left[\Phi \left( \frac{x_i\beta} \sigma \right)\right] + \log \left(\varphi \left(\frac{y_i-x_i\beta} \sigma \right) \right) -\log (\sigma) \right] \Big\} \end{align}$$

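The log-likelihood separates into a part that depends only on $$\gamma$$ and a part that depends only on $$(\beta,\sigma)$$. Hence $$\gamma$$ can be estimated by a probit model of the indicator $$1[y>0]$$ on $$x$$, and $$(\beta,\sigma)$$ can be estimated by a truncated normal regression using the observations with $$y>0$$. Based on these estimates, consistent estimates of the average partial effects can be obtained.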
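The two-part estimation can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the simulated data, and the "true" parameter values are all hypothetical, the optimizer is SciPy's general-purpose `minimize` with BFGS, and $$\sigma$$ is optimized on the log scale to keep it positive. The helper `expected_y` uses $$\operatorname E[y\mid x]=\Phi(x\gamma)[x\beta+\sigma\lambda(x\beta/\sigma)]$$, where $$\lambda(z)=\varphi(z)/\Phi(z)$$.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, truncnorm


def probit_negloglik(gamma, X, d):
    """Negative log-likelihood of the participation (probit) part."""
    index = X @ gamma
    return -np.sum(d * norm.logcdf(index) + (1.0 - d) * norm.logcdf(-index))


def truncreg_negloglik(params, X, y):
    """Negative log-likelihood of the amount part (observations with y > 0)."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # log scale keeps sigma positive
    index = X @ beta
    loglik = (norm.logpdf((y - index) / sigma) - log_sigma
              - norm.logcdf(index / sigma))
    return -np.sum(loglik)


def fit_hurdle(X, y):
    """Probit for gamma, then truncated normal regression for (beta, sigma)."""
    positive = y > 0
    d = positive.astype(float)
    probit = minimize(probit_negloglik, np.zeros(X.shape[1]),
                      args=(X, d), method="BFGS")
    # OLS on the positive observations as starting values (a convenience choice).
    beta0 = np.linalg.lstsq(X[positive], y[positive], rcond=None)[0]
    resid = y[positive] - X[positive] @ beta0
    start = np.append(beta0, np.log(resid.std()))
    trunc = minimize(truncreg_negloglik, start,
                     args=(X[positive], y[positive]), method="BFGS")
    return probit.x, trunc.x[:-1], float(np.exp(trunc.x[-1]))


def expected_y(X, gamma, beta, sigma):
    """E[y | x] = Phi(x*gamma) * (x*beta + sigma * lambda(x*beta / sigma))."""
    z = X @ beta / sigma
    mills = np.exp(norm.logpdf(z) - norm.logcdf(z))  # inverse Mills ratio
    return norm.cdf(X @ gamma) * (X @ beta + sigma * mills)


# Simulated data with hypothetical "true" parameters.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
true_gamma = np.array([0.2, 0.8])
true_beta = np.array([1.5, 0.5])
true_sigma = 1.0
s = X @ true_gamma + rng.standard_normal(n) > 0
mean = X @ true_beta
w = truncnorm.rvs(-mean / true_sigma, np.inf, loc=mean, scale=true_sigma,
                  size=n, random_state=rng)
y = s * w

gamma_hat, beta_hat, sigma_hat = fit_hurdle(X, y)
fitted_mean = expected_y(X, gamma_hat, beta_hat, sigma_hat).mean()
```

Averaging the derivative (or a finite difference) of `expected_y` with respect to a regressor over the sample would give one estimate of an average partial effect. The two likelihood parts are minimized separately here because they share no parameters.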