Unobserved heterogeneity in duration models

Heterogeneity in duration models can take different forms. On the one hand, unobserved heterogeneity can play a crucial role under different sampling schemes, such as stock or flow sampling. On the other hand, duration models have been extended to allow for different subpopulations, with a strong link to mixture models. Many of these models impose three assumptions: the heterogeneity is independent of the observed covariates, its distribution depends on only a finite number of parameters, and it enters the hazard function multiplicatively.

One can define the conditional hazard as the hazard function conditional on both the observed covariates and the unobserved heterogeneity. In the general case, the cumulative distribution function of ti* associated with the conditional hazard is given by F(t | xi, vi ; θ). Under the first assumption above, the unobserved component can be integrated out to obtain the cumulative distribution conditional on the observed covariates only, i.e.

G(t | xi ; θ, ρ) = ∫ F(t | xi, v ; θ) h(v ; ρ) dv

where the additional parameter ρ parameterizes the density h of the unobserved component v. The estimation methods available for stock- or flow-sampled data can then be applied to estimate the relevant parameters.
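
As an illustration of integrating out the unobserved component, the following minimal Python sketch evaluates the integral numerically and compares it with a closed form. It rests on assumptions that are not part of the text above: the conditional distribution is taken to be exponential with hazard v exp(index), where index stands for the linear index x'β, and the heterogeneity density h is taken to be gamma with mean 1 and variance rho. All function names are hypothetical.

```python
import math

def gamma_pdf(v, shape, scale):
    """Density of a gamma distribution evaluated at v > 0."""
    return v ** (shape - 1) * math.exp(-v / scale) / (math.gamma(shape) * scale ** shape)

def conditional_cdf(t, index, v):
    """F(t | x, v): assumed exponential duration with hazard v * exp(index)."""
    return 1.0 - math.exp(-v * math.exp(index) * t)

def mixture_cdf(t, index, rho, upper=60.0, steps=60000):
    """G(t | x): integrate F(t | x, v) against the density of v.

    Uses a midpoint rule on [0, upper]. This is only accurate when the gamma
    density is bounded near zero, so rho <= 1 is assumed here.
    """
    shape, scale = 1.0 / rho, rho  # assumed normalization: mean 1, variance rho
    width = upper / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * width
        total += conditional_cdf(t, index, v) * gamma_pdf(v, shape, scale)
    return total * width

def mixture_cdf_closed_form(t, index, rho):
    """Closed form of the same integral under the gamma assumption."""
    return 1.0 - (1.0 + rho * math.exp(index) * t) ** (-1.0 / rho)

if __name__ == "__main__":
    numeric = mixture_cdf(2.0, 0.3, 0.5)
    exact = mixture_cdf_closed_form(2.0, 0.3, 0.5)
    print(numeric, exact)
```

The gamma choice is made only because it yields a closed form to check the numerical integral against; with other densities for v the same numerical step applies, but no closed form is generally available.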

A specific example is described by Lancaster. Assume that the conditional hazard is given by

λ(t ; xi, vi) = vi exp(xi'β) α t^(α-1)

where xi is a vector of observed characteristics, vi is the unobserved heterogeneity term, and a normalization (often E[vi] = 1) needs to be imposed. It then follows that the average hazard, taken over the distribution of vi, is given by exp(xi'β) α t^(α-1). More generally, it can be shown that as long as the hazard function has the proportional form λ(t ; xi, vi) = vi κ(xi) λ0(t), one can identify both the covariate function κ(.) and the baseline hazard function λ0(.).
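
To make the role of the heterogeneity term concrete, the sketch below compares the average hazard with the hazard implied by the distribution of durations once the heterogeneity is integrated out. It assumes, beyond what is stated above, that the heterogeneity term follows a gamma distribution with mean 1 and variance rho; under that assumption the survivor function has a closed form. The function names are hypothetical, and index stands for the linear index x'β.

```python
import math

def average_hazard(t, index, alpha):
    """exp(index) * alpha * t^(alpha - 1): the conditional hazard averaged over v with E[v] = 1."""
    return math.exp(index) * alpha * t ** (alpha - 1)

def survivor(t, index, alpha, rho):
    """Survivor function after integrating out gamma heterogeneity (mean 1, variance rho)."""
    return (1.0 + rho * math.exp(index) * t ** alpha) ** (-1.0 / rho)

def observed_hazard(t, index, alpha, rho):
    """Hazard of the mixture distribution, i.e. -d/dt log S(t | x)."""
    return average_hazard(t, index, alpha) / (1.0 + rho * math.exp(index) * t ** alpha)

def numerical_hazard(t, index, alpha, rho, eps=1e-6):
    """Central-difference check of -d/dt log S(t | x)."""
    upper = math.log(survivor(t + eps, index, alpha, rho))
    lower = math.log(survivor(t - eps, index, alpha, rho))
    return -(upper - lower) / (2.0 * eps)

if __name__ == "__main__":
    for t in (0.5, 1.5, 3.0):
        print(t, average_hazard(t, 0.2, 1.3), observed_hazard(t, 0.2, 1.3, 0.5))
```

Under these assumptions the hazard of the mixture lies below the average hazard for every t > 0, and the gap widens with t, because individuals with a high value of the heterogeneity term tend to exit early.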

Recent work provides nonparametric approaches to estimating the baseline hazard and the distribution of the unobserved heterogeneity under fairly weak assumptions. In grouped data, the strict exogeneity assumption for time-varying covariates is hard to relax. Parametric forms can be imposed on the distribution of the unobserved heterogeneity, although semiparametric methods that do not specify such a parametric form are also available.