Small area estimation

Small area estimation is any of several statistical techniques involving the estimation of parameters for small sub-populations, generally used when the sub-population of interest is included in a larger survey.

The term "small area" in this context generally refers to a small geographical area such as a county. It may also refer to a "small domain", i.e. a particular demographic group within an area. If a survey has been carried out for the population as a whole (for example, a nationwide or statewide survey), the sample size within any particular small area may be too small to produce reliable estimates from the survey data alone. To deal with this problem, it may be possible to use additional data (such as census records) that exist for these small areas in order to obtain estimates.

One of the more common small area models in use today is the 'nested error unit level regression model', first used in 1988 to model corn and soybean crop areas in Iowa. The initial survey data, in which farmers reported the area they had planted with either corn or soybeans, were compared to estimates obtained from satellite mapping of the farms. The final model resulting from this for unit/farm 'j' in county 'i' is $$y_{ij} = x_{ij}'\beta +\mu_i +\epsilon_{ij} \,$$, where 'y' denotes the reported crop area, $$\beta \,$$ is the vector of regression coefficients, 'x' is the farm-level estimate of the corn or soybean area from the satellite data, $$\mu_i \,$$ represents the county-level effect of any area characteristics unaccounted for, and $$\epsilon_{ij} \,$$ is the farm-level error term.
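To illustrate how a model of this form borrows strength across counties, the following is a minimal sketch of predicting a single county effect $$\mu_i \,$$. It assumes one covariate plus an intercept, and it assumes the regression coefficients and the two variance components (county-level and farm-level) are already known. In practice all of these are estimated from the data. Under those assumptions the standard predictor shrinks the county's average residual towards zero by the factor $$\gamma_i = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_\epsilon^2 / n_i) \,$$, where $$n_i \,$$ is the number of sampled farms in the county. The function name, parameter names and all numbers are hypothetical and are not taken from the 1988 study.

```python
from statistics import mean


def predict_county_effect(y, x, beta0, beta1, sigma2_mu, sigma2_eps):
    """Predict the county effect mu_i for one county.

    Assumes (hypothetically) that beta0, beta1 and both variance
    components are known rather than estimated.
    """
    n = len(y)
    # Average residual of the sampled farms from the regression line.
    mean_resid = mean(yi - (beta0 + beta1 * xi) for yi, xi in zip(y, x))
    # Shrinkage factor: close to 1 for large samples, close to 0 for small ones.
    gamma = sigma2_mu / (sigma2_mu + sigma2_eps / n)
    return gamma * mean_resid


# Made-up data for one county with three sampled farms.
y = [105.0, 125.0, 112.0]   # reported crop areas
x = [300.0, 340.0, 320.0]   # satellite-based values
mu_hat = predict_county_effect(y, x, beta0=20.0, beta1=0.25,
                               sigma2_mu=100.0, sigma2_eps=300.0)
print(mu_hat)  # 7.0
```

The average residual here is 14, but with only three farms the shrinkage factor is 0.5, so the predicted county effect is half of that. With a single sampled farm the factor would drop to 0.25, reflecting how little the county's own sample can be trusted.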

The Fay-Herriot model, an area-level random effects model, has been used to make estimates for small domains when the sample from each domain is too small for a direct estimate, or a separately fitted fixed effect for that domain, to be reliable.
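In its usual formulation, the Fay-Herriot model treats the direct survey estimate for domain 'i' as $$\hat\theta_i = x_i'\beta + v_i + e_i \,$$, where $$v_i \,$$ is a domain random effect with variance 'A' and $$e_i \,$$ is the sampling error with variance $$D_i \,$$. The resulting predictor is a weighted average of the direct estimate and the regression-based (synthetic) estimate $$x_i'\beta \,$$, with weight $$\gamma_i = A / (A + D_i) \,$$ on the direct estimate. The sketch below shows only this weighting step. It assumes 'A', $$D_i \,$$ and the synthetic estimate are already available, whereas in practice 'A' and $$\beta \,$$ are estimated from the data. The function name and the numbers are hypothetical.

```python
def fay_herriot_predict(direct, synthetic, sampling_var, model_var):
    """Combine a direct and a synthetic estimate for one domain.

    Assumes (hypothetically) that the model variance A and the
    sampling variance D_i are known.
    """
    # Weight on the direct estimate: small when the domain sample is noisy.
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1.0 - gamma) * synthetic


# Made-up domain: noisy direct estimate of 50, regression prediction of 40.
estimate = fay_herriot_predict(direct=50.0, synthetic=40.0,
                               sampling_var=30.0, model_var=10.0)
print(estimate)  # 42.5
```

Because the sampling variance in this example is three times the model variance, the direct estimate receives a weight of only 0.25 and the result lies close to the regression prediction. A domain with a larger sample (smaller $$D_i \,$$) would be pulled towards its own direct estimate instead.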