Hodges' estimator

In statistics, Hodges' estimator (or the Hodges–Le Cam estimator), named for Joseph Hodges, is a famous counterexample of an estimator which is "superefficient", i.e. it attains smaller asymptotic variance than regular efficient estimators. The existence of such a counterexample is the reason for the introduction of the notion of regular estimators.

Hodges' estimator improves upon a regular estimator at a single point. In general, any superefficient estimator may surpass a regular estimator at most on a set of Lebesgue measure zero.

Although Hodges discovered the estimator, he never published it; the first publication was in the doctoral thesis of Lucien Le Cam.

Construction
Suppose $$\hat{\theta}_n$$ is a "common" estimator for some parameter $$\theta$$: it is consistent, and converges to some asymptotic distribution $$L_\theta$$ (usually this is a normal distribution with mean zero and variance which may depend on $$\theta$$) at the $$\sqrt{n}$$-rate:

$$\sqrt{n}(\hat\theta_n - \theta)\ \xrightarrow{d}\ L_\theta .$$

Then Hodges' estimator $$\hat{\theta}_n^H$$ is defined as

$$\hat\theta_n^H = \begin{cases}\hat\theta_n, & \text{if } |\hat\theta_n| \geq n^{-1/4}, \text{ and} \\ 0, & \text{if } |\hat\theta_n| < n^{-1/4}.\end{cases}$$

This estimator is equal to $$\hat{\theta}_n$$ except when $$\hat{\theta}_n$$ falls in the small interval $$(-n^{-1/4},n^{-1/4})$$, where it is set to zero. It is not difficult to see that this estimator is consistent for $$\theta$$, and its asymptotic distribution is
$$\begin{align} & n^\alpha(\hat\theta_n^H - \theta) \ \xrightarrow{d}\ 0, \qquad\text{when } \theta = 0, \\ &\sqrt{n}(\hat\theta_n^H - \theta)\ \xrightarrow{d}\ L_\theta, \quad \text{when } \theta\neq 0, \end{align}$$

for any $$\alpha\in\mathbb{R}$$. Thus this estimator has the same asymptotic distribution as $$\hat{\theta}_n$$ for all $$\theta\neq 0$$, whereas for $$\theta=0$$ the rate of convergence becomes arbitrarily fast. This estimator is superefficient, as it surpasses the asymptotic behavior of the efficient estimator $$\hat{\theta}_n$$ at least at the one point $$\theta=0$$.
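The two limits can be checked with a short argument. The following is a sketch that uses only the consistency and $$\sqrt{n}$$-rate convergence assumed for $$\hat{\theta}_n$$ above, not a full proof. When $$\theta = 0$$, the truncation is eventually always active:

$$P\left(\hat\theta_n^H = 0\right) = P\left(|\hat\theta_n| < n^{-1/4}\right) = P\left(|\sqrt{n}\,\hat\theta_n| < n^{1/4}\right) \to 1,$$

because $$\sqrt{n}\,\hat\theta_n$$ converges in distribution (and so is bounded in probability) while $$n^{1/4} \to \infty$$. Hence $$n^\alpha \hat\theta_n^H = 0$$ with probability tending to one, whatever the value of $$\alpha$$. When $$\theta \neq 0$$, the truncation is eventually never active:

$$P\left(\hat\theta_n^H \neq \hat\theta_n\right) \leq P\left(|\hat\theta_n| < n^{-1/4}\right) \to 0,$$

because $$\hat\theta_n \to \theta$$ in probability and $$n^{-1/4} \to 0 < |\theta|$$. So $$\sqrt{n}(\hat\theta_n^H - \theta)$$ and $$\sqrt{n}(\hat\theta_n - \theta)$$ have the same limit $$L_\theta$$.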

It would be a mistake to read this as saying that Hodges' estimator matches the sample mean everywhere and is much better when the true mean is 0. For finite $$n$$, the truncation can lead to a worse squared error than the sample mean when $$E[X]$$ is close to, but not equal to, 0, as is shown in the example in the following section.

Le Cam shows that this behaviour is typical: superefficiency at the point θ implies the existence of a sequence $$\theta_n \rightarrow \theta$$ such that, for a loss function $$\ell$$, $$\liminf E_{\theta_n} \ell (\sqrt n (\hat \theta_n - \theta_n ))$$ is strictly larger than the Cramér–Rao bound. For the extreme case where the asymptotic risk at θ is zero, the $$\liminf$$ is even infinite for a sequence $$\theta_n \rightarrow \theta$$.

In general, superefficiency may only be attained on a subset of Lebesgue measure zero of the parameter space $$\Theta$$.

Example


Suppose $$x_1, \ldots, x_n$$ is an independent and identically distributed (IID) random sample from the normal distribution $$N(\theta, 1)$$ with unknown mean but known variance. Then the common estimator for the population mean θ is the arithmetic mean of all observations: $$\bar{x} = \tfrac{1}{n}\sum_{i=1}^n x_i$$. The corresponding Hodges' estimator will be $$\hat\theta_n^H = \bar{x}\cdot\mathbf{1}\{|\bar{x}| \geq n^{-1/4}\}$$, where $$\mathbf{1}\{\ldots\}$$ denotes the indicator function.
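As a minimal sketch, the estimator in this example can be computed directly from a sample by applying the $$n^{-1/4}$$ threshold from the Construction section to the sample mean. The function name `hodges_estimate` is a hypothetical choice for illustration, and only the Python standard library is assumed.

```python
from statistics import fmean


def hodges_estimate(sample):
    """Hodges' estimator of the mean: the sample mean, truncated to zero
    when its absolute value falls below n**(-1/4).

    `hodges_estimate` is an illustrative name, not a library function.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    x_bar = fmean(sample)       # the "common" estimator: arithmetic mean
    threshold = n ** -0.25      # truncation threshold n^(-1/4)
    return x_bar if abs(x_bar) >= threshold else 0.0
```

For a sample of 16 observations the threshold is 0.5, so a sample mean of 0.1 is replaced by zero while a sample mean of 1.0 is returned unchanged.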

The mean square error (scaled by n) associated with the regular estimator $$\bar{x}$$ is constant and equal to 1 for all θ. At the same time, the mean square error of Hodges' estimator behaves erratically in the vicinity of zero, and its maximum over θ even becomes unbounded as n → ∞. This demonstrates that Hodges' estimator is not regular, and its asymptotic properties are not adequately described by limits of the form (θ fixed, n → ∞).
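The behaviour of the scaled mean square error can be explored numerically. The sketch below assumes the normal model of this example, so that $$Z = \sqrt{n}(\bar{x} - \theta)$$ is exactly standard normal, and evaluates the scaled risk in closed form using the identity $$\int_a^b z^2\varphi(z)\,dz = \Phi(b) - \Phi(a) - (b\varphi(b) - a\varphi(a))$$. The formula is derived here for illustration and is not taken from the original text. The function names are hypothetical.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def scaled_mse_hodges(theta: float, n: int) -> float:
    """n * E[(hodges - theta)^2] for an IID N(theta, 1) sample of size n.

    Assumes the truncation threshold n**(-1/4) used in the text.
    """
    c = n ** -0.25
    root_n = math.sqrt(n)
    # The estimator is set to zero exactly when Z lies in (a, b).
    a = root_n * (-c - theta)
    b = root_n * (c - theta)
    p_zero = std_normal_cdf(b) - std_normal_cdf(a)
    # E[Z^2 ; a < Z < b], obtained by integration by parts.
    inner = p_zero - (b * std_normal_pdf(b) - a * std_normal_pdf(a))
    # Outside (a, b) the scaled squared error is Z^2; inside it is n * theta^2.
    return (1.0 - inner) + n * theta ** 2 * p_zero


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        at_zero = scaled_mse_hodges(0.0, n)
        at_threshold = scaled_mse_hodges(n ** -0.25, n)
        print(f"n={n}: theta=0 -> {at_zero:.4f}, theta=n^(-1/4) -> {at_threshold:.2f}")
```

Under these assumptions the scaled risk at θ = 0 falls below 1, while at θ = n^(-1/4) it is roughly (1 + √n)/2 and so grows without bound, which matches the non-uniform behaviour described above.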