
Setting
Suppose θ is an unknown parameter vector of length $$m$$, and let y be a vector of observations of θ, also of length $$m$$, such that the observations are normally distributed:

$$ {\mathbf y} \sim N({\boldsymbol \theta}, \sigma^2 I). $$

We are interested in obtaining an estimate $$\widehat{\boldsymbol \theta} = \widehat{\boldsymbol \theta}({\mathbf y})$$ of θ, based on a single observation vector y.

This is an everyday situation in which a set of parameters is measured, and the measurements are corrupted by independent Gaussian noise. Since the noise has zero mean, it is very reasonable to use the measurements themselves as an estimate of the parameters. This is the approach of the least squares estimator, which is $$\widehat{\boldsymbol \theta}_{LS} = {\mathbf y}$$.

As a result, there was considerable shock and disbelief when Stein demonstrated that, in terms of mean squared error $$E \{ \| {\boldsymbol \theta}-\widehat {\boldsymbol \theta} \|^2 \}$$, this approach is suboptimal whenever $$m \ge 3$$. The result became known as Stein's phenomenon.
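Stein's phenomenon can be observed numerically. The following sketch is a Monte Carlo comparison, under the setting above, between the least squares estimator $$\widehat{\boldsymbol \theta}_{LS} = {\mathbf y}$$ and the classical James–Stein shrinkage estimator $$\widehat{\boldsymbol \theta}_{JS} = \left(1 - \tfrac{(m-2)\sigma^2}{\|{\mathbf y}\|^2}\right){\mathbf y}$$ (the James–Stein estimator is one standard construction that exhibits the phenomenon; the particular choice of θ, noise level, and trial count below are illustrative assumptions, not taken from the text):

```python
import numpy as np

def compare_estimators(m=10, sigma=1.0, n_trials=2000, seed=0):
    """Estimate the MSE of the least squares estimator (the observations
    themselves) and of the James-Stein estimator, under y ~ N(theta, sigma^2 I).

    theta is fixed to a vector of ones here purely for illustration."""
    rng = np.random.default_rng(seed)
    theta = np.ones(m)  # arbitrary fixed parameter vector (assumption)
    mse_ls = 0.0
    mse_js = 0.0
    for _ in range(n_trials):
        # One observation vector corrupted by independent Gaussian noise.
        y = theta + sigma * rng.standard_normal(m)
        # Least squares: use the measurements directly.
        theta_ls = y
        # James-Stein: shrink the measurements toward the origin.
        theta_js = (1.0 - (m - 2) * sigma**2 / np.dot(y, y)) * y
        mse_ls += np.sum((theta - theta_ls) ** 2)
        mse_js += np.sum((theta - theta_js) ** 2)
    return mse_ls / n_trials, mse_js / n_trials

mse_ls, mse_js = compare_estimators()
print(f"LS MSE: {mse_ls:.3f}, James-Stein MSE: {mse_js:.3f}")
```

For the least squares estimator the MSE is exactly $$m\sigma^2$$ (here 10), while the James–Stein estimate comes out strictly smaller, illustrating the suboptimality of using the raw measurements when $$m \ge 3$$.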