User:Jmath666/Estimation of covariance matrices

Given a sample $X_1, \dots, X_n$ from a random vector $X \in R^{p \times 1}$ (a $p \times 1$ column), the unbiased estimator of the covariance matrix
 * $$Cov(X) = E((X-E(X))(X-E(X))^T)$$

is the sample covariance matrix
 * $${1 \over {n-1}}\sum_{i=1}^n (X_i-\overline{X})(X_i-\overline{X})^T,$$

where
 * $$\overline{X}={1 \over {n}}\sum_{i=1}^n X_i$$

is the sample mean. This holds regardless of whether the random variable X is normally distributed. The reason for the factor n-1 rather than n is essentially that the mean is not known and is replaced by the sample mean, which costs one degree of freedom.
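The unbiased estimator above can be sketched numerically; a minimal example using NumPy (the variable names and the synthetic data are illustrative, and `numpy.cov` with its default `ddof=1` computes exactly this estimator):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 50
X = rng.normal(size=(n, p))          # n samples of a p-dimensional vector, one per row

xbar = X.mean(axis=0)                # sample mean
D = X - xbar                         # centered samples
S_unbiased = D.T @ D / (n - 1)       # unbiased sample covariance, factor 1/(n-1)

# np.cov with rowvar=False treats rows as observations; ddof=1 is its default
assert np.allclose(S_unbiased, np.cov(X, rowvar=False))
```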

The maximum likelihood estimator of the covariance matrix, however, is slightly different. When the random variable X is normally distributed, the maximum likelihood estimate is
 * $${1 \over n}\sum_{i=1}^n (X_i-\overline{X})(X_i-\overline{X})^T.$$

Clearly, the difference between the unbiased and the maximum likelihood estimators diminishes for large n, since the two differ only by the factor (n-1)/n.
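The relationship between the two estimators can be checked directly; a short sketch (the data here is arbitrary and only serves to exhibit the scalar factor relating the two):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 2, 10
X = rng.normal(size=(n, p))

D = X - X.mean(axis=0)
S_mle = D.T @ D / n                  # maximum likelihood estimate, factor 1/n
S_unb = D.T @ D / (n - 1)            # unbiased estimate, factor 1/(n-1)

# The two differ exactly by the scalar (n-1)/n, which tends to 1 as n grows
assert np.allclose(S_mle, S_unb * (n - 1) / n)
```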

The probability distribution of the maximum likelihood estimator of the covariance matrix of a multivariate normal distribution is the Wishart distribution. Although it is no surprise that the maximum likelihood estimator of the population covariance matrix is closely related to the sample covariance matrix, the mathematical derivation is perhaps not widely known, and it is surprisingly subtle and elegant.