G-prior

In statistics, the g-prior is an objective prior for the regression coefficients of a multiple regression. It was introduced by Arnold Zellner. It is a key tool in Bayes and empirical Bayes variable selection.

Definition
Consider a data set $$(x_1,y_1),\ldots,(x_n,y_n)$$, where the $$x_i$$ are Euclidean vectors and the $$y_i$$ are scalars. The multiple regression model is formulated as
 * $$y_i = x_i^\top\beta + \varepsilon_i.$$

where the $$\varepsilon_i$$ are random errors. Zellner's g-prior for $$\beta$$ is a multivariate normal distribution with covariance matrix proportional to the inverse Fisher information matrix for $$\beta$$, similar to a Jeffreys prior.

Assume the $$\varepsilon_i$$ are i.i.d. normal with zero mean and variance $$\psi^{-1}$$. Let $$X$$ be the matrix with $$i$$th row equal to $$x_i^\top$$. Then the g-prior for $$\beta$$ is the multivariate normal distribution with prior mean a hyperparameter $$\beta_0$$ and covariance matrix proportional to $$\psi^{-1}(X^\top X)^{-1}$$, i.e.,
 * $$\beta |\psi \sim \text{N}[\beta_0,g\psi^{-1} (X^\top X)^{-1}].$$

where g is a positive scalar parameter.

Posterior distribution of beta
The posterior distribution of $$\beta$$ is given as
 * $$\beta |\psi,x,y \sim \text{N}\Big[q\hat\beta+(1-q)\beta_0,\frac q\psi(X^\top X)^{-1}\Big].$$

where $$q=g/(1+g)$$ and
 * $$\hat\beta = (X^\top X)^{-1}X^\top y.$$

is the maximum likelihood (least squares) estimator of $$\beta$$. The vector of regression coefficients $$\beta$$ can be estimated by its posterior mean under the g-prior, i.e., as the weighted average of the maximum likelihood estimator and $$\beta_0$$,
 * $$\tilde\beta = q\hat\beta+(1-q)\beta_0.$$

Clearly, as g →∞, the posterior mean converges to the maximum likelihood estimator.

Selection of g
Estimation of g is slightly less straightforward than estimation of $$\beta$$. A variety of methods have been proposed, including Bayes and empirical Bayes estimators.