Scaled inverse chi-squared distribution

The scaled inverse chi-squared distribution $$ \psi \, \mbox{inv-} \chi^2(\nu)$$, where $$  \psi $$ is the scale parameter, equals the univariate inverse Wishart distribution $$ \mathcal{W}^{-1}(\psi,\nu)$$ with degrees of freedom $$\nu$$.

This family of scaled inverse chi-squared distributions is linked to the inverse-chi-squared distribution and to the chi-squared distribution:

If $$X \sim \psi \, \mbox{inv-} \chi^2(\nu)$$ then $$   X/\psi \sim \mbox{inv-} \chi^2(\nu) $$ as well as $$    \psi/X \sim \chi^2(\nu) $$ and $$    1/X \sim \psi^{-1}\chi^2(\nu) $$.

Instead of $$ \psi$$, however, the scaled inverse chi-squared distribution is most frequently parametrized by the scale parameter $$\tau^2 = \psi/\nu$$, and the distribution $$\nu \tau^2 \, \mbox{inv-} \chi^2(\nu)$$ is denoted by $$\mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$.

In terms of $$\tau^2$$ the above relations can be written as follows:

If $$X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$ then $$   \frac{X}{\nu \tau^2} \sim \mbox{inv-} \chi^2(\nu) $$ as well as $$    \frac{\nu \tau^2}{X} \sim \chi^2(\nu) $$ and $$    1/X \sim \frac{1}{\nu \tau^2}\chi^2(\nu) $$.

This family of scaled inverse chi-squared distributions is a reparametrization of the inverse-gamma distribution.

Specifically, if
 * $$X \sim \psi \, \mbox{inv-} \chi^2(\nu) = \mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$  then   $$X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\psi}{2}\right) =  \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right)$$
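The relations above suggest a simple way to draw samples: draw a chi-squared variate and divide $$\nu\tau^2$$ by it. The following is a minimal sketch using only Python's standard library. The function name `sample_scaled_inv_chi2` is illustrative, and the sketch assumes the standard fact that $$\chi^2(\nu)$$ is a gamma distribution with shape $$\nu/2$$ and scale 2.

```python
import random


def sample_scaled_inv_chi2(nu, tau2, rng=random):
    """Draw one sample from Scale-inv-chi^2(nu, tau2).

    Uses the relation nu * tau2 / X ~ chi^2(nu), where chi^2(nu) is drawn
    as a gamma variate with shape nu/2 and scale 2.
    """
    chi2_draw = rng.gammavariate(nu / 2.0, 2.0)
    return nu * tau2 / chi2_draw


if __name__ == "__main__":
    rng = random.Random(0)
    nu, tau2 = 10.0, 2.0
    draws = [sample_scaled_inv_chi2(nu, tau2, rng) for _ in range(100_000)]
    sample_mean = sum(draws) / len(draws)
    # For nu > 2 the mean of the distribution is nu * tau2 / (nu - 2).
    print(sample_mean, nu * tau2 / (nu - 2.0))
```

The same draws could equally be produced by inverting a gamma variate with shape $$\nu/2$$ and scale $$2/(\nu\tau^2)$$, which is the inverse-gamma form of the relation.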

Either form may be used to represent the maximum entropy distribution for a fixed first inverse moment $$(E(1/X))$$ and first logarithmic moment $$(E(\ln(X)))$$.

The scaled inverse chi-squared distribution also has a particular use in Bayesian statistics: it can be used as a conjugate prior for the variance parameter of a normal distribution. The same prior, in an alternative parametrization, is given by the inverse-gamma distribution.

Characterization
The probability density function of the scaled inverse chi-squared distribution extends over the domain $$x>0$$ and is

 * $$f(x; \nu, \tau^2)= \frac{(\tau^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}~ \frac{\exp\left[ \frac{-\nu \tau^2}{2 x}\right]}{x^{1+\nu/2}} $$

where $$\nu$$ is the degrees of freedom parameter and $$\tau^2$$ is the scale parameter. The cumulative distribution function is


 * $$F(x; \nu, \tau^2)= \Gamma\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right) \left/\Gamma\left(\frac{\nu}{2}\right)\right.$$
 * $$=Q\left(\frac{\nu}{2},\frac{\tau^2\nu}{2x}\right)$$

where $$\Gamma(a,x)$$ is the upper incomplete gamma function, $$\Gamma(x)$$ is the gamma function and $$Q(a,x)$$ is the regularized upper incomplete gamma function. The characteristic function is


 * $$\varphi(t;\nu,\tau^2)= \frac{2}{\Gamma(\frac{\nu}{2})}\left(\frac{-i\tau^2\nu t}{2}\right)^{\!\!\frac{\nu}{4}}\!\!K_{\frac{\nu}{2}}\left(\sqrt{-2i\tau^2\nu t}\right) ,$$

where $$K_{\frac{\nu}{2}}(z)$$ is the modified Bessel function of the second kind.
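As an illustration of the density and distribution function above, the following sketch evaluates the density in log space (to avoid overflow in the gamma function) and approximates the cumulative distribution function by simple numerical integration. The function names are illustrative, and the midpoint-rule integration is only a cross-check under stated assumptions, not a substitute for a proper implementation of $$Q(a,x)$$.

```python
import math


def scaled_inv_chi2_logpdf(x, nu, tau2):
    """Log-density of Scale-inv-chi^2(nu, tau2) at x; -inf outside x > 0."""
    if x <= 0:
        return -math.inf
    half_nu = nu / 2.0
    return (half_nu * math.log(tau2 * half_nu)
            - math.lgamma(half_nu)
            - nu * tau2 / (2.0 * x)
            - (1.0 + half_nu) * math.log(x))


def scaled_inv_chi2_pdf(x, nu, tau2):
    """Density of Scale-inv-chi^2(nu, tau2) at x."""
    return math.exp(scaled_inv_chi2_logpdf(x, nu, tau2))


def scaled_inv_chi2_cdf_numeric(x, nu, tau2, steps=20_000):
    """Approximate F(x) by the midpoint rule on the interval (0, x)."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = sum(scaled_inv_chi2_pdf((k + 0.5) * h, nu, tau2)
                for k in range(steps))
    return h * total


if __name__ == "__main__":
    # For nu = 2 the CDF reduces to exp(-tau2 / x), because Q(1, z) = exp(-z).
    print(scaled_inv_chi2_cdf_numeric(2.0, 2.0, 1.5), math.exp(-1.5 / 2.0))
```

The case $$\nu = 2$$ gives a convenient closed form for checking any implementation, since $$Q(1, z) = e^{-z}$$.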

Parameter estimation
The maximum likelihood estimate of $$\tau^2$$ is


 * $$\tau^2 = n/\sum_{i=1}^n \frac{1}{x_i}.$$

The maximum likelihood estimate of $$\frac{\nu}{2}$$ can be found using Newton's method on:


 * $$\ln\left(\frac{\nu}{2}\right) - \psi\left(\frac{\nu}{2}\right) = \frac{1}{n} \sum_{i=1}^n \ln\left(x_i\right) - \ln\left(\tau^2\right) ,$$

where $$\psi(x)$$ is the digamma function (not to be confused with the scale parameter $$\psi$$ used above). An initial estimate can be found by taking the formula for the mean and solving it for $$\nu.$$ Let $$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$$ be the sample mean. Then an initial estimate for $$\nu$$ is given by:


 * $$\frac{\nu}{2} = \frac{\bar{x}}{\bar{x} - \tau^2}.$$
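The estimation procedure above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the digamma and trigamma functions are approximated with a recurrence plus a truncated asymptotic series (the standard library does not provide them), and a simple halving safeguard, which is an addition not described above, keeps the Newton iterate positive. The data are assumed to be positive and not all equal.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 (upward recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Derivative of the digamma function for x > 0 (same strategy)."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = (1.0 / x) * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))
    return result + 1.0 / x + 0.5 * inv2 + series


def fit_scaled_inv_chi2(xs, tol=1e-10, max_iter=100):
    """Return (nu_hat, tau2_hat) by maximum likelihood."""
    n = len(xs)
    tau2 = n / sum(1.0 / x for x in xs)            # closed-form estimate
    target = sum(math.log(x) for x in xs) / n - math.log(tau2)
    mean = sum(xs) / n
    a = mean / (mean - tau2)                       # initial guess for nu / 2
    for _ in range(max_iter):
        g = math.log(a) - digamma(a) - target
        g_prime = 1.0 / a - trigamma(a)
        a_new = a - g / g_prime                    # Newton step
        if a_new <= 0:
            a_new = a / 2.0                        # safeguard: stay positive
        converged = abs(a_new - a) <= tol * a
        a = a_new
        if converged:
            break
    return 2.0 * a, tau2


if __name__ == "__main__":
    rng = random.Random(42)
    true_nu, true_tau2 = 6.0, 1.5
    data = [true_nu * true_tau2 / rng.gammavariate(true_nu / 2.0, 2.0)
            for _ in range(20_000)]
    print(fit_scaled_inv_chi2(data))
```

Because the sample mean is never smaller than the harmonic mean, the initial guess is at least 1 whenever the observations are not all identical.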

Bayesian estimation of the variance of a normal distribution
The scaled inverse chi-squared distribution has a second important application, in the Bayesian estimation of the variance of a normal distribution.

According to Bayes' theorem, the posterior probability distribution for quantities of interest is proportional to the product of a prior distribution for the quantities and a likelihood function:
 * $$p(\sigma^2|D,I) \propto p(\sigma^2|I) \; p(D|\sigma^2)$$

where D represents the data and I represents any initial information about $$\sigma^2$$ that we may already have.

The simplest scenario arises if the mean $$\mu$$ is already known; or, alternatively, if it is the conditional distribution of $$\sigma^2$$ that is sought, for a particular assumed value of $$\mu$$.

Then the likelihood term $$\mathcal{L}(\sigma^2|D) = p(D|\sigma^2)$$ has the familiar form
 * $$\mathcal{L}(\sigma^2|D,\mu) = \frac{1}{\left(\sqrt{2\pi}\sigma\right)^n} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right]$$

Combining this with the rescaling-invariant prior $$p(\sigma^2|I) = 1/\sigma^2$$, which can be argued (e.g. following Jeffreys) to be the least informative possible prior for $$\sigma^2$$ in this problem, gives a combined posterior probability
 * $$p(\sigma^2|D, I, \mu) \propto \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right]$$

This form can be recognised as that of a scaled inverse chi-squared distribution, with parameters $$\nu = n$$ and $$\tau^2 = s^2 = (1/n) \sum_i (x_i-\mu)^2$$.
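
To make the identification explicit (a worked step added here for illustration), write the density from the Characterization section up to its normalizing constant, with $$x = \sigma^2$$:
 * $$f(\sigma^2; \nu, \tau^2) \propto (\sigma^2)^{-(1+\nu/2)} \exp\left[ -\frac{\nu\tau^2}{2\sigma^2} \right]$$

Since $$1/\sigma^{n+2} = (\sigma^2)^{-(1+n/2)}$$, matching the power of $$\sigma^2$$ gives $$\nu = n$$, and matching the argument of the exponential gives $$\nu\tau^2 = \sum_{i=1}^n (x_i-\mu)^2$$, that is, $$\tau^2 = s^2$$.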

Gelman et al. remark that the re-appearance of this distribution, previously seen in a sampling context, may seem remarkable; but given the choice of prior the "result is not surprising".

In particular, the choice of a rescaling-invariant prior for $$\sigma^2$$ has the result that the probability for the ratio $$\sigma^2 / s^2$$ has the same form (independent of the conditioning variable) when conditioned on $$s^2$$ as when conditioned on $$\sigma^2$$:


 * $$p(\tfrac{\sigma^2}{s^2}|s^2) = p(\tfrac{\sigma^2}{s^2}|\sigma^2)$$

In the sampling-theory case, conditioned on $$\sigma^2$$, the probability distribution for $$1/s^2$$ is a scaled inverse chi-squared distribution; and so the probability distribution for $$\sigma^2$$ conditioned on $$s^2$$, given a scale-agnostic prior, is also a scaled inverse chi-squared distribution.

Use as an informative prior
If more is known about the possible values of $$\sigma^2$$, a distribution from the scaled inverse chi-squared family, such as $$\mbox{Scale-inv-}\chi^2(n_0, s_0^2)$$, can be a convenient form to represent a more informative prior for $$\sigma^2$$, as if from the result of $$n_0$$ previous observations (though $$n_0$$ need not necessarily be a whole number):
 * $$p(\sigma^2|I^\prime, \mu) \propto \frac{1}{\sigma^{n_0+2}} \; \exp \left[ -\frac{n_0 s_0^2}{2\sigma^2} \right]$$

Such a prior would lead to the posterior distribution
 * $$p(\sigma^2|D, I^\prime, \mu) \propto \frac{1}{\sigma^{n+n_0+2}} \; \exp \left[ -\frac{ns^2 + n_0 s_0^2}{2\sigma^2} \right]$$

which is itself a scaled inverse chi-squared distribution. The scaled inverse chi-squared distributions are thus a convenient conjugate prior family for $$\sigma^2$$ estimation.
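
Reading the parameters off the posterior above gives $$n_0 + n$$ degrees of freedom and scale $$(n_0 s_0^2 + n s^2)/(n_0 + n)$$. The following sketch encodes that update for a known mean; the function name `update_scaled_inv_chi2` is illustrative, and setting `n0 = 0` is assumed here to stand in for the rescaling-invariant prior $$1/\sigma^2$$.

```python
def update_scaled_inv_chi2(n0, s0_sq, data, mu):
    """Posterior (nu, tau2) for sigma^2 given a Scale-inv-chi^2(n0, s0_sq) prior.

    `mu` is the known mean of the normal distribution that generated `data`.
    """
    n = len(data)
    sum_sq = sum((x - mu) ** 2 for x in data)   # equals n * s^2
    nu_post = n0 + n
    tau2_post = (n0 * s0_sq + sum_sq) / nu_post
    return nu_post, tau2_post


if __name__ == "__main__":
    observations = [1.0, 2.0, 3.0]
    # Informative prior worth two earlier observations with variance 1.
    print(update_scaled_inv_chi2(2, 1.0, observations, mu=2.0))
    # n0 = 0 recovers nu = n and tau2 = s^2.
    print(update_scaled_inv_chi2(0, 0.0, observations, mu=2.0))
```

The posterior scale is a weighted average of the prior scale and the sample variance about $$\mu$$, with weights $$n_0$$ and $$n$$.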

Estimation of variance when mean is unknown
If the mean is not known, the most uninformative prior that can be taken for it is arguably the translation-invariant prior $$p(\mu|I) \propto \mbox{const.}$$, which gives the following joint posterior distribution for $$\mu$$ and $$\sigma^2$$,

 * $$\begin{align} p(\mu, \sigma^2 \mid D, I) & \propto \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\mu)^2}{2\sigma^2} \right] \\ & = \frac{1}{\sigma^{n+2}} \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right] \end{align} $$

The marginal posterior distribution for $$\sigma^2$$ is obtained from the joint posterior distribution by integrating out over $$\mu$$,
 * $$\begin{align} p(\sigma^2|D, I) \; \propto \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \int_{-\infty}^{\infty} \exp \left[ -\frac{n(\mu -\bar{x})^2}{2\sigma^2} \right] d\mu\\ = \; & \frac{1}{\sigma^{n+2}} \; \exp \left[ -\frac{\sum_i^n(x_i-\bar{x})^2}{2\sigma^2} \right] \; \sqrt{2 \pi \sigma^2 / n} \\ \propto \; & (\sigma^2)^{-(n+1)/2} \; \exp \left[ -\frac{(n-1)s^2}{2\sigma^2} \right] \end{align}$$

This is again a scaled inverse chi-squared distribution, with parameters $$n-1$$ and $$s^2 = \sum (x_i - \bar{x})^2/(n-1)$$.
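
A short sketch of how this marginal posterior might be used in practice follows. The function names are illustrative; posterior draws are generated through the chi-squared relation, with $$\chi^2(\nu)$$ drawn as a gamma variate of shape $$\nu/2$$ and scale 2. At least two observations that are not all equal are assumed.

```python
import random


def marginal_posterior_params(data):
    """Return (nu, tau2) = (n - 1, s^2) for sigma^2 when the mean is unknown."""
    n = len(data)
    xbar = sum(data) / n
    s_sq = sum((x - xbar) ** 2 for x in data) / (n - 1)
    return n - 1, s_sq


def sample_sigma2_posterior(data, size, rng=random):
    """Draw `size` values of sigma^2 from Scale-inv-chi^2(n - 1, s^2)."""
    nu, s_sq = marginal_posterior_params(data)
    return [nu * s_sq / rng.gammavariate(nu / 2.0, 2.0) for _ in range(size)]


if __name__ == "__main__":
    observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    print(marginal_posterior_params(observations))
    draws = sorted(sample_sigma2_posterior(observations, 10_000,
                                           random.Random(0)))
    # Rough 95% credible interval for sigma^2 from the empirical quantiles.
    print(draws[250], draws[9750])
```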

Related distributions

 * If $$X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$ then $$ k X \sim \mbox{Scale-inv-}\chi^2(\nu, k \tau^2)\, $$
 * If $$X \sim \mbox{inv-}\chi^2(\nu) \, $$ (Inverse-chi-squared distribution) then $$X \sim \mbox{Scale-inv-}\chi^2(\nu, 1/\nu) \,$$
 * If $$X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$ then $$ \frac{X}{\tau^2 \nu} \sim \mbox{inv-}\chi^2(\nu) \, $$ (Inverse-chi-squared distribution)
 * If $$X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)$$ then $$X \sim \textrm{Inv-Gamma}\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right)$$ (Inverse-gamma distribution)
 * The scaled inverse chi-squared distribution is a special case of the type 5 Pearson distribution.
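
The first two relations in this list can be checked numerically through the densities. The sketch below is illustrative only (the function names are hypothetical): it compares the density of $$kX$$, obtained by the change of variables $$f_X(y/k)/k$$, with the density of $$\mbox{Scale-inv-}\chi^2(\nu, k\tau^2)$$, and compares the standard inverse-chi-squared density with $$\mbox{Scale-inv-}\chi^2(\nu, 1/\nu)$$.

```python
import math


def scaled_inv_chi2_pdf(x, nu, tau2):
    """Density of Scale-inv-chi^2(nu, tau2) at x > 0, computed in log space."""
    half_nu = nu / 2.0
    log_pdf = (half_nu * math.log(tau2 * half_nu)
               - math.lgamma(half_nu)
               - nu * tau2 / (2.0 * x)
               - (1.0 + half_nu) * math.log(x))
    return math.exp(log_pdf)


def inv_chi2_pdf(x, nu):
    """Density of the inverse-chi-squared distribution inv-chi^2(nu) at x > 0."""
    half_nu = nu / 2.0
    log_pdf = (-half_nu * math.log(2.0)
               - math.lgamma(half_nu)
               - (1.0 + half_nu) * math.log(x)
               - 1.0 / (2.0 * x))
    return math.exp(log_pdf)


if __name__ == "__main__":
    nu, tau2, k, y = 5.0, 1.3, 4.0, 2.7
    # Density of k * X at y by change of variables, versus the claimed family member.
    print(scaled_inv_chi2_pdf(y / k, nu, tau2) / k,
          scaled_inv_chi2_pdf(y, nu, k * tau2))
    # inv-chi^2(nu) versus Scale-inv-chi^2(nu, 1 / nu).
    print(inv_chi2_pdf(y, nu), scaled_inv_chi2_pdf(y, nu, 1.0 / nu))
```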