Itakura–Saito distance

The Itakura–Saito distance (or Itakura–Saito divergence) is a measure of the difference between an original spectrum $$P(\omega)$$ and an approximation $$\hat{P}(\omega)$$ of that spectrum. Although it is not a perceptual measure, it is intended to reflect perceptual (dis)similarity. It was proposed by Fumitada Itakura and Shuzo Saito in the 1960s while they were with NTT.

The distance is defined as:


 * $$D_{IS}(P(\omega),\hat{P}(\omega))=\frac{1}{2\pi}\int_{-\pi}^{\pi} \left[ \frac{P(\omega)}{\hat{P}(\omega)}-\log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \right] \, d\omega$$
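In practice the spectra are usually available only as samples. The following is a minimal sketch, assuming both spectra are strictly positive and sampled on the same uniform frequency grid over one period, so that the normalized integral can be approximated by a plain average. The function name `itakura_saito_distance` and its arguments are chosen here for illustration.

```python
import math


def itakura_saito_distance(p, p_hat):
    """Approximate D_IS(P, P_hat) from samples on a shared uniform grid.

    Assumption: the samples cover one full period uniformly, so the
    (1 / 2*pi) * integral is replaced by the mean over the samples.
    """
    if len(p) != len(p_hat) or len(p) == 0:
        raise ValueError("spectra must be non-empty and of equal length")

    total = 0.0
    for p_k, q_k in zip(p, p_hat):
        if p_k <= 0 or q_k <= 0:
            raise ValueError("spectra must be strictly positive")
        ratio = p_k / q_k
        # Pointwise integrand: P/P_hat - log(P/P_hat) - 1
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)
```

The value is zero when the two spectra coincide and positive otherwise. Swapping the arguments generally changes the result.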

The Itakura–Saito distance is a Bregman divergence generated by the negative of the logarithm function. It is not a true metric, since it is not symmetric and does not satisfy the triangle inequality.
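
The Bregman form can be checked pointwise. As a short worked derivation, assume the scalar generator $$\varphi(x) = -\log x$$, and write $$p = P(\omega)$$ and $$q = \hat{P}(\omega)$$ at a fixed frequency (the symbols $$\varphi$$, $$p$$ and $$q$$ are introduced here for illustration). The Bregman divergence generated by $$\varphi$$ is then:

 * $$d_\varphi(p, q) = \varphi(p) - \varphi(q) - \varphi'(q)\,(p - q) = -\log p + \log q + \frac{p - q}{q} = \frac{p}{q} - \log \frac{p}{q} - 1$$

This is the integrand of the definition above. The asymmetry shows up with $$p = 2$$ and $$q = 1$$: $$d_\varphi(2, 1) = 1 - \log 2 \approx 0.307$$, whereas $$d_\varphi(1, 2) = \log 2 - \tfrac{1}{2} \approx 0.193$$. The triangle inequality fails, for example, for the values 4, 2 and 1: $$d_\varphi(4, 1) = 3 - \log 4 \approx 1.614$$, which exceeds $$d_\varphi(4, 2) + d_\varphi(2, 1) \approx 0.614$$.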

In non-negative matrix factorization, the Itakura–Saito divergence can be used as a measure of the quality of the factorization. This choice implies a meaningful statistical model of the components, and the resulting optimization problem can be solved through an iterative method.
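
The text does not say which iterative method is meant. As one possible illustration, the sketch below uses multiplicative updates in their majorization-minimization form (square-root exponent), a common choice for this cost. It assumes a strictly positive data matrix `V`, and the names `is_divergence` and `nmf_itakura_saito`, along with their parameters, are hypothetical.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Sum of the pointwise Itakura-Saito divergences between V and V_hat."""
    ratio = V / V_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def nmf_itakura_saito(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factorize V ~= W @ H under the Itakura-Saito cost.

    Assumptions: V is strictly positive; the updates are the
    multiplicative ones with a square-root exponent, which is one of
    several iterative schemes and not necessarily the one the text means.
    """
    V = np.asarray(V, dtype=float)
    if np.any(V <= 0):
        raise ValueError("V must be strictly positive")

    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps  # random positive initialization
    H = rng.random((rank, n_cols)) + eps

    costs = []
    for _ in range(n_iter):
        V_hat = W @ H
        # Update H with W held fixed.
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        V_hat = W @ H
        # Update W with H held fixed.
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        costs.append(is_divergence(V, W @ H))
    return W, H, costs
```

Because each update multiplies by a positive factor, `W` and `H` stay non-negative without an explicit projection step. The `costs` list records the divergence after each iteration so that convergence can be inspected.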

The Itakura–Saito distance is also the Bregman divergence associated with the Gamma exponential family: the information divergence of one distribution in the family from another is given by the Itakura–Saito divergence of the mean of the first distribution from the mean of the second.
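
This relationship can be verified directly. The following worked derivation assumes Gamma densities with a common, fixed shape parameter $$k$$, parametrized by their means $$\mu_1$$ and $$\mu_2$$ (this parametrization and the symbols $$k$$, $$\mu_1$$, $$\mu_2$$ are assumptions made here for illustration):

 * $$f(x;\mu) = \frac{1}{\Gamma(k)} \left(\frac{k}{\mu}\right)^{k} x^{k-1} e^{-kx/\mu}$$

Taking the expectation of the log-density ratio under the first distribution, and using $$\mathbb{E}_{\mu_1}[x] = \mu_1$$, gives:

 * $$D_{KL}(f_{\mu_1} \,\|\, f_{\mu_2}) = k \log\frac{\mu_2}{\mu_1} - k + k\,\frac{\mu_1}{\mu_2} = k \left[ \frac{\mu_1}{\mu_2} - \log\frac{\mu_1}{\mu_2} - 1 \right]$$

Under these assumptions the information divergence equals the pointwise Itakura–Saito divergence of the means, scaled by the shape parameter. For $$k = 1$$, the exponential distribution, the two coincide exactly.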