Wiener–Khinchin theorem

In applied mathematics, the Wiener–Khinchin theorem or Wiener–Khintchine theorem, also known as the Wiener–Khinchin–Einstein theorem or the Khinchin–Kolmogorov theorem, states that the autocorrelation function of a wide-sense-stationary random process has a spectral decomposition given by the power spectral density of that process.

History
Norbert Wiener proved this theorem for the case of a deterministic function in 1930; Aleksandr Khinchin later formulated an analogous result for stationary stochastic processes and published it in 1934. Albert Einstein had explained the idea, without proofs, in a brief two-page memo in 1914.

Continuous-time process
For continuous time, the Wiener–Khinchin theorem says that if $$ x $$ is a wide-sense-stationary random process whose autocorrelation function (sometimes called autocovariance), defined in terms of the statistical expected value as $$r_{xx}(\tau) = \mathbb{E}\big[x(t)^*\cdot x(t - \tau)\big]$$, exists and is finite at every lag $$ \tau $$, then there exists a monotone function $$ F(f) $$ in the frequency domain $$ -\infty < f < \infty $$, or equivalently a non-negative Radon measure $$\mu$$ on the frequency domain, such that


 * $$ r_{xx} (\tau) = \int_{-\infty}^\infty e^{2\pi i\tau f}\mu(df) = \int_{-\infty}^\infty e^{2\pi i\tau f} dF(f) , $$

where the integral is a Riemann–Stieltjes integral. The asterisk denotes the complex conjugate and can be omitted if the random process is real-valued. This is a kind of spectral decomposition of the autocorrelation function. $$ F $$ is called the power spectral distribution function and is a statistical distribution function. It is sometimes called the integrated spectrum.

The Fourier transform of $$x(t)$$ does not exist in general, because stochastic random functions are generally neither square-integrable nor absolutely integrable. Nor is $$ r_{xx} $$ assumed to be absolutely integrable, so it need not have a Fourier transform either.
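
A standard illustration, which is not part of the statement above, is the random-phase sinusoid. Assume $$ x(t) = A\cos(2\pi f_0 t + \varphi) $$, where the amplitude $$ A $$ and the frequency $$ f_0 $$ are fixed constants introduced here only for this example, and the phase $$ \varphi $$ is uniformly distributed on $$ [0, 2\pi) $$. Averaging over $$ \varphi $$ gives

 * $$ r_{xx}(\tau) = \frac{A^2}{2}\cos(2\pi f_0 \tau) = \int_{-\infty}^\infty e^{2\pi i\tau f}\mu(df), \qquad \mu = \frac{A^2}{4}\big(\delta_{f_0} + \delta_{-f_0}\big). $$

Here $$ r_{xx} $$ does not decay and so is not absolutely integrable, and $$ F $$ is a step function with jumps of $$ A^2/4 $$ at $$ \pm f_0 $$ rather than the integral of a density, yet the spectral representation still holds.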

However, if the measure $$ \mu(df) = dF(f) $$ is absolutely continuous, for example, if the process is purely indeterministic, then $$ F $$ is differentiable almost everywhere and we can write $$\mu(df) = S(f) df $$. In this case, one can determine $$ S(f)$$, the power spectral density of $$x(t)$$, by taking the averaged derivative of $$ F $$. Because the left and right derivatives of $$ F $$ exist everywhere, we can put $$ S(f) = \frac12 \left(\lim_{\varepsilon \downarrow 0} \frac1\varepsilon \big(F(f + \varepsilon) - F(f)\big) + \lim_{\varepsilon \uparrow 0} \frac1\varepsilon \big(F(f + \varepsilon) - F(f)\big)\right)$$ everywhere (obtaining that $$ F $$ is the integral of its averaged derivative), and the theorem simplifies to


 * $$ r_{xx} (\tau) = \int_{-\infty}^\infty e^{2\pi i\tau f} \, S(f)df. $$

If now one assumes that r and S satisfy the necessary conditions for Fourier inversion to be valid, the Wiener–Khinchin theorem takes the simple form of saying that r and S are a Fourier-transform pair, and


 * $$ S(f) = \int_{-\infty}^\infty r_{xx} (\tau) e^{-2\pi if\tau} \,d\tau. $$
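
The Fourier-pair form can be checked numerically on a simple case. The sketch below assumes an example autocorrelation $$ r_{xx}(\tau) = e^{-a|\tau|} $$, which is not taken from the text above; its Fourier transform in this convention is known to be $$ 2a/(a^2 + (2\pi f)^2) $$. The function names, the decay rate `a`, and the integration settings `t_max` and `n` are all illustrative choices.

```python
import math


def autocorr(tau, a=1.0):
    """Assumed example autocorrelation r(tau) = exp(-a * |tau|)."""
    return math.exp(-a * abs(tau))


def psd_closed_form(f, a=1.0):
    """Known Fourier transform of exp(-a * |tau|): 2a / (a^2 + (2 pi f)^2)."""
    return 2.0 * a / (a * a + (2.0 * math.pi * f) ** 2)


def psd_numeric(f, a=1.0, t_max=30.0, n=60_000):
    """Midpoint-rule approximation of S(f) = integral of r(tau) exp(-2 pi i f tau) d tau.

    The integral is truncated to [-t_max, t_max]. Because r is even, the
    imaginary part cancels and only the cosine term needs to be summed.
    """
    h = 2.0 * t_max / n
    total = 0.0
    for k in range(n):
        tau = -t_max + (k + 0.5) * h
        total += autocorr(tau, a) * math.cos(2.0 * math.pi * f * tau)
    return total * h


if __name__ == "__main__":
    for f in (0.0, 0.1, 0.5):
        print(f, psd_numeric(f), psd_closed_form(f))
```

The two columns should agree to several decimal places. The agreement depends on $$ r_{xx} $$ being absolutely integrable, which is exactly the extra assumption under which the simple form holds.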

Discrete-time process
For the discrete-time case, the power spectral density of the function with discrete values $$x_n$$ is


 * $$ S(\omega)=\frac{1}{2\pi} \sum_{k=-\infty}^\infty r_{xx}(k)e^{-i \omega k} $$

where $$\omega = 2 \pi f$$ is the angular frequency, $$i$$ is used to denote the imaginary unit (in engineering, sometimes the letter $$j$$ is used instead) and $$r_{xx}(k) $$ is the discrete autocorrelation function of $$x_n$$, defined in its deterministic or stochastic formulation.

Provided $$r_{xx}$$ is absolutely summable, i.e.


 * $$ \sum_{k=-\infty}^\infty |r_{xx}(k)| < +\infty $$

the result of the theorem then can be written as


 * $$ r_{xx}(\tau) = \int_{-\pi}^{\pi} e^{i \tau \omega} S(\omega) d\omega $$

Because $$x_n$$ is a discrete-time sequence, its spectral density is periodic in the frequency domain, with period $$ 2\pi $$. For this reason, the domain of the function $$ S $$ is usually restricted to $$ [-\pi, \pi) $$ (note that the interval is open on one side).
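
The discrete-time pair can be illustrated with a first-order moving-average process, a sketch under stated assumptions rather than part of the theorem. Assume $$ x_n = e_n + b\,e_{n-1} $$ with unit-variance white noise $$ e_n $$, so that $$ r_{xx}(0) = 1 + b^2 $$, $$ r_{xx}(\pm 1) = b $$, and all other lags vanish. The coefficient `B` and the function names below are hypothetical.

```python
import math

B = 0.6  # hypothetical MA(1) coefficient: x_n = e_n + B * e_{n-1}


def r_xx(k):
    """Autocorrelation of the assumed MA(1) process; zero beyond lag 1."""
    if k == 0:
        return 1.0 + B * B
    if abs(k) == 1:
        return B
    return 0.0


def spectral_density(omega, max_lag=1):
    """S(omega) = (1 / 2 pi) * sum_k r_xx(k) exp(-i omega k), up to max_lag.

    r_xx is even, so the two-sided sum reduces to a cosine series.
    """
    total = r_xx(0)
    for k in range(1, max_lag + 1):
        total += 2.0 * r_xx(k) * math.cos(omega * k)
    return total / (2.0 * math.pi)


def r_from_spectrum(lag, n=2000):
    """Midpoint-rule approximation of the integral over [-pi, pi] of
    exp(i * lag * omega) * S(omega); S is even, so only the cosine part survives."""
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        omega = -math.pi + (j + 0.5) * h
        total += math.cos(lag * omega) * spectral_density(omega)
    return total * h


if __name__ == "__main__":
    for lag in range(3):
        print(lag, r_xx(lag), r_from_spectrum(lag))
```

Integrating the spectral density against $$ e^{i\tau\omega} $$ recovers the autocorrelation at each lag, and the density itself stays non-negative, as a power spectral density must.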

Application
The theorem is useful for analyzing linear time-invariant systems (LTI systems) when the inputs and outputs are not square-integrable, so their Fourier transforms do not exist. A corollary is that the Fourier transform of the autocorrelation function of the output of an LTI system is equal to the product of the Fourier transform of the autocorrelation function of the input and the squared magnitude of the Fourier transform of the system impulse response. This works even when the Fourier transforms of the input and output signals do not exist because these signals are not square-integrable, so that the system inputs and outputs cannot be directly related by the Fourier transform of the impulse response.

Since the Fourier transform of the autocorrelation function of a signal is the power spectrum of the signal, this corollary is equivalent to saying that the power spectrum of the output is equal to the power spectrum of the input times the energy transfer function.

This corollary is used in the parametric method for power spectrum estimation.
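
The corollary can be checked on a small discrete-time case. The sketch below assumes white-noise input of variance `SIGMA2` passed through a short FIR filter with real taps `H_TAPS`; both values, and all function names, are hypothetical. For such an input the output autocorrelation is $$ \sigma^2 \sum_n h_n h_{n+|k|} $$, and its spectral density should equal the input spectral density times $$ |H(\omega)|^2 $$.

```python
import cmath
import math

SIGMA2 = 2.0                # hypothetical input (white-noise) variance
H_TAPS = [0.5, 0.3, -0.2]   # hypothetical real FIR impulse response


def input_psd(omega):
    """White noise has r_xx(k) = SIGMA2 at k = 0 only, so S_xx is flat."""
    return SIGMA2 / (2.0 * math.pi)


def frequency_response(omega):
    """H(omega) = sum_n h[n] exp(-i omega n)."""
    return sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(H_TAPS))


def output_autocorr(k):
    """r_yy(k) = SIGMA2 * sum_n h[n] h[n + |k|] for white-noise input."""
    k = abs(k)
    return SIGMA2 * sum(H_TAPS[n] * H_TAPS[n + k] for n in range(len(H_TAPS) - k))


def output_psd(omega):
    """S_yy(omega) = (1 / 2 pi) * sum_k r_yy(k) exp(-i omega k)."""
    max_lag = len(H_TAPS) - 1
    total = sum(
        output_autocorr(k) * cmath.exp(-1j * omega * k)
        for k in range(-max_lag, max_lag + 1)
    )
    return total.real / (2.0 * math.pi)


if __name__ == "__main__":
    for omega in (0.0, 1.0, 2.5):
        lhs = output_psd(omega)
        rhs = input_psd(omega) * abs(frequency_response(omega)) ** 2
        print(omega, lhs, rhs)
```

The two printed columns agree up to rounding. Only autocorrelations and the filter taps are used; no Fourier transform of the input or output signal itself is ever taken.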

Discrepancies in terminology
In many textbooks and in much of the technical literature, it is tacitly assumed that Fourier inversion of the autocorrelation function and the power spectral density is valid, and the Wiener–Khinchin theorem is stated, very simply, as if it said that the Fourier transform of the autocorrelation function is equal to the power spectral density, ignoring all questions of convergence (similar to Einstein's paper). But the theorem (as stated here) was applied by Norbert Wiener and Aleksandr Khinchin to the sample functions (signals) of wide-sense-stationary random processes, signals whose Fourier transforms do not exist. Wiener's contribution was to make sense of the spectral decomposition of the autocorrelation function of a sample function of a wide-sense-stationary random process even when the integrals for the Fourier transform and Fourier inversion do not make sense.

Further complicating the issue is that the discrete Fourier transform always exists for digital, finite-length sequences, meaning that the theorem can be blindly applied to calculate autocorrelations of numerical sequences. However, the relation of such discrete sampled data to a mathematical model is often misleading, and related errors can show up as a divergence when the sequence length is modified.
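
One well-known pitfall of this kind can be shown on a toy sequence; it is offered as an illustration and is not necessarily the specific error meant above. The inverse DFT of the squared DFT magnitude yields the circular autocorrelation, which treats the sequence as periodic, and it matches the ordinary (linear) lagged sums only after zero-padding. The sketch uses a plain O(N²) DFT from the standard library; the function names and the sample sequence are illustrative, and the sums are raw lagged products, not normalized estimates of an expected value.

```python
import cmath


def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Direct O(N^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def autocorr_via_dft(x, pad=0):
    """Inverse DFT of |DFT(x)|^2, optionally after appending `pad` zeros."""
    padded = list(x) + [0.0] * pad
    power = [abs(c) ** 2 for c in dft(padded)]
    return [c.real for c in idft(power)]


def linear_autocorr(x):
    """Direct lagged sums: sum_t x[t] * x[t + k] for k = 0 .. N-1."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(n)]


x = [1.0, 2.0, 3.0, 4.0]
circular = autocorr_via_dft(x)                     # wraps around the ends
padded = autocorr_via_dft(x, pad=len(x))[:len(x)]  # zero-padding removes the wrap
direct = linear_autocorr(x)

if __name__ == "__main__":
    print(circular)
    print(padded)
    print(direct)
```

For this sequence the unpadded result is approximately 30, 24, 22, 24, while the padded and direct results are both approximately 30, 20, 11, 4. The unpadded values change with the sequence length because the wrap-around terms do.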

Some authors refer to $$ r_{xx} $$ as the autocovariance function. They then normalize it by dividing by $$ r_{xx}(0) $$ to obtain what they refer to as the autocorrelation function.