User:Maraun/Sandbox

Continuous wavelet spectral analysis is a statistical framework of time series analysis based on the continuous wavelet transform to analyse the variability of an underlying nonstationary process with respect to time and scale (i.e. periods). It can be interpreted as a time dependent generalization of the time independent Fourier spectral analysis of stationary processes and an optimization of the short-time Fourier analysis based on the Gabor transformation.

Continuous Wavelet Transformation
The continuous wavelet transformation $$W_g(b,a)$$ at time $$b$$ and scale $$a$$ of a time series $$s(t)$$ using the wavelet $$g(t)$$ (sometimes referred to as mother wavelet) is given as the convolution of the series with the dilated and shifted wavelet:


 * $$W_gs(t)[b,a]=\int dt \, \frac{1}{\sqrt{a}} \, \bar{g}\Bigg( \frac{t-b}{a} \Bigg) s(t)$$

Covariances
The covariances of the wavelet transformation $$W_gs[t](b,a)$$ of a signal $$s(t)$$ with respect to the wavelet $$g(t)$$ corresponding to a translation $$t-b'$$ and dilation $$t/a'$$ are given as follows:


 * $$W_gs[t-b'](b,a)=W_gs[t](b-b',a)$$


 * $$W_gs[t/a'](b,a)=W_gs[t](b/a',a/a')$$

Here, the brackets $$[.]$$ denote dependencies of a variable, whereas $$(.)$$ denote dependencies of the resulting transformation. The invariant sets of the wavelet transformation are thus stripes of constant scale for translation and cones with edge in the origin for dilation.

Choice of the wavelet
As one is interested in a time and scale (i.e. 1/frequency) resolved analysis, one should apply wavelets that are sufficiently localized in time and scale. This holds, for instance, for the Morlet or the Cauchy wavelet, but not for the Haar wavelet. Because the latter one is very well localized in the time domain, it oscillates over a wide range of frequencies and hence will not provide a good frequency resolution. Furthermore, as one is interested in variances (or power) of an oscillation, one needs to calculate the oscillations amplitude (the power is the squared amplitude). Thus, the chosen wavelet needs to be complex to allow for the calculation of the amplitude as the modulus of the real and the imaginary part. Consequently, for instance, the real Mexican hat wavelet cannot be applied for continuous wavelet spectral analysis.

Inverse Transformation
By use of a reconstruction wavelet $$h(t)$$ (which is related to the wavelet $$g(t)$$ by a simple relation), one can reconstruct the time series $$s(t)$$ from its wavelet transformation $$r(b,a)=W_gs[t](b,a)$$:


 * $$M_hr(b,a)[t]=\int\limits_{H} \frac{db \,da}{a^{2}} \, r(b,a) \, \frac{1}{\sqrt{a}} \, h\Bigg( \frac{t-b}{a} \Bigg)$$

Reproducing Kernel
Since the wavelet transformation maps a one dimensional signal onto the two dimensional wavelet space, it produces redundancies. This leads to two important consequences:


 * For an arbitrary function $$r(b,a)$$ in the wavelet domain, the inverse transformation does not produce a time series, that has exactly the wavelet transformation $$r(b,a)$$, but its projection onto the subspace of all wavelet transformations.
 * The redundancies are equivalent to intrinsic correlations of any wavelet transformation (in fact: any time frequency analysis). They constitute an uncertainty relation and are given by the reproducing kernel of the used wavelet. The reproducing kernel is given as the wavelet transform of the reconstruction wavelet: $$K_{g,h}\!\left((b-b')/a',a/a'\right)=W_gh((b-b')/a')$$

Continuous Wavelet Synthesis
Before talking about wavelet spectral analysis, one has to define the measures to be analysed, i.e. one has to define the direct problem of wavelet synthesis.

A wide class of nonstationary Gaussian processes can be characterized by wavelet multipliers $$m(b,a)$$, functions in the wavelet domain. Realizations of such processes can be simulated by


 * $$s(t)=M_h m(b,a) W_g \eta(t)$$

where $$\eta(t)$$ is Gaussian white noise.

As $$m(b,a)$$ describes the time evolution of the variability on scales $$a$$, it should vary slowly compared to the oscillations of the process on the corresponding scale:


 * $$|\partial_a m(b,a)| < O(a)$$


 * $$|\partial_b m(b,a)| < O(a)$$

Based on this concept, one can define the following spectral measures, which are transferred from Fourier analysis.

Wavelet Spectrum
Given a process defined by $$m(b,a)$$, one can define a wavelet spectrum


 * $$S(b,a)=|m(b,a)|^2$$,

that describes the variance of the process at time $$b$$ on scale $$a$$.

Wavelet Cross Spectrum
Given two processes, which share covarying parts $$m_{1c}(a,b)$$ and $$m_{2c}$$, one can define a cross spectrum:


 * $$CS(b,a)=m_{1c}(b,a)m_{2c}(b,a)^*$$

The cross spectrum describes the covariance of the processes at time $$b$$ on scale $$a$$. This quantity is in general complex and can be decomposed into amplitude and phase:


 * $$CS(b,a)=|CS(b,a)| \exp(i \arg(CS(b,a)))$$

The single spectra can be a superposition of the covarying parts and independently varying parts $$m_{1i}(b,a)$$ and $$m_{2i}(b,a)$$. Realizations of such two processes are generated by the same driving noise for the covarying parts and independent noise for the independent parts:


 * $$s_1(t)=M_hm_{1c}[b,a]W_g\eta_c(t)$$ and


 * $$s_2(t)=M_hm_2[b,a]W_g\eta_c(t)$$

Wavelet Coherence
The normalization of the modulus of the wavelet cross spectrum yields the squared wavelet coherence or coherency, a measure for the linear relationship of the two processes at time $$b$$ on a scale $$a$$:


 * $$COH^2(b,a)=\frac{|CS(b,a)|^2}{S_1(b,a)S_2(b,a)}=\frac{|m_{1c}(b,a)\bar{m}_{2c}(b,a)|^2}{|m_1(b,a)|^2\,|m_2(b,a)|^2},$$

with $$m_1(b,a)=m_{1c}(b,a)+m_{1i}(b,a)$$ and $$m_2(b,a)=m_{2c}(b,a)+m_{2i}(b,a)$$. Values of $$COH(b,a)$$ range from zero for uncorrelated processes two 1 for a perfect linear relationship.

Wavelet Spectral Analysis
Given the realization of a nonstationary Gaussian process, one can estimate the above defined measures to characterize the process variability. This, together with the significance testing against a background spectrum, is referred to as an inverse problem.

Wavelet Sample Spectrum
Given a realization $$s(t)$$ of a nonstationary process, one can estimate its spectrum (i.e. calculate the wavelet sample spectrum) using a wavelet $$g(t)$$ by


 * $$\hat{S}(b,a)=A(|W_gs(t)|^2),$$

where $$A(.)$$ denotes an averaging operator. Following the terminology of Fourier analysis, the wavelet sample spectrum without averaging is either called scalogram or wavelet periodogram. Estimators of a process property are marked by a hat ($$\widehat{\,.\,}$$).

Wavelet Sample Cross Spectrum
Given realizations $$s_1(t)$$ and $$s_2(t)$$ of two processes, the cross spectrum can be estimated as


 * $$\widehat{CS}(b,a)=A(W_gs_1(t)\bar{W}_gs_2(t))$$,

or decomposed into amplitude and phase,


 * $$\widehat{CS}(b,a)=|\widehat{CS}(b,a)|\exp(i \arg(\widehat{CS}(b,a)))$$.

Wavelet Sample Coherence
The squared coherence is estimated as


 * $$\widehat{COH}(b,a)=\frac{|\widehat{CS}(b,a)|^2}{\hat{S}_1(b,a)\hat{S}_2(b,a)}$$.

For coherence, averaging is essential. Otherwise, one investigates power in a single point in time and scale, i.e. one attempts to infer covarying oscillations without observing the oscillations over a certain interval. Consequently, nominator and denominator become equal and one obtains a trivial value of one for any two processes.

Distribution of the Wavelet sample spectrum
$$|W_gs(t)|^2$$ and also the wavelet cross scalogram $$W_gs_1(t)\bar{W}_gs_2(t)$$ obey a $$\chi^2$$-distribution with two degrees of freedom. The variance equals two times the corresponding mean, $$Var=2\langle|W_gs(t)|^2\rangle$$ and produces a lot of fluctuations in the estimate. Reducing the variance by kernel averaging with the operator $$A(.)$$ produces a more complicated distribution that is not accessible analytically. This occurs due to the inherent correlations given by the reproducing kernel.

Variance of the Wavelet Sample Spectrum
As discussed in the previous section, the wavelet scalogram has got a high variance rendering the interpretation quite difficult (is some structure due to instationarities or only due to fluctuations of the estimate?). Thus, one needs to smooth the estimator. This should be done in a way, that preserves a scale independent variance and is accomplished by averaging the same amount of independent information on every scale, i.e. by choosing the length of the averaging kernel according to the reproducing kernel. This means:


 * Averaging in scale direction should be done with a window exhibiting constant length for logarithmic scales.
 * Averaging in time direction should be done with a window exhibiting a length proportional to scale.

Bias of the Wavelet Sample Spectrum
Given realizations of a Gaussian process defined by $$m(b,a)$$ and constructed with the wavelet pair $$g(t)$$ and $$h(t)$$, one can estimate the wavelet sample spectrum using a wavelet $$k(t)$$ and an averaging operator $$A(.)$$. The bias at scale $$a$$ and time $$b$$ of the wavelet sample spectrum then reads


 * $$Bias(\hat{S}(b,a))=\langle\, A(\,|W_kM_h\,m(b,a)\,W_g\eta(t)\,|^2\,)\,\rangle-\,|m(b,a)|^2,$$

The bias consists of two contributions: The averaging with the operator $$A(.)$$ produces an averaging bias of the smoothed wavelet sample spectrum in comparison to the wavelet periodogram. Furthermore, not even the wavelet periodogram is an unbiased estimator: The uncertainty relation due to internal correlations results in an intrinsic bias of the wavelet periodogram in relation to the underlying spectrum $$m(b,a)$$. Both, the averaging bias and the inherent bias cause that for finite scales the wavelet sample spectrum is not a consistent estimator even in the limit of an infinite number of realizations.

Global Wavelet Spectrum
Given a time series $$s(t)$$ of a stationary process with a Fourier periodogram $$|f(\omega)|^2=|\hat{s}(\omega)|^2$$, the wavelet transformation of $$s(t)$$ in the Fourier domain reads:


 * $$W_g(b,a)=\frac{\sqrt{a}}{2\pi}\int_{-\infty}^{\infty} \bar{\hat{g}}(a\omega)e^{ib\omega}f(\omega) d\omega$$

Here, $$\hat{.}$$ denotes the Fourier transform. The essential message of this equation is the following: A (sample) spectrum $$|f(\omega)|^2$$ with a well localized peak (e.g. the spectrum of a sine function) always gets smeared out to the width of the wavelet $$\bar{\hat{g}}(a\omega)$$ in Fourier domain. The better the wavelet is localized in the frequency domain, the lower is the broadening effect.

The global wavelet sample spectrum $$\hat{G}_g(a)$$ of a signal $$s(t)$$ with respect to the wavelet $$g(t)$$ is defined as the integral of the wavelet sample spectrum $$\hat{S}(b,a)$$ over time


 * $$\hat{G}_g(a)=\int_{-\infty}^{\infty}\hat{S}(b,a) db$$

Following the first equation in this paragraph, the global wavelet spectrum always smears out peaks to the width of the wavelet in frequency domain on the particular scale, $$\hat{g}(a\omega)$$. Thus, the global wavelet spectrum is in general not an unbiased estimator of the Fourier spectrum.

Significance Testing
Given a wavelet sample spectrum, one is often interested, if the estimate is compatible with some trivial background noise, or if it significantly deviates and thus represents some interesting features. This desire leads to the concept of statistical significance testing. In addition to the type I and type II errors, the quality of a significance test can be measured by its sensitivity and specificity. A test is sensitive, if it has got the capability of rejecting the Null hypothesis in case it is wrong. A test is specific, if it only rejects the Null hypothesis in case it is wrong.

Pointwise
A first step in the significance testing procedure is to compare every point of the estimated spectrum to a trivial background spectrum. This is done by a bootstrap-algorithm:


 * 1) Choose a significance level $$\alpha$$.
 * 2) Choose a reasonable model (e.g. an AR[1] process in case of climate data as null hypothesis $$H_0$$ and fit it to the data.
 * 3) Estimate the $$(1-\alpha)$$-quantile $$S_{\mathit{crit}}$$ (i.e. the critical value) of the corresponding background spectrum by Monte Carlo simulations.  Depending on the chosen background model and the chosen normalization of the spectral estimator, the critical value in general depends on scale.
 * 4) Check for every point in the wavelet domain, whether the estimated spectrum exceeds the corresponding critical value.

The set of all pointwise significant wavelet spectral coefficients is given as


 * $$P_{pw}=\{\,(b,a)\,|\,\hat{S}(b,a)>S_{\mathit{crit}}\,\}$$

This pointwise test, however, is very unspecific due to the effect of multiple testing and the intrinsic correlations. Assume the following case: The wavelet scalogram of a realization of a Gaussian white noise process is calculated and tested against a red noise background spectrum on the 5% level. By construction, the Null hypothesis is always true (white noise is a special case of red noise) and no significant results should appear. However, due to multiple testing, on average 5% of the wavelet plot (areas on lower scales count more than on higher scales) appear - this is a type I error - as spuriously significant. Because of the intrinsic correlations given by the reproducing kernel, these results occur as contiguous patches mimicing structure which is just an artefact of the time/frequency analysis.

Areawise
To test, if the results from the pointwise significance test remain significant when considering the internal correlations, an areawise significance test has been developed. This test examines if a pointwise significant patch is large compared to the reproducing kernel and thus unlikely to be an artefact of intrinsic correlations.

Given the set of all patches with pointwise significant values, $$P_{pw}$$, areawise significant patches are defined in the following way: For every $$(a,b)$$, a critical area $$P_{\mathit{crit}}(b,a)$$ is chosen. It is given as the subset of the time/scale domain, where the reproducing kernel, dilated and translated to $$(b,a)$$, exceeds the threshold of a certain critical level $$K_{\mathit{crit}}$$:


 * $$P_{\mathit{crit}}(b,a)=\{\,(b',a')\,|\,(K(b,a;b',a')>K_{\mathit{crit}}\,\}$$

Then the subset of additionally areawise significant wavelet spectral coefficients is given as the union of all critical areas that completely lie inside the patches of pointwise significant values:


 * $$P_{\mathit{aw}}\,=\,\bigcup_{P_{\mathit{crit}}(b,a)\,\subset\,P_{pw}}P_{\mathit{crit}}(b,a)$$.

In other words: Given a patch of pointwise significant values, a point inside this patch is areawise significant, if any reproducing kernel (dilated according to the investigated scale) containing this point totally fits into the patch. Consequently, small as well as long but thin patches are sorted out, because they are indistinguishable from background noise.

The larger the critical area, the larger a patch needs to be to be detected by the test, i.e. the critical area is related to the significance level $$\alpha_{aw}$$ of the areawise test.

The areas $$A_{pw}$$ and $$A_{aw}$$ corresponding to the sets $$P_{pw}$$ and $$P_{aw}$$ are given as


 * $$A_{pw}=\int_{P_{pw}}\frac{db \,da}{a^2}$$


 * $$A_{aw}=\int_{P_{aw}}\frac{db \,da}{a^2}$$

(Note, that on every scale $$a$$, the area is related to the corresponding measure $$a^2$$, i.e. patches on low scales are weighted higher)

The significance level of the areawise test is then defined as


 * $$\alpha_{aw}=\langle \frac{A_{aw}}{A_{pw}} \rangle$$

i.e. one minus the average ratio between the areas of areawise significant patches and pointwise significant patches. It is calculated by stochastic approximation and is almost independent of the "redness" of the background spectrum.

Significance Testing the Wavelet Cross Spectrum
Such as for the stationary Fourier cross spectrum and the covariance (its time domain counterpart), no significance test for the wavelet cross spectrum exists. Assume two processes exhibiting independent power at overlapping time and scale intervals. This power does not covary, i.e. information about one of the processes is not capable of predicting the other one. Hence, the real wavelet cross spectrum is zero. By contrast, the estimated wavelet cross spectrum always differs from zero. As it is not a normalized measure, it is impossible to decide whether a cross spectral coefficient is large because the one or the other process exhibits strong power or if actually covarying power does exist. To overcome this problem, one normalizes the cross spectrum and tests against zero coherence.

Significance Testing Wavelet Coherence
Also for the wavelet coherence, a pointwise and an areawise significance test can be performed to test for coherent oscillations on a certain scale for a certain time interval. The procedure is similar to that for the wavelet sample spectrum. The main difference is that the background coherence spectrum is almost independent from the processes under investigation (because it is normalized to the single spectra).

However, coherent oscillations do not necessarily imply coherence in a the sense of a physical coupling between the processes. Processes oscillating on similar frequencies trivially exhibit patches indicating an intermittently similar phase evolution. The lengths of these patches are given by the decorrelation times of the single processes and the similarity of the concerned frequencies.

If one is not only interested in deriving significant common oscillations, but also significant coherence in the sense of coupling between the processes, the areawise test has to be succeeded by another step: Using a bootstrapping approach individual for every setting it has to be tested, if the time interval of the common oscillations is significantly long compared to typical randomly common oscillations of independent processes.