Directional component analysis

Directional component analysis (DCA) is a statistical method used in climate science for identifying representative patterns of variability in space-time datasets such as historical climate observations, weather prediction ensembles or climate ensembles.

The first DCA pattern is a pattern of weather or climate variability that is both likely to occur (measured using likelihood) and has a large impact (for a specified linear impact function, and given certain mathematical conditions: see below).

The first DCA pattern contrasts with the first PCA pattern, which is likely to occur, but may not have a large impact, and with a pattern derived from the gradient of the impact function, which has a large impact, but may not be likely to occur.

DCA differs from other pattern identification methods used in climate research, such as EOFs, rotated EOFs and extended EOFs, in that it takes into account an external vector, the gradient of the impact.

DCA provides a way to reduce large ensembles from weather forecasts or climate models to just two patterns. The first pattern is the ensemble mean, and the second pattern is the DCA pattern, which represents variability around the ensemble mean in a way that takes impact into account. DCA contrasts with other methods that have been proposed for the reduction of ensembles in that it takes impact into account in addition to the structure of the ensemble.
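
The reduction of an ensemble to two patterns can be illustrated with a minimal NumPy sketch. The function name `reduce_ensemble`, the array layout (members in rows, locations in columns) and the synthetic ensemble are assumptions made for illustration; they do not come from any particular forecasting system.

```python
import numpy as np

def reduce_ensemble(ensemble, r):
    """Reduce an ensemble (members x locations) to two patterns.

    Returns the ensemble mean and a unit-length first DCA pattern
    computed from the anomalies around that mean.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    r = np.asarray(r, dtype=float)
    mean = ensemble.mean(axis=0)                   # first pattern: ensemble mean
    anomalies = ensemble - mean                    # variability around the mean
    C = np.atleast_2d(np.cov(anomalies, rowvar=False))  # sample covariance
    pattern = C @ r                                # DCA direction, proportional to C r
    return mean, pattern / np.linalg.norm(pattern)

# Synthetic 50-member ensemble at 4 locations (illustrative data only).
rng = np.random.default_rng(0)
ensemble = rng.normal(size=(50, 4)) + np.array([1.0, 2.0, 3.0, 4.0])
r = np.ones(4)                                     # impact = sum over locations
ens_mean, dca1 = reduce_ensemble(ensemble, r)
```

Here the second pattern is returned with unit length; in practice it would be rescaled to whatever amplitude is of interest before being added to the ensemble mean.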

Inputs
DCA is calculated from two inputs:
 * a multivariate dataset of weather or climate data, such as historical climate observations, or a weather or climate ensemble
 * a linear impact function. The linear impact function is a function which defines a level of impact for every spatial pattern in the weather or climate data as a weighted sum of the values at different locations in the spatial pattern. An example is the mean value across the spatial pattern. The linear impact function can be generated as the first-order (linear) term in the multivariate Taylor series of a non-linear impact function.
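
As an illustration of the second input, the sketch below builds the weights for the mean-value impact function and linearises a non-linear impact function by taking its gradient numerically. The helper names `linear_impact` and `gradient_weights`, and the quadratic impact function, are hypothetical choices made for this example.

```python
import numpy as np

def linear_impact(r, x):
    """Linear impact: weighted sum of the values in spatial pattern x."""
    return float(np.dot(r, x))

def gradient_weights(impact_fn, x0, eps=1e-6):
    """Weights from the linear Taylor term of impact_fn at x0,
    estimated with central finite differences."""
    x0 = np.asarray(x0, dtype=float)
    grad = np.zeros_like(x0)
    for i in range(x0.size):
        step = np.zeros_like(x0)
        step[i] = eps
        grad[i] = (impact_fn(x0 + step) - impact_fn(x0 - step)) / (2 * eps)
    return grad

# Mean value across the pattern: equal weights 1/n at every location.
n = 4
r_mean = np.full(n, 1.0 / n)
x = np.array([1.0, 2.0, 3.0, 6.0])
mean_impact = linear_impact(r_mean, x)

# Hypothetical non-linear impact, linearised around a reference pattern.
def quadratic_impact(pattern):
    return float(np.sum(np.asarray(pattern) ** 2))

r_lin = gradient_weights(quadratic_impact, np.array([1.0, 0.0, 2.0, 0.5]))
```

For the quadratic function used here the gradient is twice the reference pattern, so the weights depend on where the linearisation is performed.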

Formula
Consider a space-time dataset $$X$$, containing individual spatial pattern vectors $$x$$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $$C$$.

We define a linear impact function of a spatial pattern as $$r^tx$$, where $$r$$ is a vector of spatial weights.

The first DCA pattern is given in terms of the covariance matrix $$C$$ and the weights $$r$$ by the proportional expression $$x \propto Cr$$.

The pattern can then be normalized to any length as required.
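
A small worked example of the formula, using an arbitrary 2×2 covariance matrix chosen for illustration, is sketched below. The function name `dca1` and its `length` parameter are assumptions of this sketch.

```python
import numpy as np

def dca1(C, r, length=1.0):
    """First DCA pattern, proportional to C r, rescaled to the given length."""
    pattern = np.asarray(C, dtype=float) @ np.asarray(r, dtype=float)
    return length * pattern / np.linalg.norm(pattern)

# Illustrative covariance matrix and equal spatial weights.
C = np.array([[2.0, 1.0],
              [1.0, 1.0]])
r = np.array([1.0, 1.0])

# C r = [3, 2], so the unit-length pattern is [3, 2] / sqrt(13).
pattern = dca1(C, r)
```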

Properties
If the weather or climate data is elliptically distributed (e.g., is distributed as a multivariate normal distribution or a multivariate t-distribution) then the first DCA pattern (DCA1) is defined as the spatial pattern with the following mathematical properties:
 * DCA1 maximises probability density for a given value of impact
 * DCA1 maximises impact for a given value of probability density
 * DCA1 maximises the product of impact and probability density
 * DCA1 is the conditional expectation, conditional on exceeding a certain level of impact
 * DCA1 is the impact-weighted ensemble mean
 * Any modification of DCA1 will lead to a pattern that is either less extreme, or has a lower probability density.
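
Two of these properties can be checked numerically for the multivariate normal case. The following Monte Carlo sketch uses an arbitrary covariance matrix, sample size and impact threshold, all chosen for illustration, and compares the direction of $$Cr$$ with the conditional mean above the threshold and with the impact-weighted mean.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative covariance matrix (positive definite) and equal weights.
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 2.0, 0.3],
              [0.2, 0.3, 1.5]])
r = np.ones(3)

# Zero-mean multivariate normal samples and their linear impacts.
samples = rng.multivariate_normal(np.zeros(3), C, size=200_000)
impact = samples @ r

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

dca1 = C @ r                                   # DCA1 direction

# Mean of samples whose impact exceeds one standard deviation of impact.
threshold = np.sqrt(r @ C @ r)
conditional_mean = samples[impact > threshold].mean(axis=0)

# Mean of samples weighted by their impact.
weighted_mean = (impact[:, None] * samples).mean(axis=0)

cos_conditional = cosine(conditional_mean, dca1)
cos_weighted = cosine(weighted_mean, dca1)
```

Both cosines should be close to one, indicating that the two sample means point in the DCA1 direction up to sampling noise.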

Rainfall Example
For instance, in a rainfall anomaly dataset, using an impact metric defined as the total rainfall anomaly, the first DCA pattern is the spatial pattern that has the highest probability density for a given total rainfall anomaly. If the given total rainfall anomaly is chosen to have a large value, then this pattern combines being extreme in terms of the metric (i.e., representing large amounts of total rainfall) with being likely in terms of the pattern, and so is well suited as a representative extreme pattern.

Comparison with PCA
The main differences between principal component analysis (PCA) and DCA are:
 * PCA is a function of just the covariance matrix, and the first PCA pattern is defined so as to maximise explained variance
 * DCA is a function of the covariance matrix and a vector direction (the gradient of the impact function), and the first DCA pattern is defined so as to maximise probability density for a given value of the impact metric

As a result, for unit vector spatial patterns:
 * The first PCA spatial pattern always has higher explained variance than the first DCA pattern, but a lower value of the impact metric (e.g., the total rainfall anomaly), except in degenerate cases
 * The first DCA spatial pattern always has a higher value of the impact metric than the first PCA pattern, but lower explained variance, except in degenerate cases

The degenerate cases occur when the PCA and DCA patterns are equal.

Also, given the first PCA pattern, the DCA pattern can be scaled so that:
 * The scaled DCA pattern has the same probability density as the first PCA pattern, but higher impact, or
 * The scaled DCA pattern has the same impact as the first PCA pattern, but higher probability density.
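
The comparison can be made concrete with a short NumPy sketch. The covariance matrix and weights are arbitrary illustrative values, the helper names are hypothetical, and the squared Mahalanobis distance $$x^t C^{-1} x$$ is used as a stand-in for probability density (a smaller distance means a higher density under a zero-mean multivariate normal).

```python
import numpy as np

C = np.array([[1.0, 0.5, 0.2],
              [0.5, 2.0, 0.3],
              [0.2, 0.3, 1.5]])
r = np.ones(3)
C_inv = np.linalg.inv(C)

# First PCA pattern: eigenvector of C with the largest eigenvalue
# (eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(C)
pca1 = eigvecs[:, -1]
if r @ pca1 < 0:            # orient towards positive impact
    pca1 = -pca1

# First DCA pattern as a unit vector.
dca1 = C @ r / np.linalg.norm(C @ r)

def explained_variance(v):
    return float(v @ C @ v)

def impact(v):
    return float(r @ v)

def mahalanobis_sq(v):
    return float(v @ C_inv @ v)

# DCA scaled to the same probability density as the unit PCA pattern.
same_density = dca1 * np.sqrt(mahalanobis_sq(pca1) / mahalanobis_sq(dca1))

# DCA scaled to the same impact as the unit PCA pattern.
same_impact = dca1 * impact(pca1) / impact(dca1)
```

Comparing `explained_variance` and `impact` for `pca1` and `dca1`, and `impact` and `mahalanobis_sq` for the two scaled patterns, reproduces the orderings listed above for this particular covariance matrix.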

Two Dimensional Example
Figure 1 gives an example, which can be understood as follows:
 * The two axes represent anomalies of annual mean rainfall at two locations, with the highest total rainfall anomaly values towards the top right corner of the diagram
 * The joint variability of the rainfall anomalies at the two locations is assumed to follow a bivariate normal distribution
 * The ellipse shows a single contour of probability density from this bivariate normal, with higher values inside the ellipse
 * The red dot at the centre of the ellipse shows zero rainfall anomalies at both locations
 * The blue parallel-line arrow shows the principal axis of the ellipse, which is also the first PCA spatial pattern vector
 * In this case, the PCA pattern is scaled so that it touches the ellipse
 * The diagonal straight line shows a line of constant positive total rainfall anomaly, assumed to be at some fairly extreme level
 * The red dotted-line arrow shows the first DCA pattern, which points towards the point at which the diagonal line is tangent to the ellipse
 * In this case, the DCA pattern is scaled so that it touches the ellipse

From this diagram, the DCA pattern can be seen to possess the following properties:
 * Of all the points on the diagonal line, it is the one with the highest probability density
 * Of all the points on the ellipse, it is the one with the highest total rainfall anomaly
 * It has the same probability density as the PCA pattern, but represents higher total rainfall (i.e., points further towards the top right-hand corner of the diagram)
 * Any change of the DCA pattern will reduce either the probability density (if it moves out of the ellipse) or reduce the total rainfall anomaly (if it moves along or into the ellipse)

In this case the total rainfall anomaly of the PCA pattern is quite small, because of anticorrelations between the rainfall anomalies at the two locations. As a result, the first PCA pattern is not a good representative example of a pattern with large total rainfall anomaly, while the first DCA pattern is.

In $$n$$ dimensions the ellipse becomes an ellipsoid, the diagonal line becomes an $$n-1$$ dimensional plane, and the PCA and DCA patterns are vectors in $$n$$ dimensions.
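
The geometry of the two-dimensional example can be reproduced numerically. The sketch below uses an assumed covariance matrix with anticorrelated rainfall anomalies (these values are illustrative and are not taken from the figure) and takes the ellipse to be the contour $$x^t C^{-1} x = 1$$.

```python
import numpy as np

# Assumed covariance of rainfall anomalies at two anticorrelated locations.
C = np.array([[1.0, -0.6],
              [-0.6, 1.5]])
r = np.array([1.0, 1.0])          # impact = total rainfall anomaly
C_inv = np.linalg.inv(C)

# PCA pattern: principal axis of the ellipse, scaled to touch it.
eigvals, eigvecs = np.linalg.eigh(C)
pca = eigvecs[:, -1]
if r @ pca < 0:                    # orient towards positive total anomaly
    pca = -pca
pca_on_ellipse = np.sqrt(eigvals[-1]) * pca

# DCA pattern, scaled to touch the same ellipse.
dca_on_ellipse = C @ r / np.sqrt(r @ C @ r)

# Points around the ellipse, via the Cholesky factor of C.
theta = np.linspace(0.0, 2.0 * np.pi, 3601)
L = np.linalg.cholesky(C)
ellipse = (L @ np.vstack([np.cos(theta), np.sin(theta)])).T
totals = ellipse @ r               # total rainfall anomaly at each point

pca_total = float(r @ pca_on_ellipse)
dca_total = float(r @ dca_on_ellipse)
```

With these values the DCA point has a total anomaly of $$\sqrt{r^t C r} = \sqrt{1.3}$$, which is the largest total found anywhere on the ellipse, while the PCA point has a much smaller total because the two components of the principal axis have opposite signs.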

Application to Climate Variability
DCA has been applied to the CRU dataset of historical rainfall variability in order to understand the most likely patterns of rainfall extremes in the US and China.

Application to Ensemble Weather Forecasts
DCA has been applied to ECMWF medium-range weather forecast ensembles in order to identify the most likely patterns of extreme temperatures in the ensemble forecast.

Application to Ensemble Climate Model Projections
DCA has been applied to ensemble climate model projections in order to identify the most likely patterns of extreme future rainfall.

Derivation of the First DCA Pattern
Consider a space-time dataset $$X$$, containing individual spatial pattern vectors $$x$$, where the individual patterns are each considered as single samples from a multivariate normal distribution with mean zero and covariance matrix $$C$$.

As a function of $$x$$, the log probability density is, up to an additive constant, proportional to $$-x^t C^{-1} x$$.

We define a linear impact function of a spatial pattern as $$r^tx$$, where $$r$$ is a vector of spatial weights.

We then seek to find the spatial pattern that maximises the probability density for a given value of the linear impact function. This is equivalent to finding the spatial pattern that maximises the log probability density for a given value of the linear impact function, which is slightly easier to solve.

This is a constrained maximisation problem, and can be solved using the method of Lagrange multipliers.

The Lagrangian function is given by

$$L(x,\lambda)=-x^t C^{-1}x-\lambda(r^tx-1)$$

Differentiating with respect to $$x$$ and setting the result to zero gives $$-2C^{-1}x-\lambda r=0$$ (using the symmetry of $$C$$), which has the solution

$$x \propto Cr$$

Normalising so that $$x$$ is a unit vector gives

$$x = Cr / (r^tCCr)^{1/2}$$

This is the first DCA pattern.
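
The derivation can be checked numerically. The sketch below uses an arbitrary illustrative covariance matrix and verifies three things: that $$x = Cr/(r^tCr)$$ satisfies the constraint $$r^tx=1$$; that it is a stationary point of the Lagrangian, with the multiplier value $$\lambda=-2/(r^tCr)$$ obtained by substituting this $$x$$ into the stationarity condition; and that random moves within the constraint plane never increase the log probability density.

```python
import numpy as np

C = np.array([[1.0, 0.5, 0.2],
              [0.5, 2.0, 0.3],
              [0.2, 0.3, 1.5]])
r = np.array([1.0, 1.0, 1.0])
C_inv = np.linalg.inv(C)

# Candidate solution scaled so that the impact r^t x equals 1.
x_star = C @ r / (r @ C @ r)

# Gradient of L(x, lambda) = -x^t C^{-1} x - lambda (r^t x - 1) at x_star.
lam = -2.0 / (r @ C @ r)
gradient = -2.0 * C_inv @ x_star - lam * r

# Random perturbations that keep the impact fixed at 1.
rng = np.random.default_rng(1)
never_better = []
for _ in range(1000):
    d = rng.normal(size=3)
    d -= (d @ r) / (r @ r) * r          # remove the component along r
    x = x_star + d
    never_better.append(x @ C_inv @ x >= x_star @ C_inv @ x_star)

# Unit-vector form of the pattern, as in the normalised expression.
unit = C @ r / np.sqrt(r @ C @ C @ r)
```

The check over random perturbations is only a spot check, not a proof; the general result follows from the positive definiteness of $$C^{-1}$$.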

Subsequent patterns can be derived that are orthogonal to the first, forming an orthonormal set and providing a method for matrix factorisation.