Convergent cross mapping

Convergent cross mapping (CCM) is a statistical test for a cause-and-effect relationship between two variables that, like the Granger causality test, seeks to resolve the problem that correlation does not imply causation. While Granger causality is best suited for purely stochastic systems where the influences of the causal variables are separable (independent of each other), CCM is based on the theory of dynamical systems and can be applied to systems where causal variables have synergistic effects. As such, CCM is specifically designed to identify linkages between variables that can appear uncorrelated with each other.

Theory
When system variables are available as time series observations, Takens' embedding theorem can be applied. Takens' theorem shows that, generically, the state space of a dynamical system can be reconstructed from a single observed time series of the system, $$X$$. This reconstructed or shadow manifold $$M_X$$ is diffeomorphic to the true manifold, $$M$$, so intrinsic state space properties of $$M$$ are preserved in $$M_X$$.
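
In practice the shadow manifold is usually built from time-lagged copies of the series. The following is a minimal sketch of such a delay embedding, not a definitive implementation; the function name `delay_embed`, the embedding dimension `E` and the lag `tau` are illustrative assumptions that do not appear in the text above.

```python
import numpy as np

def delay_embed(x, E, tau=1):
    """Build a shadow manifold from a 1-D series using lagged coordinates.

    Row i holds (x[t], x[t - tau], ..., x[t - (E - 1) * tau])
    with t = i + (E - 1) * tau. E and tau are assumed, user-chosen values.
    """
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau            # first time index with a full lag history
    n_rows = len(x) - start
    if n_rows <= 0:
        raise ValueError("series too short for the requested E and tau")
    # Column j is the series lagged by j * tau.
    cols = [x[start - j * tau : start - j * tau + n_rows] for j in range(E)]
    return np.column_stack(cols)

# Small usage example on a short ramp.
x = np.arange(6.0)
M_x = delay_embed(x, E=3, tau=1)
print(M_x)
```

Each row of `M_x` is one reconstructed state; a series of length 6 with `E=3` and `tau=1` yields 4 such states.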

CCM leverages a corollary to the generalized Takens theorem: it should be possible to cross predict, or cross map, between variables observed from the same system. Suppose that in some dynamical system involving variables $$X$$ and $$Y$$, $$X$$ causes $$Y$$. Since $$X$$ and $$Y$$ belong to the same dynamical system, their reconstructions via embeddings, $$M_{X}$$ and $$M_{Y}$$, also map to the same system.

The causal variable $$X$$ leaves a signature on the affected variable $$Y$$, and consequently the reconstructed states based on $$Y$$ can be used to cross predict values of $$X$$. CCM leverages this property to infer causality by predicting $$X$$ using the $$M_{Y}$$ library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of $$M_{Y}$$ are used. If the prediction skill of $$X$$ increases and then saturates as the library grows towards the entire $$M_{Y}$$, this provides evidence that $$X$$ is causally influencing $$Y$$.
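
One common way to write the cross map estimate is as a locally weighted average over nearest neighbours on $$M_Y$$, as in simplex projection. The notation here is an assumption introduced for illustration: $$E$$ is the embedding dimension, $$\underline{y}(t)$$ is the point on $$M_Y$$ at time $$t$$, $$t_1, \dots, t_{E+1}$$ are the time indices of its $$E+1$$ nearest neighbours in the library (closest first), and $$d$$ is the Euclidean distance.

$$\hat{X}(t) \mid M_Y = \sum_{i=1}^{E+1} w_i \, X(t_i), \qquad w_i = \frac{u_i}{\sum_{j=1}^{E+1} u_j}, \qquad u_i = \exp\left(-\frac{d\left(\underline{y}(t), \underline{y}(t_i)\right)}{d\left(\underline{y}(t), \underline{y}(t_1)\right)}\right)$$

The cross map skill is then the correlation between $$\hat{X}(t) \mid M_Y$$ and the observed $$X(t)$$.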

Cross mapping is generally asymmetric. If $$X$$ forces $$Y$$ unidirectionally, variable $$Y$$ will contain information about $$X$$, but not vice versa. Consequently, the state of $$X$$ can be predicted from $$M_Y$$, but $$Y$$ will not be predictable from $$M_X$$.

Algorithm
The basic steps of convergent cross mapping for a variable $$X$$ of length $$N$$ against a variable $$Y$$ are:

 * 1) If needed, create the state space manifold $$M_Y$$ from $$Y$$.
 * 2) Define a sequence of library subset sizes $$L$$ ranging from a small fraction of $$N$$ to close to $$N$$.
 * 3) Define a number of ensembles $$N_E$$ to evaluate at each library size.
 * 4) At each library subset size $$L_i$$:
   * 5) For each of the $$N_E$$ ensembles:
     * 6) Randomly select $$L_i$$ state space vectors from $$M_Y$$.
     * 7) Estimate $$\hat{X}$$ from the random subset of $$M_Y$$ using simplex state space prediction.
     * 8) Compute the correlation $$\rho$$ between $$\hat{X}$$ and $$X$$.
   * 9) Compute the mean correlation $$\bar{\rho}$$ over the $$N_E$$ ensembles at $$L_i$$.
 * 10) For a causal influence of $$X$$ on $$Y$$ to be inferred, $$\bar{\rho}$$ plotted against $$L$$ must exhibit convergence, increasing with $$L$$ and then saturating.
 * 11) Assess significance. One technique is to compare $$\bar{\rho}$$ to $$\bar{\rho}_S$$ computed from $$S$$ random realizations (surrogates) of $$X$$.
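
Steps 1 to 10 above can be sketched in a few lines of NumPy. This is a hedged illustration rather than a reference implementation: the toy system (two logistic maps with $$X$$ forcing $$Y$$), the function names `coupled_logistic`, `delay_embed`, `cross_map_skill` and `ccm`, and the defaults `E=2`, `tau=1`, `n_ens=20` and the library sizes are all assumptions chosen for the example. The significance test of step 11 is omitted.

```python
import numpy as np

def coupled_logistic(n, beta=0.1, x0=0.4, y0=0.2, burn=100):
    """Toy data (an assumption): X evolves on its own and forces Y."""
    x = np.empty(n + burn)
    y = np.empty(n + burn)
    x[0], y[0] = x0, y0
    for t in range(n + burn - 1):
        x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
        y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - beta * x[t])
    return x[burn:], y[burn:]          # drop the initial transient

def delay_embed(y, E, tau=1):
    """Step 1: lagged-coordinate shadow manifold M_Y."""
    y = np.asarray(y, dtype=float)
    start = (E - 1) * tau
    n_rows = len(y) - start
    return np.column_stack(
        [y[start - j * tau : start - j * tau + n_rows] for j in range(E)]
    )

def cross_map_skill(dist, x_t, lib_idx, k):
    """Steps 7-8: estimate X from library neighbours on M_Y, then correlate."""
    d = dist[:, lib_idx]                       # distances to library points only
    nn = np.argsort(d, axis=1)[:, :k]          # k nearest library neighbours
    d_nn = np.take_along_axis(d, nn, axis=1)
    d_min = np.maximum(d_nn[:, [0]], 1e-12)    # guard against zero distance
    u = np.exp(-d_nn / d_min)                  # exponential distance weights
    w = u / u.sum(axis=1, keepdims=True)
    x_hat = (w * x_t[lib_idx][nn]).sum(axis=1)
    return np.corrcoef(x_hat, x_t)[0, 1]

def ccm(x, y, E=2, tau=1, lib_sizes=(10, 50, 100, 200, 400), n_ens=20, seed=0):
    """Mean cross map skill of X estimated from M_Y at each library size."""
    rng = np.random.default_rng(seed)
    M_y = delay_embed(y, E, tau)
    x_t = np.asarray(x, dtype=float)[(E - 1) * tau :]   # align X with rows of M_Y
    n = len(M_y)
    dist = np.linalg.norm(M_y[:, None, :] - M_y[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)             # a point is never its own neighbour
    rho_bar = []
    for L in lib_sizes:                        # step 4
        rhos = []
        for _ in range(n_ens):                 # step 5
            lib_idx = rng.choice(n, size=L, replace=False)          # step 6
            rhos.append(cross_map_skill(dist, x_t, lib_idx, E + 1))
        rho_bar.append(np.mean(rhos))          # step 9
    return np.array(rho_bar)

x, y = coupled_logistic(500)
rho_bar = ccm(x, y)                            # X estimated from M_Y
for L, r in zip((10, 50, 100, 200, 400), rho_bar):
    print(f"L={L:4d}  mean rho={r:.3f}")
```

Calling `ccm(y, x)` instead gives the cross map in the opposite direction, estimating $$Y$$ from $$M_X$$. Step 10 amounts to inspecting how the printed values change with the library size.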

Applications
CCM is used to detect whether two variables belong to the same dynamical system. Examples include testing whether past ocean surface temperatures can be estimated from time series of sardine population data, and whether there is a causal relationship between cosmic rays and global temperatures. For the latter, it was hypothesised that cosmic rays may affect cloud formation, and thereby cloudiness and global temperatures.

Extensions
Extensions to CCM include:
 * Extended Convergent Cross Mapping
 * Convergent Cross Sorting