Global coordination level

Global coordination level (GCL) is a computational method that evaluates system-wide dependencies in multivariate data by calculating the distance correlation between random subsets of the variables. Originally applied to gene expression data, GCL assesses the level of coordination between genes, which are fundamentally linked in performing tasks and biological functions. Unlike traditional methods that require precise knowledge of pairwise interactions between genes, GCL can evaluate coordination without such information. A GCL value of zero signifies independent gene expression, while values above zero indicate gene-to-gene regulatory interactions. For instance, when GCL is applied to known genetic pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, it yields significantly positive values, while random subsets of genes or mock pathways with similar gene expression levels show very low GCL values. GCL can also be useful in analyzing high-dimensional ecological and biochemical dynamics.

Introduction
Genes interact with each other in a complex structure known as the gene regulatory network, which plays a crucial role in implementing various biological functions and performing different tasks within cells. However, inferring the precise pairwise interactions of the gene regulatory network remains challenging due to the large number of functional genes and the inherent stochasticity of these systems. Despite these challenges, certain features of the gene regulatory network can still be extracted without fully inferring all the interactions. For instance, the network connectivity, which refers to the density of actual gene-gene interactions compared to all possible interactions, may have important implications for general cellular processes.

Method description
The calculation of the global coordination level (GCL) is based on multivariate dependencies among genes in a given cohort of cells. It involves repeatedly selecting random subsets of genes and calculating the distance correlation between them, as described in the original work. Averaging over many such gene subsets yields a single numerical value, the GCL, which assesses the overall dependencies between the genes.
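
The repeated split-and-correlate procedure can be illustrated with a minimal Python sketch. It is not a reference implementation, and several choices are assumptions made for illustration: the expression matrix is laid out as cells by genes, each repetition splits the genes into two random halves, and the dependence measure is the bias-corrected (U-centered) estimator of squared distance correlation, chosen so that independent data give values near zero. The function names and the parameters n_splits and seed are hypothetical.

```python
import numpy as np


def _u_centered_distances(x):
    """U-centered Euclidean distance matrix between the rows (cells) of x."""
    n = x.shape[0]
    # Pairwise Euclidean distances between cells, shape (n, n).
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    row = d.sum(axis=1, keepdims=True)
    col = d.sum(axis=0, keepdims=True)
    u = d - row / (n - 2) - col / (n - 2) + d.sum() / ((n - 1) * (n - 2))
    np.fill_diagonal(u, 0.0)  # the diagonal is defined as zero
    return u


def bias_corrected_dcor(x, y):
    """Bias-corrected squared distance correlation between x and y.

    x and y hold the same cells in their rows; columns are gene subsets.
    """
    n = x.shape[0]
    if n < 4 or y.shape[0] != n:
        raise ValueError("need at least 4 cells and matching row counts")
    a = _u_centered_distances(x)
    b = _u_centered_distances(y)
    scale = n * (n - 3)
    cov_xy = (a * b).sum() / scale
    var_x = (a * a).sum() / scale
    var_y = (b * b).sum() / scale
    denom = np.sqrt(var_x * var_y)
    return float(cov_xy / denom) if denom > 0 else 0.0


def gcl(expr, n_splits=50, seed=0):
    """Average dependence between random halves of the genes.

    expr: array of shape (n_cells, n_genes); this layout is an assumption.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[1]
    if n_genes < 2:
        raise ValueError("need at least 2 genes to form two subsets")
    rng = np.random.default_rng(seed)
    half = n_genes // 2
    values = []
    for _ in range(n_splits):
        perm = rng.permutation(n_genes)  # random split of the genes
        values.append(
            bias_corrected_dcor(expr[:, perm[:half]], expr[:, perm[half:]])
        )
    return float(np.mean(values))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    independent = rng.normal(size=(100, 40))
    shared = rng.normal(size=(100, 1))  # one latent factor driving all genes
    coordinated = shared + 0.5 * rng.normal(size=(100, 40))
    print("independent genes:", gcl(independent, n_splits=20))
    print("coordinated genes:", gcl(coordinated, n_splits=20))
```

In this toy comparison the synthetic cohort driven by a shared latent factor should score well above the cohort of independent genes, which should stay close to zero. The pairwise distance computation here builds an intermediate array of size cells × cells × genes, so a real dataset would need a more memory-efficient distance routine.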

Several pre- and post-processing steps are needed to ensure the accuracy and reliability of the GCL. Firstly, clustering methods should be applied to divide the analyzed cohort of cells into homogeneous subsets, and the GCL should be calculated separately for each subset, or only for the largest one. Secondly, cells that deviate significantly from the rest of the cells (referred to as 'outliers') or cells that are too similar to each other (referred to as 'inliers') should be filtered out to avoid their undue influence on the GCL calculation.
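
The filtering step can be sketched as follows. The text does not specify how outliers and inliers are detected, so this example assumes one simple heuristic: a cell is an outlier if its nearest-neighbour distance lies far above the cohort median, and an inlier if that distance is only a small fraction of the median. The function filter_cells, its thresholds outlier_scale and inlier_fraction, and the cells-by-genes layout are all hypothetical choices for illustration.

```python
import numpy as np


def filter_cells(expr, outlier_scale=3.0, inlier_fraction=0.1):
    """Return a boolean mask of cells to keep.

    expr: array of shape (n_cells, n_genes); this layout is an assumption.
    A cell is dropped as an outlier if its nearest neighbour is unusually
    far away, and as an inlier if its nearest neighbour is unusually close.
    """
    expr = np.asarray(expr, dtype=float)
    # Pairwise Euclidean distances between cells.
    dist = np.linalg.norm(expr[:, None, :] - expr[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # ignore each cell's distance to itself
    nearest = dist.min(axis=1)

    median = np.median(nearest)
    # Median absolute deviation, scaled to match a standard deviation
    # for normally distributed data.
    mad = 1.4826 * np.median(np.abs(nearest - median))

    is_outlier = nearest > median + outlier_scale * mad
    is_inlier = nearest < inlier_fraction * median
    return ~(is_outlier | is_inlier)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cells = rng.normal(size=(50, 10))
    far_cell = np.full((1, 10), 50.0)   # an obvious outlier
    near_copy = cells[:1] + 1e-9        # a near-duplicate of the first cell
    data = np.vstack([cells, far_cell, near_copy])
    keep = filter_cells(data)
    print("cells kept:", int(keep.sum()), "of", data.shape[0])
```

Note that this heuristic removes both members of a near-duplicate pair. Whether to keep one of them is a design decision that the text leaves open.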

Additionally, jackknife analysis, which involves systematically omitting subsets of cells from the analysis and recalculating the GCL, should be performed to test the stability of the results. These steps are necessary because the GCL, like other correlation measures, can be sensitive to unusual cells and heterogeneous cohorts, especially in the context of sparse, noisy, and outlier-prone scRNA-seq data.
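
A jackknife stability check of this kind can be sketched generically. The example below assumes a leave-one-group-out scheme in which the cells are randomly partitioned into groups and each group is omitted in turn. The number of groups, the function name jackknife_estimates and the statistic argument (any function that maps a cells-by-genes matrix to a number, such as a GCL implementation) are illustrative assumptions.

```python
import numpy as np


def jackknife_estimates(expr, statistic, n_groups=10, seed=0):
    """Recompute a statistic with each group of cells left out in turn.

    expr: array of shape (n_cells, n_genes); this layout is an assumption.
    statistic: callable taking such an array and returning a number.
    Returns an array with one estimate per omitted group.
    """
    expr = np.asarray(expr)
    n_cells = expr.shape[0]
    if not 2 <= n_groups <= n_cells:
        raise ValueError("n_groups must be between 2 and the number of cells")
    rng = np.random.default_rng(seed)
    # Random partition of the cell indices into roughly equal groups.
    groups = np.array_split(rng.permutation(n_cells), n_groups)
    estimates = []
    for left_out in groups:
        keep = np.ones(n_cells, dtype=bool)
        keep[left_out] = False  # omit this group of cells
        estimates.append(statistic(expr[keep]))
    return np.asarray(estimates, dtype=float)


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    data = rng.normal(size=(60, 20))
    # A placeholder statistic; a GCL function would be passed here instead.
    est = jackknife_estimates(data, lambda x: float(np.var(x)), n_groups=6)
    print("mean:", est.mean(), "spread:", est.std(ddof=1))
```

A small spread across the leave-out estimates suggests that the result does not hinge on any particular subset of cells. A large spread points to unusual cells or a heterogeneous cohort.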

Applications
Aging: Stochastic aberration of transcriptional regulation is considered a dominant factor in the process of aging. When GCL is assessed in multiple single-cell RNA-sequencing datasets, a decline of GCL with age is consistently observed across various organisms and cell types. Notably, significant decreases in GCL were found in mouse hematopoietic stem cells based on single-cell RNA-seq data, supporting the hypothesis of aging as dys-differentiation. This idea, originally posited by Richard Cutler in the 1970s, suggests that cells deviate from their proper state of differentiation as they age, as evidenced by the activation in aged tissues of genes that should normally be silent.

Measuring biological variability: The GCL decreases in cohorts of cells with increased 'biological variability' only when that variability arises from gene interactions. The GCL can therefore be used to assess and compare the ratio between introduced biological and technical variability in cohorts with similar cell-to-cell variability.