Biweight midcorrelation

In statistics, biweight midcorrelation (also called bicor) is a measure of similarity between samples. Because it is based on the median rather than the mean, it is less sensitive to outliers and can serve as a robust alternative to other similarity measures, such as Pearson correlation or mutual information.

Derivation
Here we find the biweight midcorrelation of two vectors $$x$$ and $$y$$, each with $$m$$ items, written $$x_1, x_2, \ldots, x_m$$ and $$y_1, y_2, \ldots, y_m$$. First, we define $$\operatorname{med}(x)$$ as the median of a vector $$x$$ and $$\operatorname{mad}(x)$$ as its median absolute deviation (MAD). We then define $$u_i$$ and $$v_i$$ for $$i=1,2, \ldots,m$$ as,



$$ \begin{align} u_i &= \frac{x_i - \operatorname{med}(x)}{9 \operatorname{mad}(x)},\\ v_i &= \frac{y_i - \operatorname{med}(y)}{9 \operatorname{mad}(y)}. \end{align} $$
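
As an illustration of this first step, the following Python sketch computes the scaled deviations $$u_i$$ with the standard library only. The function names `mad` and `scaled_deviations` are hypothetical, and the sketch assumes the raw, unscaled MAD, $$\operatorname{med}(|x_i - \operatorname{med}(x)|)$$; some software multiplies the MAD by a consistency constant, which would change the values.

```python
from statistics import median


def mad(values):
    """Raw median absolute deviation: med(|x_i - med(x)|) (assumed unscaled)."""
    center = median(values)
    return median([abs(v - center) for v in values])


def scaled_deviations(values):
    """Return u_i = (x_i - med(x)) / (9 * mad(x)) for every item."""
    center = median(values)
    scale = 9 * mad(values)
    if scale == 0:
        # The definition divides by the MAD, so it breaks down when MAD is zero.
        raise ValueError("MAD is zero; scaled deviations are undefined")
    return [(v - center) / scale for v in values]


# Illustrative data: the last value is far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 100.0]
u = scaled_deviations(x)
```

In this example the median is 3 and the MAD is 1, so the first four items have $$|u_i| < 1$$ while the outlying item has $$|u_i| > 1$$.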

Now we define the weights $$w_i^{(x)}$$ and $$w_i^{(y)}$$ as,



$$ \begin{align} w_i^{(x)} &= \left(1-u_i^2\right)^2 I\left(1-|u_i|\right),\\ w_i^{(y)} &= \left(1-v_i^2\right)^2 I\left(1-|v_i|\right), \end{align} $$

where $$I$$ is the indicator function,



$$ I(x) = \begin{cases}1, & \text{if } x >0\\ 0, & \text{otherwise.}\end{cases} $$
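
The weighting step can be sketched in Python as follows. The function name `biweight_weights` is hypothetical, and the input values are illustrative scaled deviations rather than data from any source. Because the indicator is 1 only for strictly positive arguments, any item with $$|u_i| \ge 1$$ receives weight zero.

```python
def biweight_weights(u):
    """Return w_i = (1 - u_i^2)^2 when |u_i| < 1, and 0 otherwise."""
    return [(1 - ui * ui) ** 2 if abs(ui) < 1 else 0.0 for ui in u]


# Illustrative scaled deviations; the last one lies outside (-1, 1).
u = [-2 / 9, -1 / 9, 0.0, 1 / 9, 97 / 9]
w = biweight_weights(u)
```

Here the item at the median ($$u_i = 0$$) gets the maximum weight of 1, weights shrink smoothly as $$|u_i|$$ grows, and the last item is excluded entirely.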

Then we normalize the weighted deviations so that the sum of their squares is 1 for each vector:



$$ \begin{align} \tilde{x}_i &= \frac{\left(x_i - \operatorname{med}(x)\right) w_i^{(x)}}{\sqrt{\sum_{j=1}^m \left[(x_j -\operatorname{med}(x)) w_j^{(x)}\right]^2}},\\ \tilde{y}_i &= \frac{\left(y_i - \operatorname{med}(y)\right) w_i^{(y)}}{\sqrt{\sum_{j=1}^m \left[(y_j -\operatorname{med}(y)) w_j^{(y)}\right]^2}}. \end{align} $$

Finally, we define biweight midcorrelation as,



$$ \mathrm{bicor}\left(x, y\right) = \sum_{i=1}^m \tilde{x}_i \tilde{y}_i. $$

Applications
Biweight midcorrelation has been shown to be more robust in evaluating similarity in gene expression networks, and is often used for weighted correlation network analysis.

Implementations
Biweight midcorrelation has been implemented in the R statistical programming language as part of the WGCNA package.

It has also been implemented in the Raku programming language as part of the Statistics module.