Lee's L

Lee's L is a bivariate spatial correlation coefficient that measures the association between two sets of observations made at the same spatial sites. Standard measures of association such as the Pearson correlation coefficient do not account for the spatial dimension of data; in particular, they are vulnerable to inflation due to spatial autocorrelation. Lee's L is available in numerous spatial analysis software libraries, including spdep and PySAL (where it is called Spatial_Pearson), and has been applied in fields as diverse as air pollution, viticulture and housing rent.

For spatial data $$x_i$$ and $$y_i$$ measured at $$N$$ locations connected by the spatial weight matrix $$w_{ij}$$, first define the spatially lagged vector


 * $$\tilde{x}_i = \sum_j w_{ij} x_j$$

with a similar definition for $$\tilde{y}_i$$. Then Lee's L is defined as



$$L_{x,y} = \frac{N}{\sum_i \left( \sum_j w_{ij} \right)^2} \frac{\sum_i (\tilde{x}_i - \bar{x})(\tilde{y}_i - \bar{y}) }{ \sqrt{ \sum_i (x_i - \bar{x})^2} \sqrt{ \sum_i (y_i - \bar{y})^2} } $$

where $$\bar{x}, \bar{y}$$ are the mean values of $$x_i, y_i$$. When the spatial weight matrix is row normalized, such that $$\sum_j w_{ij} = 1$$, the first factor is 1.
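As an illustration, the definition can be sketched in Python with NumPy. The function name `lees_l` and the dense-array representation of $$w_{ij}$$ are assumptions made for this example and are not the interface of spdep or PySAL. The denominator uses the sums of squares of the original (unlagged) variables, following Lee's definition.

```python
import numpy as np

def lees_l(x, y, w):
    """Lee's L for two variables observed at the same N sites.

    x, y : 1-D arrays of length N
    w    : (N, N) spatial weight matrix (hypothetical dense representation)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size

    # Spatially lagged vectors: x_lag[i] = sum_j w[i, j] * x[j]
    x_lag = w @ x
    y_lag = w @ y

    # First factor: N / sum_i (sum_j w_ij)^2, equal to 1 for row-normalized weights
    scale = n / np.sum(w.sum(axis=1) ** 2)

    # Cross-product of the lagged deviations from the original means
    numerator = np.sum((x_lag - x.mean()) * (y_lag - y.mean()))

    # Square roots of the sums of squares of the original variables
    denominator = np.sqrt(np.sum((x - x.mean()) ** 2)) * np.sqrt(np.sum((y - y.mean()) ** 2))

    return scale * numerator / denominator


# Four sites on a line; each row averages a site with its immediate neighbours.
w = np.array([
    [1/2, 1/2, 0.0, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 1/3, 1/3, 1/3],
    [0.0, 0.0, 1/2, 1/2],
])
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([2.0, 1.0, 5.0, 4.0])
print(lees_l(x, y, w))
```

With the identity matrix as weights, the lagged vectors equal the original data and the function reduces to the ordinary Pearson correlation, which gives a quick sanity check.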

Lee also defines the spatial smoothing scalar

$$SSS_{x} = \frac{ \sum_i (\tilde{x}_i - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} $$

to measure the spatial autocorrelation of a variable.
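A minimal Python sketch of the spatial smoothing scalar follows, assuming a dense NumPy weight matrix. The function name `spatial_smoothing_scalar` is hypothetical. The two limiting cases shown are identity weights (no smoothing) and uniform weights (every site replaced by the global mean).

```python
import numpy as np

def spatial_smoothing_scalar(x, w):
    """Ratio of the lagged sum of squares to the original sum of squares."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    x_lag = w @ x  # spatially lagged vector
    return np.sum((x_lag - x.mean()) ** 2) / np.sum((x - x.mean()) ** 2)


x = np.array([1.0, 2.0, 4.0, 3.0, 7.0])
n = x.size

# Identity weights leave the data unchanged, so the ratio is 1.
sss_identity = spatial_smoothing_scalar(x, np.eye(n))

# Uniform weights replace every site by the global mean, so the ratio is 0.
sss_uniform = spatial_smoothing_scalar(x, np.full((n, n), 1.0 / n))

print(sss_identity, sss_uniform)
```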

Lee shows that the definition of $$L_{x,y}$$ above is equivalent to

$$L_{x,y} = \sqrt{ SSS_{x} } \sqrt{ SSS_{y} } \, r( \tilde{x}, \tilde{y} ) $$

where $$r$$ is the Pearson correlation coefficient

$$r(\tilde{x}, \tilde{y}) =\frac{\sum_i (\tilde{x}_i - \bar{\tilde{x}})(\tilde{y}_i - \bar{\tilde{y}})}{\sqrt{\sum_i (\tilde{x}_i - \bar{\tilde{x}})^2} \sqrt{\sum_i (\tilde{y}_i - \bar{\tilde{y}})^2}} $$

This means Lee's L is equivalent to the Pearson correlation of the spatially lagged data, multiplied by a measure of each data set's spatial autocorrelation.
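The decomposition can be checked numerically. The sketch below is an illustration under stated assumptions, not a reference implementation. It uses a ring lattice in which each site is averaged with its two neighbours, so the weight matrix is row-normalized and symmetric. In that setting the mean of each lagged variable equals the mean of the original variable, and the two expressions for Lee's L agree to floating-point precision. The helper `ring_weights` and the synthetic data are hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized weights on a ring: each site averages itself and its two neighbours."""
    w = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            w[i, j % n] = 1.0 / 3.0
    return w


rng = np.random.default_rng(0)
n = 12
w = ring_weights(n)

# Synthetic data, for illustration only.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

x_lag, y_lag = w @ x, w @ y
ss_x = np.sum((x - x.mean()) ** 2)
ss_y = np.sum((y - y.mean()) ** 2)

# Direct definition; the first factor is 1 because the rows of w sum to 1.
l_direct = np.sum((x_lag - x.mean()) * (y_lag - y.mean())) / np.sqrt(ss_x * ss_y)

# Decomposition: sqrt(SSS_x) * sqrt(SSS_y) * Pearson r of the lagged data.
sss_x = np.sum((x_lag - x.mean()) ** 2) / ss_x
sss_y = np.sum((y_lag - y.mean()) ** 2) / ss_y
r_lag = np.corrcoef(x_lag, y_lag)[0, 1]
l_product = np.sqrt(sss_x) * np.sqrt(sss_y) * r_lag

print(np.isclose(l_direct, l_product))
```

For weight matrices whose columns do not sum to 1, the lagged means generally differ from the original means, and the two expressions then agree only approximately.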