Getis–Ord statistics

Getis–Ord statistics, also known as Gi*, are used in spatial analysis to measure local and global spatial autocorrelation. Developed by statisticians Arthur Getis and J. Keith Ord, they are commonly used for hot spot analysis to identify where features with high or low values are spatially clustered in a statistically significant way. Getis–Ord statistics are available in a number of software packages and libraries, such as CrimeStat, GeoDa, ArcGIS, PySAL and R.

Local statistics
There are two versions of the local statistic, depending on whether the data point at the target location $$i$$ is excluded ($$G_i$$) or included ($$G_i^*$$):


 * $$ G_i = \frac{ \sum_{j \neq i} w_{ij} x_j}{\sum_{j \neq i} x_j} $$


 * $$ G_i^* = \frac{ \sum_{j} w_{ij} x_j}{\sum_{j} x_j} $$

Here $$x_i$$ is the value observed at the $$i^{th}$$ spatial site and $$w_{ij}$$ is an element of the spatial weight matrix, which specifies which sites are connected to one another. For $$G_i^*$$ the denominator is constant across all observations.
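The two local statistics can be computed directly from the definitions. The following is a minimal NumPy sketch, not a reference implementation: the function names `local_g` and `local_g_star`, the four-site example and the binary line-adjacency weights are illustrative assumptions. For $$G_i^*$$ the diagonal weights $$w_{ii}$$ are taken as supplied by the caller (here set to 1).

```python
import numpy as np

def local_g(x, w):
    """G_i for every site: sums run over j != i."""
    x = np.asarray(x, dtype=float)
    w_off = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w_off, 0.0)          # exclude j == i from the numerator
    return (w_off @ x) / (x.sum() - x)    # denominator is sum_{j != i} x_j

def local_g_star(x, w):
    """G_i^* for every site: sums run over all j, including j == i."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return (w @ x) / x.sum()              # denominator is the same for all i

# Hypothetical example: four sites on a line, each linked to its neighbours.
x = np.array([1.0, 2.0, 8.0, 9.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

g = local_g(x, w)
g_star = local_g_star(x, w + np.eye(4))   # assumption: w_ii = 1 for G_i^*
```

In this example the sites with large values (the last two) receive the largest $$G_i$$ and $$G_i^*$$, as expected for a cluster of high values.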

A value larger (or smaller) than the mean suggests a hot (or cold) spot corresponding to a high-high (or low-low) cluster. Statistical significance can be estimated using analytical approximations, as in the original work; in practice, however, permutation testing is commonly used to obtain more reliable estimates of significance for statistical inference.
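One common way to carry out such a permutation test is conditional permutation: the value at site $$i$$ is held fixed, the remaining values are shuffled among the other sites, and the statistic is recomputed for each shuffle. The sketch below illustrates this for $$G_i$$ under stated assumptions; the function name `local_g_pseudo_p`, the one-sided (hot spot) pseudo p-value, the number of permutations and the example data are all illustrative choices, not the procedure of any particular software package.

```python
import numpy as np

def local_g_pseudo_p(x, w, i, n_perm=999, seed=0):
    """Observed G_i and a one-sided pseudo p-value from conditional permutation."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    w_others = np.delete(np.asarray(w, dtype=float)[i], i)  # weights w_ij, j != i
    others = np.delete(x, i)                                # values x_j, j != i
    denom = others.sum()              # unchanged by shuffling the other sites
    observed = w_others @ others / denom
    sims = np.array([w_others @ rng.permutation(others) / denom
                     for _ in range(n_perm)])
    # Share of shuffles at least as extreme as the observed value (hot-spot side).
    p_hot = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_hot

# Hypothetical example: ten sites on a line with a run of high values.
x = np.array([1, 2, 1, 3, 2, 9, 8, 9, 2, 1], dtype=float)
n = len(x)
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0   # link each site to its right-hand neighbour
w[idx + 1, idx] = 1.0   # and symmetrically to its left-hand neighbour

g_obs, p_hot = local_g_pseudo_p(x, w, i=6)
```

A small `p_hot` indicates that few random arrangements place values as high as the observed ones next to site $$i$$. A cold-spot test would count shuffles with `sims <= observed` instead.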

Global statistics
The Getis–Ord statistics of overall spatial association are:


 * $$ G = \frac{ \sum_{ij, i\neq j} w_{ij} x_i x_j}{\sum_{ij, i\neq j} x_i x_j}$$
 * $$ G^* = \frac{ \sum_{ij} w_{ij} x_i x_j}{\sum_{ij} x_i x_j}$$

The local and global $$G^*$$ statistics are related through the weighted average:
 * $$ \frac{ \sum_i x_i G^*_i }{ \sum_i x_i } = \frac{ \sum_{ij} x_i w_{ij} x_j }{ \sum_i x_i \sum_j x_j } = G^* $$
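The global statistics and the weighted-average identity can be checked numerically. The sketch below is a minimal illustration; the function names `global_g` and `global_g_star` and the four-site example with binary weights (with $$w_{ii}=1$$ for the starred version) are assumptions made for the example.

```python
import numpy as np

def global_g(x, w):
    """Global G: both sums run over pairs with i != j."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    xx = np.outer(x, x)                      # matrix of products x_i * x_j
    off = ~np.eye(len(x), dtype=bool)        # mask selecting i != j
    return (w * xx)[off].sum() / xx[off].sum()

def global_g_star(x, w):
    """Global G^*: both sums run over all pairs, including i == j."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    xx = np.outer(x, x)
    return (w * xx).sum() / xx.sum()

# Hypothetical example: four sites on a line, each linked to its neighbours.
x = np.array([1.0, 2.0, 8.0, 9.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
w_star = w + np.eye(4)                       # assumption: w_ii = 1 for G^*

g_global = global_g(x, w)
g_star_global = global_g_star(x, w_star)

# Weighted average of the local G_i^* with weights x_i.
g_star_local = (w_star @ x) / x.sum()
weighted_avg = (x * g_star_local).sum() / x.sum()
```

For these data `weighted_avg` and `g_star_global` agree, since $$\sum_{ij} x_i x_j = \left(\sum_i x_i\right)^2$$.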

The relationship of the $$G$$ and $$G_i$$ statistics is more complicated due to the dependence of the denominator of $$G_i$$ on $$i$$.

Relation to Moran's I
Moran's I is another commonly used measure of spatial association defined by

 * $$ I = \frac{N}{W} \frac{\sum_{ij} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2} $$
where $$N$$ is the number of spatial sites and $$W = \sum_{ij} w_{ij}$$ is the sum of the entries in the spatial weight matrix. Getis and Ord show that

 * $$ I = \frac{K_2}{K_1} G - K_2 \bar{x} \sum_i (w_{i \cdot} + w_{\cdot i}) x_i + K_2 \bar{x}^2 W $$
where $$w_{i \cdot} = \sum_j w_{ij}$$, $$w_{\cdot i} = \sum_j w_{ji}$$, $$K_1 = \left( \sum_{ij, i\neq j} x_i x_j \right)^{-1}$$ and $$K_2 = \frac{N}{W}\left(\sum_{i} (x_i - \bar{x})^2\right)^{-1}$$. They are equal if $$w_{ij}=w$$ is constant, but not in general.
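The decomposition of Moran's I in terms of $$G$$ can be verified numerically. The sketch below assumes $$w_{ii}=0$$ and takes $$K_1 = \left(\sum_{ij, i\neq j} x_i x_j\right)^{-1}$$ and $$K_2 = \frac{N}{W}\left(\sum_i (x_i - \bar{x})^2\right)^{-1}$$, so that the coefficient of $$G$$ is $$K_2/K_1$$; these are the constants obtained by expanding the products in the definition of $$I$$ given above. The function name `morans_i` and the example data are illustrative.

```python
import numpy as np

def morans_i(x, w):
    """Moran's I computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    d = x - x.mean()                              # deviations from the mean
    return len(x) / w.sum() * (d @ w @ d) / (d @ d)

# Hypothetical example: four sites on a line, zero diagonal weights.
x = np.array([1.0, 2.0, 8.0, 9.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

n, W, xbar = len(x), w.sum(), x.mean()
xx = np.outer(x, x)
off = ~np.eye(n, dtype=bool)                      # mask selecting i != j

G = (w * xx)[off].sum() / xx[off].sum()           # global G
K1 = 1.0 / xx[off].sum()
K2 = (n / W) / ((x - xbar) ** 2).sum()
row_sums = w.sum(axis=1)                          # w_{i.}
col_sums = w.sum(axis=0)                          # w_{.i}

# Moran's I rebuilt from G and the two correction terms.
i_from_g = (K2 / K1) * G - K2 * xbar * ((row_sums + col_sums) @ x) + K2 * xbar**2 * W
i_direct = morans_i(x, w)
```

For these data both routes give the same value of $$I$$.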

Ord and Getis also show that Moran's I can be written in terms of $$G_i^*$$:
 * $$ I = \frac{1}{W} \left( \sum_i z_i V_i G_i^* - N\right) $$
where $$z_i = (x_i - \bar{x})/s$$, $$s$$ is the standard deviation of $$x$$ and
 * $$ V_i^2 = \frac{1}{N-1}\sum_j \left( w_{ij} - \frac{1}{N} \sum_k w_{ik}\right)^2 $$
is an estimate of the variance of $$w_{ij}$$.