Land cover maps

Land cover maps are tools that provide vital information about the Earth's land use and cover patterns. They aid policy development, urban planning, and forest and agricultural monitoring.

The systematic mapping of land cover patterns, including change detection, often follows two main approaches:


 * Field survey
 * Remote sensing satellite image processing. This cost-efficient approach applies image pre-processing and classification techniques to map land cover patterns, and uses machine learning and statistical methods to detect changes at various spatial scales.

Image pre-processing is normally done through radiometric corrections, while image processing applies either unsupervised or supervised classification, or computes vegetation indices, to produce the land cover map.

Supervised classification
Supervised classification is a system of classification in which the user builds a series of randomly generated training datasets, or spectral signatures, representing the different land-use and land-cover (LULC) classes. These datasets are used to train machine learning models that predict and spatially classify LULC patterns, and the accuracy of the resulting classification is then evaluated against reference data.
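
The accuracy evaluation step is commonly summarized with a confusion matrix and an overall accuracy score. The following is a minimal sketch of that computation using NumPy; the function names, the class codes, and the validation samples are hypothetical and serve only as an illustration.

```python
import numpy as np

def confusion_matrix(reference, predicted, n_classes):
    """Rows are reference (ground-truth) classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for ref, pred in zip(reference, predicted):
        matrix[ref, pred] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of validation samples whose predicted class matches the reference."""
    return np.trace(matrix) / matrix.sum()

# Hypothetical validation samples: 0 = water, 1 = forest, 2 = built-up.
reference = np.array([0, 0, 1, 1, 1, 2, 2, 2])
predicted = np.array([0, 0, 1, 1, 2, 2, 2, 1])

cm = confusion_matrix(reference, predicted, n_classes=3)
oa = overall_accuracy(cm)
print(cm)
print(f"overall accuracy: {oa:.2f}")
```

Off-diagonal entries of the matrix show which classes are confused with each other, which the single accuracy figure hides.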

Algorithms
Several machine learning algorithms have been developed for supervised classification.
 * Maximum likelihood classification (MLC) – This approach assigns each image pixel to the LULC type for which its probability of membership is highest (the maximum likelihood), which allows it to separate classes with overlapping signatures. It depends on the mean vectors and covariance matrices of the training datasets and assumes that the pixel values of each class are normally distributed.
 * Minimum distance (MD) – A form of supervised classification that computes the mean spectral vector of each class from the training datasets and assigns each image pixel to the class whose mean is nearest, usually in Euclidean distance. The decision boundary between two classes is the set of points equidistant from their means.
 * Mahalanobis distance – A system of classification that assigns each pixel to the land cover class with the smallest Mahalanobis distance. Unlike Euclidean distance, this measure accounts for the variances and covariances of the image bands, which are estimated from a set of training datasets.
 * Spectral angle mapper (SAM) – A spectral image classification approach that uses angular measurements to determine the similarity between two spectra, treating them as vectors in a q-dimensional space, where q is the number of bands.
 * Discriminant analysis (DA) – A system of classification in which the classifying algorithm separates groups of closely related image pixels into classes, minimizing the variance within classes, and maximizing the variance between classes following a maximum likelihood discriminant rule.
 * Genetic algorithm – A system of classification that applies genetic principles for selecting appropriate clusters of training data and classifying them under the influence of predictors (satellite image bands).
 * Subspace – A classification approach in which the classifier creates a low-dimensional subspace for each land cover class from a cluster of training points. The subspaces are created by performing a principal component analysis on the training points. Two types of subspace algorithms exist for minimizing land cover classification errors: class-featuring information compression (CLAFIC) and the average learning subspace method (ALSM).
 * Parallelepiped classification – A feature space classifier that assigns a range of values to each land cover class within each image band, based on the training data, and creates bounding boxes (parallelepipeds) from these ranges. A pixel is assigned to a class when its values fall inside that class's box.
 * Multilayer perceptron artificial neural networks (MPANNs) – A system of classification in which the classifier uses layers of interconnected nodes (neurons) to classify land cover, with the network weights learned from training samples by backpropagation.
 * Support vector machines (SVMs) – A classification approach in which the classifier uses support vectors to obtain optimal decision boundaries separating two or more land cover classes.
 * Random forest (RF) – An approach in which the classifier builds many decision trees, each trained on a bootstrap sample of the training datasets and on a subset of the satellite image bands, and classifies each pixel by a majority vote of the trees.
 * K-nearest neighbors algorithm (kNN) – This approach finds the k training samples closest to each pixel in spectral space and assigns the pixel to the most common land cover class among them.
 * Decision tree (DT) – The building block of RF, a DT is a set of connected nodes that partition training samples into land cover classes. Its advantages are that it is fast, easy to construct and interpret for smaller datasets, and good at excluding background or unimportant information. Its main disadvantage is a tendency to overfit, especially on large datasets.
 * Fuzzy clustering (FZ)
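
To illustrate two of the simpler classifiers in the list above, the following is a minimal NumPy sketch of minimum distance and spectral angle mapper classification. The function names, the two-band reflectance values, and the class codes are hypothetical and chosen only for illustration; operational software typically adds thresholds and other refinements that are not shown here.

```python
import numpy as np

def class_means(samples, labels, n_classes):
    """Mean spectrum of each class; samples has shape (n_samples, n_bands)."""
    return np.stack([samples[labels == c].mean(axis=0) for c in range(n_classes)])

def minimum_distance(pixels, means):
    """Assign each pixel to the class whose mean is nearest in Euclidean distance."""
    # distances has shape (n_pixels, n_classes)
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return distances.argmin(axis=1)

def spectral_angle(pixels, means):
    """Assign each pixel to the class whose mean spectrum forms the smallest angle with it."""
    dots = pixels @ means.T
    norms = np.linalg.norm(pixels, axis=1)[:, None] * np.linalg.norm(means, axis=1)[None, :]
    # Clip guards against rounding slightly outside the valid arccos domain.
    angles = np.arccos(np.clip(dots / norms, -1.0, 1.0))
    return angles.argmin(axis=1)

# Hypothetical training samples with two bands (red, NIR); 0 = water, 1 = vegetation.
train = np.array([[0.05, 0.02], [0.07, 0.04], [0.04, 0.45], [0.06, 0.55]])
train_labels = np.array([0, 0, 1, 1])
means = class_means(train, train_labels, n_classes=2)

# Hypothetical pixels to classify.
pixels = np.array([[0.06, 0.03], [0.05, 0.50], [0.08, 0.40]])
md_classes = minimum_distance(pixels, means)
sam_classes = spectral_angle(pixels, means)
print(md_classes, sam_classes)
```

Minimum distance compares the magnitude of the spectra, whereas the spectral angle depends only on their direction, so the two can disagree when pixels of the same material differ mainly in brightness.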

Unsupervised classification
Unsupervised classification is a system of classification in which single pixels or groups of pixels are automatically classified by the software without the user applying signature files or training data. The user defines only the number of classes, and the computer generates them by grouping similar pixels into the same category using a clustering algorithm. This system of classification is mostly used in areas with no field observations or prior knowledge of the available land cover types.

Algorithms

 * Iterative self-organizing data analysis technique (ISODATA) – In this approach, the classifier automatically groups closely related image pixels into clusters, computes the cluster means, and reassigns pixels over a series of repeated iterations, splitting and merging clusters along the way, to classify land cover.
 * K-means clustering – An approach in which the computer automatically partitions the pixels of a satellite image into k clusters, repeatedly assigning each pixel to the nearest cluster mean and recomputing the means until the assignments stabilize.
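
The k-means procedure can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not a production implementation: the function names, the random initialization, the fixed iteration count, and the sample pixels are all hypothetical.

```python
import numpy as np

def nearest_center(pixels, centers):
    """Index of the closest cluster mean for every pixel."""
    distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(pixels, k, n_iter=20, seed=0):
    """Cluster pixels of shape (n_pixels, n_bands) into k groups."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pixels picked at random as initial cluster means.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = nearest_center(pixels, centers)
        for c in range(k):
            members = pixels[labels == c]
            if len(members) > 0:  # keep the old mean if a cluster is empty
                centers[c] = members.mean(axis=0)
    return nearest_center(pixels, centers), centers

# Hypothetical two-band (red, NIR) pixels: three water-like, three vegetation-like.
pixels = np.array([[0.05, 0.02], [0.06, 0.03], [0.07, 0.04],
                   [0.04, 0.45], [0.05, 0.50], [0.06, 0.55]])
labels, centers = kmeans(pixels, k=2)
print(labels)
```

The cluster numbers carry no meaning by themselves; the analyst still has to decide which land cover type each cluster represents.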

Vegetation indices classification
Vegetation indices classification is a system in which two or more spectral bands are combined through defined mathematical formulas to highlight the properties of vegetation cover.

Most of these indices make use of the relationship between red and near-infrared (NIR) bands of satellite images to generate vegetation properties. Several vegetation indices have been developed; scientists apply these via remote sensing to effectively classify forest cover and land use patterns.

Because they combine two or more bands, these spectral indices emphasize differences in the surface reflectance of land features, which can improve classification accuracy.

Vegetation indices

 * Normalized difference vegetation index (NDVI) – Defined as the normalized difference between the near-infrared (NIR) and red bands of satellite images. It is calculated as:
 * $$\text{NDVI} = {(\text{NIR} - \text{Red}) \over (\text{NIR} + \text{Red})}$$
 * This index measures vegetation greenness, with values ranging between -1 and 1. High NDVI values represent dense vegetation cover, moderate NDVI values represent sparse vegetation cover, and low NDVI values correspond to non-vegetated areas (e.g., barren or bare lands).
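
As a rough illustration, NDVI can be computed per pixel with NumPy as sketched below. The function name, the sample reflectance values, and the 0.2 and 0.5 cut-offs used to separate non-vegetated, sparse, and dense cover are assumptions made for this example; suitable thresholds depend on the sensor and the study area.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); pixels where NIR + Red is zero become NaN."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical surface reflectance values for three pixels.
nir = np.array([0.50, 0.30, 0.05])
red = np.array([0.05, 0.15, 0.04])
values = ndvi(nir, red)

# Assumed cut-offs: 0 = non-vegetated, 1 = sparse vegetation, 2 = dense vegetation.
classes = np.digitize(values, bins=[0.2, 0.5])
print(values, classes)
```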


 * Enhanced vegetation index (EVI) – Computed from the NIR, red, and blue bands, with a gain factor (G), a soil brightness correction factor (L), and atmospheric aerosol correction factors (C_1 and C_2). It is calculated as:
 * $$\text{EVI} = G \times {(\text{NIR}-\text{Red}) \over (\text{NIR} + C_1 \times \text{Red} - C_2 \times \text{Blue} + L)} $$
 * with commonly used values of G = 2.5, C_1 = 6, C_2 = 7.5, and L = 1.
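
A minimal sketch of the EVI formula in NumPy follows. The function name and the sample reflectances are hypothetical, and the default coefficients (G = 2.5, C1 = 6, C2 = 7.5, L = 1) are an assumption here: they are the values commonly used with MODIS data and may need to be changed for other products.

```python
import numpy as np

def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    """Enhanced vegetation index from surface reflectance bands.

    The default coefficients are assumed (commonly used MODIS values).
    """
    nir, red, blue = (np.asarray(band, dtype=float) for band in (nir, red, blue))
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# Hypothetical reflectance values for a single vegetated pixel.
value = evi(nir=0.50, red=0.05, blue=0.03)
print(float(value))
```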


 * Soil adjusted vegetation index (SAVI) – A modification of NDVI that adds a soil brightness correction factor (L) to the red and NIR values. It is calculated as:
 * $$\text{SAVI} = (1 + L) \times {(\text{NIR} - \text{Red}) \over (\text{NIR} + \text{Red} + L)}$$
 * where L is commonly set to 0.5.
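
The effect of the correction factor can be seen in a short NumPy sketch. The function name and the sample values are hypothetical, and the default L = 0.5 is an assumed, commonly used setting. With L = 0 the expression reduces to NDVI.

```python
import numpy as np

def savi(nir, red, L=0.5):
    """Soil adjusted vegetation index; L = 0.5 is an assumed default."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (1 + L) * (nir - red) / (nir + red + L)

# Hypothetical reflectance values for a single pixel.
nir, red = 0.50, 0.05
with_correction = savi(nir, red)            # L = 0.5
without_correction = savi(nir, red, L=0.0)  # same as NDVI
print(float(with_correction), float(without_correction))
```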


 * Canopy shadow index (SI) – Defined as the square root of the product of the inverted green and red bands of satellite images. It evaluates the different shadow patterns of forest canopies based on age, structure, and composition, and readily differentiates dense forests from grass and bare lands. It is calculated as:
 * $$\text{SI} = \sqrt{(256-\text{Green}) \times (256-\text{Red})}$$
 * where green and red are 8-bit digital numbers ranging from 0 to 255.


 * Advanced vegetation index (AVI) – Used to differentiate forest cover from grassland and bare land areas. It is calculated as:
 * $$\text{AVI} = \sqrt[3]{(\text{NIR} + 1) \times (256 - \text{Red}) \times (\text{NIR} - \text{Red})}$$
 * where NIR and red are 8-bit digital numbers ranging from 0 to 255.
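
The SI and AVI formulas can be sketched together in NumPy, assuming 8-bit band values. The function names and sample values are hypothetical. `np.cbrt` is used for AVI because it returns a real cube root even when red exceeds NIR and the product is negative, whereas raising a negative float to the power 1/3 would give NaN.

```python
import numpy as np

def shadow_index(green, red):
    """SI = sqrt((256 - Green) * (256 - Red)) for 8-bit band values."""
    green = np.asarray(green, dtype=float)
    red = np.asarray(red, dtype=float)
    return np.sqrt((256 - green) * (256 - red))

def advanced_vegetation_index(nir, red):
    """AVI = cube root of (NIR + 1) * (256 - Red) * (NIR - Red) for 8-bit band values."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return np.cbrt((nir + 1) * (256 - red) * (nir - red))

# Hypothetical 8-bit digital numbers for a single pixel.
si = shadow_index(green=156, red=156)
avi = advanced_vegetation_index(nir=124, red=60)
print(float(si), float(avi))
```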


 * Bare soil index (BSI) – Defined as a normalized ratio of the NIR, green, and red bands of satellite images. It measures the amount of bare soil and as such increases with decreasing forest density. It is calculated as:
 * $$\text{BSI} = {(\text{NIR}+\text{Green})- \text{Red} \over (\text{NIR}+\text{Green})+ \text{Red}}$$


 * Normalized difference water index (NDWI) – Developed for quantifying the water content of plants and other earth system features, using the NIR and short-wave infrared (SWIR) bands. It is calculated as:
 * $$\text{NDWI} = {\text{NIR} - \text{SWIR} \over \text{NIR} + \text{SWIR}}$$


 * Normalized difference built-up index (NDBI) – Developed for quantifying built-up areas in satellite images, using the SWIR and NIR bands. It is calculated as:
 * $$\text{NDBI} = {\text{SWIR} - \text{NIR} \over \text{SWIR} + \text{NIR}}$$
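
NDWI and NDBI share the same normalized difference form, so both can be computed with one helper, sketched here in NumPy. The helper name and the sample reflectances are hypothetical, and the sketch assumes the two bands never sum to zero. As defined above, NDBI is NDWI with the sign reversed whenever both use the same NIR and SWIR bands.

```python
import numpy as np

def normalized_difference(a, b):
    """(a - b) / (a + b); assumes a + b is never zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a - b) / (a + b)

# Hypothetical reflectance values for a single pixel.
nir, swir = 0.40, 0.20
ndwi = normalized_difference(nir, swir)  # (NIR - SWIR) / (NIR + SWIR)
ndbi = normalized_difference(swir, nir)  # (SWIR - NIR) / (SWIR + NIR)
print(float(ndwi), float(ndbi))
```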