Georeferencing

Georeferencing or georegistration is a type of coordinate transformation that binds a digital raster image or vector database that represents a geographic space (usually a scanned map or aerial photograph) to a spatial reference system, thus locating the digital data in the real world. It is thus the geographic form of image registration. The term can refer to the mathematical formulas used to perform the transformation, the metadata stored alongside or within the image file to specify the transformation, or the process of manually or automatically aligning the image to the real world to create such metadata. The most common result is that the image can be visually and analytically integrated with other geographic data in geographic information systems and remote sensing software.

A number of mathematical methods are available, but the process typically involves identifying several sample ground control points with known locations on the image and the ground, then using curve fitting techniques to generate a parametric (or piecewise parametric) formula to transform the rest of the image. Once the parameters of the formula are stored, the image may be transformed dynamically at drawing time, or resampled to generate a georeferenced raster GIS file or orthophoto.

The term georeferencing has also been used to refer to other types of transformation from general expressions of geographic location (geocodes) to coordinate measurements, but most of these other methods are more commonly called geocoding. Because of this ambiguity, Georegistration is preferred by some to refer to the image transformation. Occasionally, this process has been called rubbersheeting, but that term is more commonly applied to a very similar process applied to vector GIS data.

Motivation

 * Georeferencing is crucial to make aerial and satellite imagery, usually raster images, useful for mapping as it explains how other data, such as the above GPS points, relate to the imagery.
 * Very essential information may be contained in data or images that were produced at a different point of time. It may be desired either to combine or compare this data with that currently available. The latter can be used to analyze the changes in the features under study over a period of time.
 * Different maps may use different projection systems. Georeferencing tools contain methods to combine and overlay these maps with minimum distortion.

Mathematics
The registration of an image to a geographic space is essentially the transformation from an input coordinate system (the inherent coordinates of pixels in the images based on row and column number) to an output coordinate system, a spatial reference system of the user's choice, such as the geographic coordinate system or a particular Universal Transverse Mercator zone. It is thus the extension of the typical task of curve fitting a relationship between two variables to four dimensions. The goal is to have a pair of functions of the form:


 * $$x_{out} = F(x_{in}, y_{in})$$
 * $$y_{out} = G(x_{in}, y_{in})$$

Such that for every pixel in the image ($$x_{in}, y_{in}$$ being its column and row number, respectively), a corresponding real-world coordinate can be calculated.

Several types of functions are available in most GIS and remote sensing software for georeferencing. As the simplest type of two-dimensional curve is a straight line, so the simplest form of coordinate transformation is a linear transformation, the most common type being the affine transformation:


 * $$x_{out} = Ax_{in} + By_{in} + C$$
 * $$y_{out} = Dx_{in} + Ey_{in} + F$$

Where A-F are constant coefficients set for the entire image. These formulas allow an image to be moved (the C and F coefficients specify the desired location of the top left corner of the image), scaled (without rotation, the A and E coefficients specify the size of each cell or spatial resolution), and rotated. In the last case, if the cell size is r in both the x and y directions, and the image is to be rotated α degrees counter-clockwise, then $$A = E = r\cos(\alpha), B = D = r\sin(\alpha)$$. The world file developed by Esri is a commonly used sidecar file that specifies these six coefficients for image georeferencing.

Higher order polynomial transformations are also commonly used. For example, a Second-order polynomial transformation would be:


 * $$x_{out} = Ax_{in} + By_{in} + Cx_{in}^2 + Dy_{in}^2 + Ex_{in}y_{in} + F$$
 * $$y_{out} = Gx_{in} + Hy_{in} + Ix_{in}^2 + Jy_{in}^2 + Kx_{in}y_{in} + L$$

The second-order terms (and third-order terms in a third-order polynomial) allow for the variable warping of the image, which is especially useful for removing the inherent distortion in aerial photographs.

In addition to global parametric formulas, piecewise formulas can also be used, which transform different parts of the image in different ways. A common example is a Thin plate spline transformation.

The GCP method
It is very rare that a user would specify the parameters for the transformation directly. Instead, most GIS and remote sensing software provides an interactive environment for visually aligning the image to the destination coordinate system. The most common method for doing this is to create a series of ground control points (GCP). A ground control point is a location that can be identified on both the image and the ground, so that it has precise coordinates in both the image coordinate system ($$x_{in}$$ = pixel column, $$y_{in}$$ = pixel row) and the ground coordinate system ($$x_{out}, y_{out}$$). Easily visible locations that be precisely located are preferred as GCP's, such as a road intersection or the corner of a building. When very high accuracy registration is required, it is common to place or paint high-contrast markers on the ground at survey control monuments before the photography is taken, and use GNSS-measured coordinates for the output. In most software, these are entered by pointing at the location on the image, then pointing at the same location on a vector base map or orthophoto that is already in the desired coordinate system. This can then be moved and adjusted to improve accuracy.

With a minimal set of GCPs, the known coordinates can be entered into the mathematical equations for the desired type of transformation, which can then be solved using linear algebra to determine the coefficients and derive the formulas to use for the entire grid. For example, the linear affine transformation above has six unknown coefficients, so six equations with known <$$x_{in}, y_{in}, x_{out}, y_{out}$$> are needed to derive them, which will require three ground control points. The second-order polynomial requires a minimum of six ground control points, and so on.

The entered GCPs are rarely perfectly located and are even more rarely perfectly representative of the distortion in the rest of the image, but the algebraic solution, which appears to be a perfect match, masks any error. To avoid this, it is common to create many more than the minimal required set (creating an overdetermined system), and use least squares regression to derive a set of function parameters that most closely matches the points. This is almost never a perfect match, so the variance between each GCP location and the location predicted by the functions can be measured and summarized as a Root-mean-square error (RMSE). A lower RMSE thus means that the transformation formulas closely match the GCPs.

Once the function parameters are determined, the transformation functions can be used to transform every pixel of the image to its real-world location. Two options are usually available for making this transformation permanent. One option is to save the parameters themselves as a form of metadata, either in the header of the image file itself (e.g., GeoTIFF), or in a sidecar file stored alongside the image file (e.g., a world file). With this metadata, the software can perform the transformation dynamically as it displays the image, so that it appears to align with other data in the desired coordinate system. The alternative method is rectification, in which the image is resampled to create a new raster grid that is natively tied to the coordinate system. Rectification was traditionally the only option, until the computing power became available for the intense calculations of dynamic coordinate transformations; even now, drawing and analysis performance is better with a rectified image.

Software implementations

 * Esri GIS software has had this capability for many years, including the Georeferencing tool in ArcGIS Pro.
 * QGIS has a Georeferencer tool, originally developed as an add-on but now integrated into the software.
 * Image Georeferencing and Rectification in ERDAS Imagine
 * Image to Map Registration in ENVI
 * Allmaps supports various transformation algorithms (including the Thin Plate Spline transformation), and the computation of distortions, through it's @allmaps/transform package.