Hessian affine region detector

The Hessian affine region detector is a feature detector used in the fields of computer vision and image analysis. Like other feature detectors, the Hessian affine detector is typically used as a preprocessing step to algorithms that rely on identifiable, characteristic interest points.

The Hessian affine detector belongs to the subclass of feature detectors known as affine-invariant detectors, which also includes the Harris affine region detector, maximally stable extremal regions (MSER), the Kadir–Brady saliency detector, edge-based regions (EBR) and intensity-extrema-based regions (IBR).

Algorithm description
The Hessian affine detector algorithm is almost identical to the Harris affine region detector. In fact, both algorithms were derived by Krystian Mikolajczyk and Cordelia Schmid in 2002, building on earlier work.

How does the Hessian affine differ?
The Harris affine detector relies on interest points detected at multiple scales using the Harris corner measure on the second-moment matrix. The Hessian affine detector also uses a multi-scale iterative algorithm to spatially localize and select scale- and affine-invariant points. However, at each individual scale, the Hessian affine detector chooses interest points based on the Hessian matrix at that point:

$$ H(\mathbf{x}) = \begin{bmatrix} L_{xx}(\mathbf{x}) & L_{xy}(\mathbf{x})\\ L_{yx}(\mathbf{x}) & L_{yy}(\mathbf{x})\\ \end{bmatrix} $$

where $$L_{aa}(\mathbf{x})$$ is the second partial derivative in the $$a$$ direction and $$L_{ab}(\mathbf{x})$$ is the mixed second partial derivative in the $$a$$ and $$b$$ directions. Note that the derivatives are computed at the current iteration scale and are therefore derivatives of an image smoothed by a Gaussian kernel: $$L(\mathbf{x}) = g(\sigma_I) \otimes I(\mathbf{x}) $$. As discussed in the Harris affine region detector article, the derivatives must be scaled appropriately by a factor related to the Gaussian kernel: $$\sigma_I^2$$.

At each scale, interest points are those points that are simultaneously local extrema of both the determinant and the trace of the Hessian matrix. The trace of the Hessian matrix is identical to the Laplacian of Gaussians (LoG):

$$\begin{align} DET &= \sigma_I^2 \left( L_{xx}(\mathbf{x}) L_{yy}(\mathbf{x}) - L_{xy}^2(\mathbf{x}) \right) \\ TR &= \sigma_I \left( L_{xx}(\mathbf{x}) + L_{yy}(\mathbf{x}) \right) \end{align} $$
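As a rough illustration of the two measures at a single scale, the following NumPy sketch computes the Gaussian-smoothed second derivatives with separable derivative-of-Gaussian filters, forms DET and TR with the factors as written above, and keeps pixels where DET is a strict local maximum and |TR| is a strict local maximum over the 8 neighbours. The function names, the 4σ kernel truncation, the zero-padded borders and the relative threshold `rel_thresh` are assumptions made for this example and are not taken from the original implementation. At a fixed scale the σ factors are constants, so they do not move the spatial extrema.

```python
import numpy as np

def gaussian_kernels(sigma):
    """1-D sampled Gaussian and its first and second derivatives (truncated at 4*sigma)."""
    radius = int(np.ceil(4 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    g1 = -t / sigma**2 * g
    g2 = (t**2 - sigma**2) / sigma**4 * g
    return g, g1, g2

def separable_filter(image, k_rows, k_cols):
    """Convolve along axis 0 (y) with k_rows, then along axis 1 (x) with k_cols.
    np.convolve with mode="same" implicitly zero-pads the borders."""
    tmp = np.apply_along_axis(np.convolve, 0, image, k_rows, mode="same")
    return np.apply_along_axis(np.convolve, 1, tmp, k_cols, mode="same")

def hessian_measures(image, sigma_i):
    """Return (DET, TR) of the Hessian of the image smoothed at scale sigma_i."""
    g, g1, g2 = gaussian_kernels(sigma_i)
    Lxx = separable_filter(image, g, g2)
    Lyy = separable_filter(image, g2, g)
    Lxy = separable_filter(image, g1, g1)
    det = sigma_i**2 * (Lxx * Lyy - Lxy**2)
    tr = sigma_i * (Lxx + Lyy)
    return det, tr

def is_local_max(a):
    """True where a pixel is strictly greater than all 8 neighbours."""
    h, w = a.shape
    p = np.pad(a, 1, mode="constant", constant_values=-np.inf)
    out = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                out &= a > p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out

def hessian_points(image, sigma_i, rel_thresh=0.1):
    """Candidate points at one scale: local maxima of DET and of |TR|.
    rel_thresh is a hypothetical parameter that suppresses weak responses."""
    det, tr = hessian_measures(image, sigma_i)
    strong = det > rel_thresh * det.max()
    return np.argwhere(is_local_max(det) & is_local_max(np.abs(tr)) & strong)

# Synthetic test image: one bright Gaussian blob centred at row 32, column 32.
yy, xx = np.mgrid[0:65, 0:65]
blob = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 4.0**2))
points = hessian_points(blob, sigma_i=3.0)
```

On this synthetic image the only candidate is the blob centre, where both second derivatives are negative, so DET is positive and TR is negative.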

As discussed in Mikolajczyk et al. (2005), choosing points that maximize the determinant of the Hessian penalizes elongated structures that have small second derivatives (signal changes) in a single direction. This type of measure is very similar to the measures used in the blob detection schemes proposed by Lindeberg (1998), where either the Laplacian or the determinant of the Hessian was used in blob detection methods with automatic scale selection.

As in the Harris affine algorithm, these interest points based on the Hessian matrix are also spatially localized using an iterative search based on the Laplacian of Gaussians. Accordingly, these interest points are called Hessian–Laplace interest points. Starting from these initially detected points, the Hessian affine detector then uses an iterative shape adaptation algorithm to compute the local affine transformation for each interest point. The implementation of this algorithm is almost identical to that of the Harris affine detector; however, the above-mentioned Hessian measure replaces all instances of the Harris corner measure.
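The scale-selection idea behind the Hessian–Laplace points can be illustrated with a simplified, non-iterative sketch: for a fixed pixel, evaluate the Laplacian of Gaussians over a small set of candidate scales and keep the scale with the strongest response. This is not the iterative search described above. The function names and the candidate scale list are hypothetical, and the sketch assumes that each second derivative is normalized by σ², the factor named earlier in the text and the usual choice in Lindeberg's automatic scale selection.

```python
import numpy as np

def normalized_laplacian(image, sigma):
    """Scale-normalized Laplacian sigma^2 * (Lxx + Lyy) at scale sigma."""
    radius = int(np.ceil(4 * sigma))          # kernel truncated at 4*sigma (assumption)
    t = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    g2 = (t**2 - sigma**2) / sigma**4 * g     # second derivative of the Gaussian

    def filt(k_rows, k_cols):
        # Separable convolution: axis 0 (y) first, then axis 1 (x); borders are zero-padded.
        tmp = np.apply_along_axis(np.convolve, 0, image, k_rows, mode="same")
        return np.apply_along_axis(np.convolve, 1, tmp, k_cols, mode="same")

    return sigma**2 * (filt(g, g2) + filt(g2, g))

def characteristic_scale(image, point, scales):
    """Return the candidate scale with the largest |normalized LoG| at `point` (row, col)."""
    y, x = point
    responses = [abs(normalized_laplacian(image, s)[y, x]) for s in scales]
    return scales[int(np.argmax(responses))]

# Synthetic blob with standard deviation 4 pixels, centred at row 32, column 32.
yy, xx = np.mgrid[0:65, 0:65]
blob = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 4.0**2))
best = characteristic_scale(blob, (32, 32), [2.0, 3.0, 4.0, 5.0, 6.0])
```

For an ideal Gaussian blob of standard deviation b, the σ²-normalized Laplacian at the blob centre has magnitude proportional to σ²/(b² + σ²)², which peaks at σ = b, so the selected scale tracks the size of the structure.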

Robustness to affine and other transformations
Mikolajczyk et al. (2005) carried out a thorough analysis of several state-of-the-art affine region detectors: Harris affine, Hessian affine, MSER, IBR, EBR and salient region detectors. Their evaluation covered both structured images and textured images. Linux binaries of the detectors and their test images are freely available at their webpage. A brief summary of the results of Mikolajczyk et al. (2005) follows; see "A comparison of affine region detectors" for a more quantitative analysis.

Overall, the Hessian affine detector ranks second to MSER. Like those of the Harris affine detector, Hessian affine interest regions tend to be more numerous and smaller than those of other detectors. For a single image, the Hessian affine detector typically identifies more reliable regions than the Harris affine detector. Performance varies with the type of scene being analyzed. The Hessian affine detector responds well to textured scenes that contain many corner-like parts. It also performs very well on some structured scenes, such as buildings. This complements MSER, which tends to do better on well-structured (segmentable) scenes.

Software packages

 * Affine Covariant Features: K. Mikolajczyk maintains a web page that contains Linux binaries of the Hessian affine detector in addition to other detectors and descriptors. Matlab code is also available that can be used to illustrate and compute the repeatability of various detectors. Code and images are also available to reproduce the results found in the Mikolajczyk et al. (2005) paper.
 * lip-vireo: binary code for Linux, Windows and SunOS from the VIREO research group; see the project homepage for more information.