Median filter

The median filter is a non-linear digital filtering technique, often used to remove noise from an image or signal. Such noise reduction is a typical pre-processing step to improve the results of later processing (for example, edge detection on an image). Median filtering is very widely used in digital image processing because, under certain conditions, it preserves edges while removing noise (but see the discussion below for which kinds of noise). It also has applications in signal processing.

Algorithm description
The main idea of the median filter is to run through the signal entry by entry, replacing each entry with the median of the entry and its neighboring entries. The idea is very similar to a moving average filter, which replaces each entry with the arithmetic mean of the entry and its neighbors. The pattern of neighbors is called the "window", which slides, entry by entry, over the entire signal. For one-dimensional signals, the most obvious window is just the first few preceding and following entries, whereas for two-dimensional (or higher-dimensional) data, the window must include all entries within a given radius or ellipsoidal or rectangular region (i.e., the median filter is not a separable filter).

Worked one-dimensional example
To demonstrate, a median filter with a window size of three (one entry immediately preceding and one immediately following each entry) and zero-padded boundaries will be applied to the following simple one-dimensional signal:


 * x = (2, 3, 80, 6, 2, 3).

This signal has mainly small-valued entries, except for one entry that is unusually high and is considered to be a noise spike; the aim is to eliminate it. The median-filtered output signal y will be:


 * y0 = med(0, 2, 3) = 2, (the boundary value is taken to be 0)
 * y1 = med(2, 3, 80) = 3, (2, 3, and 80 are already in increasing order, so no rearrangement is needed)
 * y2 = med(3, 80, 6) = med(3, 6, 80) = 6, (3, 80, and 6 are rearranged to find the median)
 * y3 = med(80, 6, 2) = med(2, 6, 80) = 6,
 * y4 = med(6, 2, 3) = med(2, 3, 6) = 3,
 * y5 = med(2, 3, 0) = med(0, 2, 3) = 2,

i.e.,


 * y = (2, 3, 6, 6, 3, 2).

It is clear that the noise spike has been essentially eliminated (and the signal has also been smoothed a bit). The result of a moving average filter with the same window width and the same zero-padded boundaries on the same dataset would be y = (1.7, 28.3, 29.7, 29.3, 3.7, 1.7). It can be seen that the noise spike has contaminated neighbouring elements in the moving average signal, and that the median filter has performed much better (for this type of impulse noise). Median filtering works well for both positive impulses (spikes) and negative impulses (dropouts), so long as a window can be chosen so that the number of entries contaminated by impulse noise is (almost) always smaller than half of the window size.

Boundary issues
When implementing a median filter, the boundaries of the signal must be handled with special care, as there are not enough entries to fill an entire window. There are several schemes that have different properties that might be preferred in particular circumstances:


 * Repeating the boundary value as often as needed to fill the window when calculating the median of a value near the boundary.
 * Avoiding processing the boundaries, with or without cropping the signal or image boundary afterwards.
 * Fetching entries from other places in the signal, such as values from the far end (repeating, or periodic, boundary conditions) or values near the same boundary in reverse order (reflected boundary conditions). With 2D images, for example, entries from the far horizontal or vertical boundary might be selected, or the points at the same boundary might be repeated in reverse order.
 * Shrinking the window near the boundaries, so that every window is full.
 * Assuming zero-padded boundaries.
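The schemes listed above could be sketched for a one-dimensional signal as follows. The mode names (`"zero"`, `"repeat"`, `"wrap"`, `"reflect"`, `"shrink"`, `"skip"`) and the helper names are labels invented for this illustration and do not correspond to any particular library. The sketch assumes an odd window whose radius is smaller than the signal length, and it interprets "reflected" as mirroring about the boundary entry without duplicating it, which is only one of several possible conventions.

```python
from statistics import median

def pad(signal, radius, mode):
    """Extend a list by `radius` entries on each side (assumes radius < len(signal))."""
    s = list(signal)
    n = len(s)
    if mode == "zero":
        left, right = [0] * radius, [0] * radius
    elif mode == "repeat":      # repeat the boundary value
        left, right = [s[0]] * radius, [s[-1]] * radius
    elif mode == "wrap":        # fetch values from the far end
        left, right = s[n - radius:], s[:radius]
    elif mode == "reflect":     # mirror about the boundary entry
        left, right = s[1:radius + 1][::-1], s[-radius - 1:-1][::-1]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return left + s + right

def median_filter_with_boundary(signal, window_size=3, mode="repeat"):
    s = list(signal)
    n, radius = len(s), window_size // 2
    if mode == "shrink":
        # Truncate the window at the ends; a window with an even number of
        # entries yields the mean of its two middle values.
        return [median(s[max(0, i - radius):i + radius + 1]) for i in range(n)]
    if mode == "skip":
        # Leave entries that lack a full window unchanged.
        return [median(s[i - radius:i + radius + 1]) if radius <= i < n - radius else s[i]
                for i in range(n)]
    padded = pad(s, radius, mode)
    return [median(padded[i:i + window_size]) for i in range(n)]

x = [2, 3, 80, 6, 2, 3]
for mode in ("zero", "repeat", "wrap", "reflect", "shrink", "skip"):
    print(mode, median_filter_with_boundary(x, 3, mode))
```

Only the first and last output entries differ between the schemes for a window of three; the interior values are identical because their windows never touch the boundary.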

Two-dimensional median filter pseudo code
Code for a simple two-dimensional median filter algorithm might look like this:

    // loop ranges are inclusive at both ends
    allocate outputPixelValue[image width][image height]
    allocate window[window width × window height]
    edgex := (window width / 2) rounded down
    edgey := (window height / 2) rounded down

    for x from edgex to image width - edgex - 1 do
        for y from edgey to image height - edgey - 1 do
            i := 0
            for fx from 0 to window width - 1 do
                for fy from 0 to window height - 1 do
                    window[i] := inputPixelValue[x + fx - edgex][y + fy - edgey]
                    i := i + 1
            sort entries in window[]
            outputPixelValue[x][y] := window[(window width × window height / 2) rounded down]

This algorithm:
 * Processes one color channel only,
 * Takes the "not processing boundaries" approach (see above discussion about boundary issues).
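A direct Python rendering of this pseudocode might look like the sketch below. The function name `median_filter_2d_interior` and the choice to store the image as a list of rows are assumptions made for this illustration. The pseudocode leaves the unprocessed border pixels undefined; here they are simply copied from the input, which is one possible choice.

```python
def median_filter_2d_interior(image, window_w=3, window_h=3):
    """Median-filter one channel, given as a list of rows, skipping the borders."""
    height, width = len(image), len(image[0])
    edge_x, edge_y = window_w // 2, window_h // 2
    # Assumption: border pixels that lack a full window keep their input values.
    output = [row[:] for row in image]
    for y in range(edge_y, height - edge_y):
        for x in range(edge_x, width - edge_x):
            # Gather every pixel under the window centred on (x, y).
            window = [image[y + fy - edge_y][x + fx - edge_x]
                      for fy in range(window_h) for fx in range(window_w)]
            window.sort()
            output[y][x] = window[len(window) // 2]
    return output

noisy = [
    [10, 10, 10, 10],
    [10, 200, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
for row in median_filter_2d_interior(noisy):
    print(row)  # every row prints [10, 10, 10, 10]
```

The single outlier of 200 is removed because every 3×3 window that contains it also contains eight values of 10.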



Algorithm implementation issues
Typically, by far the majority of the computational effort and time is spent on calculating the median of each window. Because the filter must process every entry in the signal, for large signals such as images, the efficiency of this median calculation is a critical factor in determining how fast the algorithm can run. The naïve implementation described above sorts every entry in the window to find the median; however, since only the middle value in a list of numbers is required, selection algorithms can be much more efficient. Furthermore, some types of signals (very often the case for images) use whole number representations: in these cases, histogram medians can be far more efficient because it is simple to update the histogram from window to window, and finding the median of a histogram is not particularly onerous.
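The histogram approach can be illustrated for a one-dimensional signal of small non-negative integers. This is a simplified sketch under stated assumptions: the names `histogram_median` and `running_median_hist` are hypothetical, values are assumed to lie in the range 0 to `levels - 1`, the window size is assumed to be odd and no larger than the signal, and only positions with a full window are processed. For brevity the histogram is rescanned from zero for each window; more refined variants keep track of the median position between windows instead.

```python
def histogram_median(hist, count):
    """Return the median of `count` values (count odd) described by a histogram."""
    target = count // 2 + 1  # rank of the median, counting from 1
    cumulative = 0
    for value, occurrences in enumerate(hist):
        cumulative += occurrences
        if cumulative >= target:
            return value
    raise ValueError("histogram holds fewer than `count` values")

def running_median_hist(signal, window_size=3, levels=256):
    """Median of every full window, updating one histogram as the window slides."""
    hist = [0] * levels
    for v in signal[:window_size]:
        hist[v] += 1
    out = [histogram_median(hist, window_size)]
    for i in range(window_size, len(signal)):
        hist[signal[i - window_size]] -= 1  # entry leaving the window
        hist[signal[i]] += 1                # entry entering the window
        out.append(histogram_median(hist, window_size))
    return out

print(running_median_hist([2, 3, 80, 6, 2, 3]))  # [3, 6, 6, 3]
```

Each step changes only two histogram bins regardless of the window size, which is why this approach scales well to large windows on data with few distinct levels.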

Worked two-dimensional example
The median filter operates by considering a local window (also known as a kernel) around each pixel in the image. The steps for applying the median filter are as follows:


 1. Window selection:
    * Choose a window of a specific size (e.g., 3x3, 5x5) centered around the pixel to be filtered.
    * For our example, let’s use a 3x3 window.
 2. Collect pixel values:
    * Collect the pixel values within the window.
    * For the center pixel, we have the following values:
      * Window: $$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 8 & 6 \\ 7 & 5 & 9 \end{bmatrix}$$
      * Center pixel: 8
 3. Sort the values:
    * Sort the collected pixel values in ascending order.
    * For the center pixel, the sorted values are: [1, 2, 3, 4, 5, 6, 7, 8, 9]
 4. Choose the median value:
    * The median value is the middle value in the sorted list.
    * In our case, the median value is 5.
 5. Replace the center pixel:
    * Replace the original center pixel value (8) with the median value (5).
 6. Repeat for all pixels:
    * Repeat steps 2-5 for all pixels in the image.

Converted image
After applying the median filter to all pixels, the converted image becomes:

$$ \begin{bmatrix} 2 & 3 & 3 \\ 4 & 5 & 6 \\ 7 & 7 & 8 \end{bmatrix}$$

In the filtered image, the outlying center value (8) has been replaced by the local median (5), while the general trend of values increasing from the top-left to the bottom-right corner is preserved. Note that this result assumes that the edge pixels are handled by adding virtual rows and columns that repeat the border pixel values.
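The converted image can be checked with a short Python sketch. The function name `median_filter_2d_replicate` is hypothetical. Instead of building the virtual rows and columns explicitly, the code clamps out-of-range coordinates to the nearest valid pixel, which has the same effect as repeating the border values.

```python
def median_filter_2d_replicate(image, size=3):
    """Median-filter a list-of-rows image, repeating border pixels past the edges."""
    height, width = len(image), len(image[0])
    radius = size // 2

    def pixel(y, x):
        # Clamping the coordinates repeats the nearest border pixel.
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    output = []
    for y in range(height):
        row = []
        for x in range(width):
            window = sorted(pixel(y + dy, x + dx)
                            for dy in range(-radius, radius + 1)
                            for dx in range(-radius, radius + 1))
            row.append(window[len(window) // 2])
        output.append(row)
    return output

image = [[1, 2, 3],
         [4, 8, 6],
         [7, 5, 9]]
print(median_filter_2d_replicate(image))  # [[2, 3, 3], [4, 5, 6], [7, 7, 8]]
```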

Edge preservation properties
Median filtering is one kind of smoothing technique, as is linear Gaussian filtering. All smoothing techniques are effective at removing noise in smooth patches or smooth regions of a signal, but adversely affect edges. Often, though, it is important to preserve the edges while reducing the noise in a signal. Edges are of critical importance to the visual appearance of images, for example. For small to moderate levels of Gaussian noise, the median filter is demonstrably better than Gaussian blur at removing noise whilst preserving edges for a given, fixed window size. However, its performance is not much better than Gaussian blur for high levels of noise, whereas, for speckle noise and salt-and-pepper noise (impulsive noise), it is particularly effective. Because of this, median filtering is very widely used in digital image processing.
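The effect on edges can be seen on a noise-free step signal. In the sketch below a plain moving average stands in for a linear smoothing filter; it is not a Gaussian blur, but it blurs edges in a similar way. The helper names `smooth_1d` and `average` are hypothetical, and the boundary value is repeated to fill the end windows.

```python
from statistics import median

def average(values):
    return sum(values) / len(values)

def smooth_1d(signal, reducer, window_size=3):
    """Apply `reducer` to each sliding window, repeating the boundary values."""
    radius = window_size // 2
    padded = [signal[0]] * radius + list(signal) + [signal[-1]] * radius
    return [reducer(padded[i:i + window_size]) for i in range(len(signal))]

step = [0, 0, 0, 0, 10, 10, 10, 10]
print(smooth_1d(step, median))                          # [0, 0, 0, 0, 10, 10, 10, 10]
print([round(v, 1) for v in smooth_1d(step, average)])  # [0.0, 0.0, 0.0, 3.3, 6.7, 10.0, 10.0, 10.0]
```

The median output reproduces the step exactly, because every window contains a majority of values from one side of the edge, whereas the average introduces intermediate values on both sides of it.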