Box counting



Box counting is a method of gathering data for analyzing complex patterns by breaking a dataset, object, image, etc. into smaller and smaller pieces, typically "box"-shaped, and analyzing the pieces at each smaller scale. The essence of the process has been compared to zooming in or out using optical or computer based methods to examine how observations of detail change with scale. In box counting, however, rather than changing the magnification or resolution of a lens, the investigator changes the size of the element used to inspect the object or pattern (see Figure 1). Computer based box counting algorithms have been applied to patterns in 1-, 2-, and 3-dimensional spaces. The technique is usually implemented in software for use on patterns extracted from digital media, although the fundamental method can be used to investigate some patterns physically. The technique arose out of and is used in fractal analysis. It also has application in related fields such as lacunarity and multifractal analysis.

The method
Theoretically, the intent of box counting is to quantify fractal scaling, but from a practical perspective this would require that the scaling be known ahead of time. This can be seen in Figure 1 where choosing boxes of the right relative sizes readily shows how the pattern repeats itself at smaller scales. In fractal analysis, however, the scaling factor is not always known ahead of time, so box counting algorithms attempt to find an optimized way of cutting a pattern up that will reveal the scaling factor. The fundamental method for doing this starts with a set of measuring elements—boxes—consisting of an arbitrary number, called $$\Epsilon$$ here for convenience, of sizes or calibres, which we will call the set of $$\epsilon$$s. Then these $$\epsilon$$-sized boxes are applied to the pattern and counted. To do this, for each $$\epsilon$$ in $$\Epsilon$$, a measuring element that is typically a 2-dimensional square or 3-dimensional box with side length corresponding to $$\epsilon$$ is used to scan a pattern or data set (e.g., an image or object) according to a predetermined scanning plan to cover the relevant part of the data set, recording, i.e.,counting, for each step in the scan relevant features captured within the measuring element.

The data
The relevant features gathered during box counting depend on the subject being investigated and the type of analysis being done. Two well-studied subjects of box counting, for instance, are binary (meaning having only two colours, usually black and white) and gray-scale digital images (i.e., jpegs, tiffs, etc.). Box counting is generally done on patterns extracted from such still images in which case the raw information recorded is typically based on features of pixels such as a predetermined colour value or range of colours or intensities. When box counting is done to determine a fractal dimension known as the box counting dimension, the information recorded is usually either yes or no as to whether or not the box contained any pixels of the predetermined colour or range (i.e., the number of boxes containing relevant pixels at each $$\epsilon$$ is counted). For other types of analysis, the data sought may be the number of pixels that fall within the measuring box, the range or average values of colours or intensities, the spatial arrangement amongst pixels within each box, or properties such as average speed (e.g., from particle flow).

Scan types
Every box counting algorithm has a scanning plan that describes how the data will be gathered, in essence, how the box will be moved over the space containing the pattern. A variety of scanning strategies has been used in box counting algorithms, where a few basic approaches have been modified in order to address issues such as sampling, analysis methods, etc.





Fixed grid scans
The traditional approach is to scan in a non-overlapping regular grid or lattice pattern. To illustrate, Figure 2a shows the typical pattern used in software that calculates box counting dimensions from patterns extracted into binary digital images of contours such as the fractal contour illustrated in Figure 1 or the classic example of the coastline of Britain often used to explain the method of finding a box counting dimension. The strategy simulates repeatedly laying a square box as though it were part of a grid overlaid on the image, such that the box for each $$\epsilon$$ never overlaps where it has previously been (see Figure 4). This is done until the entire area of interest has been scanned using each $$\epsilon$$ and the relevant information has been recorded. When used to find a box counting dimension, the method is modified to find an optimal covering.

Sliding box scans
Another approach that has been used is a sliding box algorithm, in which each box is slid over the image overlapping the previous placement. Figure 2b illustrates the basic pattern of scanning using a sliding box. The fixed grid approach can be seen as a sliding box algorithm with the increments horizontally and vertically equal to $$\epsilon$$. Sliding box algorithms are often used for analyzing textures in lacunarity analysis and have also been applied to multifractal analysis.

Subsampling and local dimensions
Box counting may also be used to determine local variation as opposed to global measures describing an entire pattern. Local variation can be assessed after the data have been gathered and analyzed (e.g., some software colour codes areas according to the fractal dimension for each subsample), but a third approach to box counting is to move the box according to some feature related to the pixels of interest. In local connected dimension box counting algorithms, for instance, the box for each $$\epsilon$$ is centred on each pixel of interest, as illustrated in Figure 2c.

Methodological considerations
The implementation of any box counting algorithm has to specify certain details such as how to determine the actual values in $$\Epsilon$$, including the minimum and maximum sizes to use and the method of incrementing between sizes. Many such details reflect practical matters such as the size of a digital image but also technical issues related to the specific analysis that will be performed on the data. Another issue that has received considerable attention is how to approximate the so-called "optimal covering" for determining box counting dimensions and assessing multifractal scaling.

Edge effects
One known issue in this respect is deciding what constitutes the edge of the useful information in a digital image, as the limits employed in the box counting strategy can affect the data gathered.

Scaling box size
The algorithm has to specify the type of increment to use between box sizes (e.g., linear vs exponential), which can have a profound effect on the results of a scan.

Grid orientation
As Figure 4 illustrates, the overall positioning of the boxes also influences the results of a box count. One approach in this respect is to scan from multiple orientations and use averaged or optimized data.

To address various methodological considerations, some software is written so users can specify many such details, and some includes methods such as smoothing the data after the fact to be more amenable to the type of analysis being done.