Structural information theory

Structural information theory (SIT) is a theory about human perception and in particular about visual perceptual organization, which is a neuro-cognitive process. It has been applied to a wide range of research topics, mostly in visual form perception but also in, for instance, visual ergonomics, data visualization, and music perception.

SIT began as a quantitative model of visual pattern classification. Nowadays, it includes quantitative models of symmetry perception and amodal completion, and is theoretically sustained by a perceptually adequate formalization of visual regularity, a quantitative account of viewpoint dependencies, and a powerful form of neurocomputation. SIT has been argued to be the best defined and most successful extension of Gestalt ideas. It is the only Gestalt approach providing a formal calculus that generates plausible perceptual interpretations.

The simplicity principle
A simplest code is a code with minimum information load, that is, a code that enables a reconstruction of the stimulus using a minimum number of descriptive parameters. Such a code is obtained by capturing a maximum amount of visual regularity and yields a hierarchical organization of the stimulus in terms of wholes and parts.

The assumption that the visual system prefers simplest interpretations is called the simplicity principle. Historically, the simplicity principle is an information-theoretical translation of the Gestalt law of Prägnanz, which was inspired by the natural tendency of physical systems to settle into relatively stable states defined by a minimum of free-energy. Furthermore, just as the later-proposed minimum description length principle in algorithmic information theory (AIT), a.k.a. the theory of Kolmogorov complexity, it can be seen as a formalization of Occam's Razor, according to which the simplest interpretation of data is the best one.

Simplicity versus likelihood
Crucial to the latter finding is the distinction between, and integration of, viewpoint-independent and viewpoint-dependent factors in vision, as proposed in SIT's empirically successful model of amodal completion. In the Bayesian framework, these factors correspond to prior probabilities and conditional probabilities, respectively. In SIT's model, however, both factors are quantified in terms of complexities, that is, complexities of objects and of their spatial relationships, respectively.

Modeling principles
In SIT's formal coding model, candidate interpretations of a stimulus are represented by symbol strings, in which identical symbols refer to identical perceptual primitives (e.g., blobs or edges). Every substring of such a string represents a spatially contiguous part of an interpretation, so that the entire string can be read as a reconstruction recipe for the interpretation and, thereby, for the stimulus. These strings then are encoded (i.e., they are searched for visual regularities) to find the interpretation with the simplest code.

This encoding is performed by way of symbol manipulation, which, in psychology, has led to critical statements of the sort of "SIT assumes that the brain performs symbol manipulation". Such statements, however, fall in the same category as statements such as "physics assumes that nature applies formulas such as Einstein's E=mc2 or Newton's F=ma" and "DST models assume that dynamic systems apply differential equations".

Visual regularity
To obtain simplest codes, SIT applies coding rules that capture the kinds of regularity called iteration, symmetry, and alternation. These have been shown to be the only regularities that satisfy the formal criteria of (a) being holographic regularities that (b) allow for hierarchically transparent codes.

A crucial difference with respect to the traditionally considered transformational formalization of visual regularity is that, holographically, mirror symmetry is composed of many relationships between symmetry pairs rather than one relationship between symmetry halves. Whereas the transformational characterization may be suited better for object recognition, the holographic characterization seems more consistent with the buildup of mental representations in object perception.

The perceptual relevance of the criteria of holography and transparency has been verified in the holographic approach to visual regularity. It also explains that the detectability of mirror symmetries and Glass pattens in the presence of noise follows a psychophysical law that improves on Weber's law.