Motion perception

Motion perception is the process of inferring the speed and direction of elements in a scene based on visual, vestibular and proprioceptive inputs. Although this process appears straightforward to most observers, it has proven to be a difficult problem from a computational perspective, and difficult to explain in terms of neural processing.

Motion perception is studied by many disciplines, including psychology (i.e. visual perception), neurology, neurophysiology, engineering, and computer science.

Neuropsychology
The inability to perceive motion is called akinetopsia, and it may be caused by a lesion to cortical area V5 in the extrastriate cortex. Neuropsychological studies of a patient who could not see motion, and instead saw the world as a series of static "frames", suggested that visual area V5 in humans is homologous to the motion processing area V5/MT in other primates.

First-order motion perception


Two or more stimuli that are switched on and off in alternation can produce two different motion percepts. The first, demonstrated in the figure to the right, is "beta movement", which is the basis for the technology of electronic news ticker displays. At faster alternation rates, however, and if the distance between the stimuli is just right, an illusory "object" of the same colour as the background is seen moving between the two stimuli and alternately occluding them. This is called the phi phenomenon and is sometimes described as an example of "pure" motion detection that, unlike beta movement, is uncontaminated by form cues. This description is somewhat paradoxical, however, as it is not possible to create such motion in the absence of figural percepts.

The phi phenomenon has been referred to as "first-order" motion perception. Werner E. Reichardt and Bernhard Hassenstein modelled it in terms of relatively simple "motion sensors" in the visual system that have evolved to detect a change in luminance at one point on the retina and correlate it with a change in luminance at a neighbouring point on the retina after a short delay. Sensors proposed to work this way have been referred to as Hassenstein-Reichardt detectors (after the scientists who first modelled them), motion-energy sensors, or Elaborated Reichardt Detectors. These sensors are described as detecting motion by spatio-temporal correlation and are considered by some to be plausible models of how the visual system may detect motion. (Again, however, the notion of a "pure motion" detector suffers from the problem that there is no "pure motion" stimulus, i.e. a stimulus lacking perceived figure/ground properties.) There is still considerable debate regarding the accuracy of the model and the exact nature of this proposed process. It is not clear how the model distinguishes between movements of the eyes and movements of objects in the visual field, both of which produce changes in luminance at points on the retina.

Second-order motion perception
Second-order motion is motion in which the moving contour is defined by contrast, texture, flicker or some other quality that does not result in an increase in luminance or motion energy in the Fourier spectrum of the stimulus. There is much evidence to suggest that early processing of first- and second-order motion is carried out by separate pathways. Second-order mechanisms have poorer temporal resolution and are low-pass in terms of the range of spatial frequencies to which they respond. (The notion that neural responses are attuned to frequency components of stimulation suffers from the lack of a functional rationale and has been generally criticized by G. Westheimer (2001) in an article called "The Fourier Theory of Vision.") Second-order motion produces a weaker motion aftereffect unless tested with dynamically flickering stimuli.

The aperture problem
The motion direction of a contour is ambiguous, because the motion component parallel to the contour cannot be inferred from the visual input. This means that a variety of contours of different orientations moving at different speeds can cause identical responses in a motion-sensitive neuron in the visual system.
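The ambiguity can be illustrated numerically. The following is a minimal sketch, not a model of any actual neuron: it assumes a straight contour described only by its orientation, and uses a hypothetical helper, normal_speed, that returns the velocity component perpendicular to the contour, which is the only component available to a local detector. The sign convention for the perpendicular direction is an arbitrary choice made for this illustration.

```python
import math

def normal_speed(vx, vy, contour_angle_deg):
    """Velocity component perpendicular to a straight contour.

    contour_angle_deg is the contour's orientation (0 = horizontal).
    Motion along the contour leaves the local image unchanged, so only
    this perpendicular component can be measured locally.
    """
    theta = math.radians(contour_angle_deg)
    # Unit vector perpendicular to the contour (sign chosen arbitrarily).
    nx, ny = math.sin(theta), -math.cos(theta)
    return vx * nx + vy * ny

# Three different contour motions (illustrative values).
a = normal_speed(1.0, 0.0, 90.0)            # vertical contour, moving right
b = normal_speed(1.0, 5.0, 90.0)            # same contour, also sliding upward
c = normal_speed(math.sqrt(2), 0.0, 45.0)   # oblique contour, moving right faster
```

All three cases give a perpendicular speed of approximately 1, so a detector that only has access to this component cannot tell them apart.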

Motion integration
Some have speculated that, having extracted the hypothesized motion signals (first- or second-order) from the retinal image, the visual system must integrate those individual local motion signals at various parts of the visual field into a 2-dimensional or global representation of moving objects and surfaces. (It is not clear how this 2D representation is then converted into the perceived 3D percept.) Further processing is required to detect coherent motion or "global motion" present in a scene.

The ability of a subject to detect coherent motion is commonly tested using motion coherence discrimination tasks. These tasks use dynamic random-dot patterns (also called random dot kinematograms) that consist of 'signal' dots moving in one direction and 'noise' dots moving in random directions. Sensitivity to motion coherence is assessed by measuring the ratio of 'signal' to 'noise' dots required to determine the coherent motion direction. The required ratio is called the motion coherence threshold.
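A single frame update of such a stimulus might be sketched as follows. This is an illustration under stated assumptions rather than a description of any particular experimental protocol: the function name step_dots, the choice to express coherence as the proportion of signal dots, and the choice to reselect the signal dots on every frame are all assumptions made for the example.

```python
import math
import random

def step_dots(dots, coherence, signal_angle_deg, step=1.0, rng=random):
    """Move every dot by one frame and return the new positions.

    A proportion `coherence` of the dots (picked afresh on each call) moves
    in the signal direction; every other dot moves in its own random
    direction. All dots travel the same distance `step`.
    """
    n_signal = round(coherence * len(dots))
    signal_ids = set(rng.sample(range(len(dots)), n_signal))
    moved = []
    for i, (x, y) in enumerate(dots):
        if i in signal_ids:
            angle = math.radians(signal_angle_deg)   # 'signal' dot
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)  # 'noise' dot
        moved.append((x + step * math.cos(angle), y + step * math.sin(angle)))
    return moved

rng = random.Random(0)  # fixed seed so the example is repeatable
dots = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]
moved = step_dots(dots, coherence=0.5, signal_angle_deg=0.0, rng=rng)
```

In a threshold measurement, the coherence value would be lowered or raised across trials until the observer can just report the signal direction reliably.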

Motion in depth
As in other aspects of vision, the observer's visual input is generally insufficient to determine the true nature of stimulus sources, in this case their velocity in the real world. In monocular vision for example, the visual input will be a 2D projection of a 3D scene. The motion cues present in the 2D projection will by default be insufficient to reconstruct the motion present in the 3D scene. Put differently, many 3D scenes will be compatible with a single 2D projection. The problem of motion estimation generalizes to binocular vision when we consider occlusion or motion perception at relatively large distances, where binocular disparity is a poor cue to depth. This fundamental difficulty is referred to as the inverse problem.
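The many-to-one nature of the projection can be made concrete with a small calculation. The sketch below assumes an idealized pinhole projection, in which a point at lateral position x and depth z appears at image position f·x/z; the function name image_velocity and the numerical values are illustrative assumptions only.

```python
def image_velocity(x, z, vx, vz, focal_length=1.0):
    """Horizontal image velocity of a moving point under pinhole projection.

    The image position is u = f * x / z, so differentiating with respect
    to time gives du/dt = f * (vx * z - x * vz) / z**2.
    """
    return focal_length * (vx * z - x * vz) / z ** 2

# Three different 3D motions (position x, depth z, velocities vx and vz).
near_slow = image_velocity(x=0.0, z=1.0, vx=1.0, vz=0.0)
far_fast = image_velocity(x=0.0, z=2.0, vx=2.0, vz=0.0)
receding = image_velocity(x=1.0, z=2.0, vx=3.0, vz=2.0)
# All three evaluate to 1.0: the image motion alone cannot separate them.
```

A near object moving slowly, a far object moving quickly, and an object that is also receding in depth all produce the same image velocity in this idealization.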

Nonetheless, some humans do perceive motion in depth. There are indications that the brain uses various cues, in particular temporal changes in disparity as well as monocular velocity ratios, to produce a sensation of motion in depth. Two different binocular cues for the perception of motion in depth are hypothesized: inter-ocular velocity difference (IOVD) and changing disparity (CD) over time. Motion in depth based on inter-ocular velocity differences can be tested using dedicated binocularly uncorrelated random-dot kinematograms. Study results indicate that the processing of these two binocular cues, IOVD and CD, may use fundamentally different low-level stimulus features, which may be processed jointly at later stages. Additionally, as a monocular cue, the changing size of retinal images contributes to the detection of motion in depth.

Perceptual learning of motion
Detection and discrimination of motion can be improved by training, with long-term results. Participants trained to detect the movements of dots on a screen in only one direction become particularly good at detecting small movements in the directions around the trained direction. This improvement was still present 10 weeks later. However, perceptual learning is highly specific. For example, the participants show no improvement when tested around other motion directions, or with other sorts of stimuli.

Cognitive map
A cognitive map is a type of mental representation which serves an individual to acquire, code, store, recall, and decode information about the relative locations and attributes of phenomena in their spatial environment. Place cells work with other types of neurons in the hippocampus and surrounding regions of the brain to perform this kind of spatial processing, but the ways in which they function within the hippocampus are still being researched.

Many species of mammals can keep track of spatial location even in the absence of visual, auditory, olfactory, or tactile cues, by integrating their movements; the ability to do this is referred to in the literature as path integration. A number of theoretical models have explored mechanisms by which path integration could be performed by neural networks. In most models, such as those of Samsonovich and McNaughton (1997) or Burak and Fiete (2009), the principal ingredients are (1) an internal representation of position, (2) internal representations of the speed and direction of movement, and (3) a mechanism for shifting the encoded position by the right amount when the animal moves. Because cells in the medial entorhinal cortex (MEC) encode information about position (grid cells) and movement (head direction cells and conjunctive position-by-direction cells), this area is currently viewed as the most promising candidate for the place in the brain where path integration occurs.
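The three ingredients listed above can be illustrated with simple arithmetic. The sketch below is not an implementation of the neural network models of Samsonovich and McNaughton or of Burak and Fiete; it only shows, under the assumption of noise-free self-motion signals, how a position estimate can be updated from speed and heading alone. The function name integrate_path and the tuple layout are hypothetical.

```python
import math

def integrate_path(start, movements):
    """Update a position estimate from self-motion signals only.

    start:     (x, y), the internal representation of position.
    movements: iterable of (speed, heading_deg, duration) tuples, the
               internal representations of speed and direction.
    Each step shifts the stored position by speed * duration along the
    current heading; no external landmarks are consulted.
    """
    x, y = start
    for speed, heading_deg, duration in movements:
        heading = math.radians(heading_deg)
        x += speed * duration * math.cos(heading)
        y += speed * duration * math.sin(heading)
    return x, y

# Walking the four sides of a unit square returns the estimate to the start.
square = [(1.0, 0.0, 1.0), (1.0, 90.0, 1.0), (1.0, 180.0, 1.0), (1.0, 270.0, 1.0)]
end_x, end_y = integrate_path((0.0, 0.0), square)
```

Because each update adds to the previous estimate, any error in the speed or heading signals would accumulate over time in a scheme of this kind.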

Neurophysiology
Motion sensing using vision is crucial for detecting a potential mate, prey, or predator, and thus it is found in both vertebrate and invertebrate vision across a wide variety of species, although not in all species. In vertebrates, the process takes place in the retina, more specifically in retinal ganglion cells, which are neurons that receive visual information from bipolar cells and amacrine cells and send their output to higher regions of the brain, including the thalamus, hypothalamus, and mesencephalon.

The study of directionally selective units began with the discovery of such cells in the cerebral cortex of cats by David Hubel and Torsten Wiesel in 1959. Following the initial report, Horace B. Barlow and William R. Levick set out in 1965 to understand the mechanism of directionally selective cells. Their in-depth experiments in the rabbit retina expanded the anatomical and physiological understanding of the vertebrate visual system and ignited interest in the field. Numerous studies that followed have, for the most part, unveiled the mechanism of motion sensing in vision. Alexander Borst and Thomas Euler's 2011 review paper, "Seeing Things in Motion: Models, Circuits and Mechanisms", discusses important findings from the early discoveries to recent work on the subject and summarizes the current state of knowledge.

Direction selective (DS) cells
Direction selective (DS) cells in the retina are defined as neurons that respond differentially to the direction of a visual stimulus. According to Barlow and Levick (1965), the term is used to describe a group of neurons that "gives a vigorous discharge of impulses when a stimulus object is moved through its receptive field in one direction." The direction to which a set of neurons responds most strongly is its "preferred direction". In contrast, the neurons do not respond at all to motion in the opposite direction, the "null direction". The preferred direction does not depend on the stimulus: regardless of the stimulus' size, shape, or color, the neurons respond when it is moving in their preferred direction and do not respond when it is moving in the null direction. There are three known types of DS cells in the mouse retina: ON/OFF DS ganglion cells, ON DS ganglion cells, and OFF DS ganglion cells. Each has a distinctive physiology and anatomy. Analogous directionally selective cells are not thought to exist in the primate retina.

ON/OFF DS ganglion cells
ON/OFF DS ganglion cells act as local motion detectors. They fire at the onset and offset of a stimulus (a light source). If a stimulus is moving in the cell's preferred direction, the cell fires at both the leading and the trailing edge. Their firing pattern is time-dependent and is consistent with the Hassenstein-Reichardt model, which detects spatiotemporal correlation between two adjacent points. A detailed explanation of the Hassenstein-Reichardt model is provided later in the section. The dendrites of ON/OFF cells extend to two sublaminae of the inner plexiform layer and make synapses with bipolar and amacrine cells. The cells have four subtypes, each with its own preferred direction.

ON DS ganglion cells
Unlike ON/OFF DS ganglion cells that respond both to the leading and the trailing edge of a stimulus, ON DS ganglion cells are responsive only to a leading edge. The dendrites of ON DS ganglion cells are monostratified and extend into the inner sublamina of the inner plexiform layer. They have three subtypes with different directional preferences.

OFF DS ganglion cells
OFF DS ganglion cells act as centripetal motion detectors, and they respond only to the trailing edge of a stimulus. They are tuned to upward motion of a stimulus. Their dendrites are asymmetrical and arborize in the direction of their preference.

DS cells in insects
The first DS cells in invertebrates were found in flies, in a brain structure called the lobula plate. The lobula plate is one of the three stacks of neuropils in the fly's optic lobe. The "tangential cells" of the lobula plate comprise roughly 50 neurons, which arborize extensively in the neuropil. The tangential cells are known to be directionally selective, with distinctive directional preferences. One group is the horizontally sensitive (HS) cells, such as the H1 neuron, which depolarize most strongly in response to a stimulus moving in a horizontal direction (the preferred direction) and hyperpolarize when the direction of motion is opposite (the null direction). Vertically sensitive (VS) cells are another group of cells, which are most sensitive to vertical motion. They depolarize when a stimulus is moving downward and hyperpolarize when it is moving upward. Both HS and VS cells respond with a fixed preferred direction and null direction regardless of the color or contrast of the background or the stimulus.

The Hassenstein-Reichardt model
Motion detection in vision is thought to be based on the Hassenstein-Reichardt detector model. The model detects correlation between two adjacent points. It consists of two symmetrical subunits. Each subunit has a receptor that can be stimulated by an input (light, in the case of the visual system). In each subunit, when an input is received, a signal is sent to the other subunit. At the same time, the signal is delayed within the subunit by a temporal filter and then multiplied by the signal received from the other subunit. Thus, within each subunit, two brightness values are multiplied: one received from its own receptor with a time delay, and the other received from the adjacent receptor. The products from the two subunits are then subtracted to produce an output. The preferred direction is determined by the sign of the difference: the direction that produces a positive output is the preferred direction.
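The delay, multiply and subtract scheme described above can be sketched in a few lines. This is a simplified illustration, not a fitted model: it assumes discrete time samples, replaces the temporal filter with a pure delay of a fixed number of samples, and sums the output over time. The function name reichardt_output and the test signals are hypothetical.

```python
def reichardt_output(left, right, delay):
    """Time-summed output of a two-receptor correlation detector.

    left, right: brightness samples from two adjacent receptors.
    delay:       number of samples by which each subunit delays its own
                 receptor signal (a pure delay stands in for the
                 temporal filter of the model).
    """
    total = 0.0
    for t in range(delay, len(left)):
        # Subunit 1: delayed left signal times the current right signal.
        product_1 = left[t - delay] * right[t]
        # Subunit 2: the mirror image of subunit 1.
        product_2 = right[t - delay] * left[t]
        # The two products are subtracted to give the detector output.
        total += product_1 - product_2
    return total

# A brief flash that reaches the left receptor one sample before the right.
left_first = reichardt_output([0, 1, 0, 0, 0], [0, 0, 1, 0, 0], delay=1)   # 1.0
# The same flash travelling in the opposite direction.
right_first = reichardt_output([0, 0, 1, 0, 0], [0, 1, 0, 0, 0], delay=1)  # -1.0
# A flash that reaches both receptors at once (no motion).
together = reichardt_output([0, 1, 0, 0, 0], [0, 1, 0, 0, 0], delay=1)     # 0.0
```

With these inputs, left-to-right motion gives a positive output and is therefore the preferred direction of this particular detector, the opposite motion gives a negative output, and a simultaneous flash gives zero because the two subunits cancel.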

To test whether the Hassenstein-Reichardt model accurately describes directional selectivity, a study used optical recordings of free cytosolic calcium levels after loading a fluorescent indicator dye into fly tangential cells. The fly was presented with uniformly moving gratings while the calcium concentration in the dendritic tips of the tangential cells was measured. The tangential cells showed modulations that matched the temporal frequency of the gratings, and the velocity of the moving gratings at which the neurons responded most strongly depended closely on the pattern wavelength. This confirmed the accuracy of the model at both the cellular and the behavioral level.

Although the details of the Hassenstein-Reichardt model have not been confirmed at an anatomical and physiological level, the site of subtraction in the model is now being localized to the tangential cells. When depolarizing current was injected into a tangential cell during presentation of a visual stimulus, the response to the preferred direction of motion decreased and the response to the null direction increased. The opposite was observed with hyperpolarizing current. The T4 and T5 cells, which have been proposed as strong candidates for providing input to the tangential cells, have four subtypes, each of which projects into one of the four strata of the lobula plate; the strata differ in their preferred orientation.

DS cells in vertebrates
One of the early works on DS cells in vertebrates was done on the rabbit retina by H. Barlow and W. Levick in 1965. Their experimental methods included variations of the slit experiment and recording of action potentials in the rabbit retina. In the basic set-up of the slit experiment, they presented a moving black-white grating to a rabbit through a slit of various widths and recorded the action potentials in the retina. This early study had a large impact on the study of DS cells by laying the foundation for later studies. The study showed that DS ganglion cells derive their property from the sequence-discriminating activity of subunits, and that this activity may be the result of an inhibitory mechanism in response to motion of the image in the null direction. It also showed that the DS property of retinal ganglion cells is distributed over the entire receptive field and is not limited to specific zones. Direction selectivity was retained for two adjacent points in the receptive field separated by as little as 1/4°, but decreased with larger separations. They used this to support their hypothesis that discrimination of sequences gives rise to direction selectivity, because normal movement would activate adjacent points in succession.

Molecular identity and structure of DS cells in mice
ON/OFF DS ganglion cells can be divided into four subtypes that differ in their directional preference: ventral, dorsal, nasal, or temporal. The cells of different subtypes also differ in their dendritic structure and synaptic targets in the brain. The neurons identified as preferring ventral motion were also found to have dendritic projections in the ventral direction, and the neurons that prefer nasal motion had asymmetric dendritic extensions in the nasal direction. Thus, a strong association between structural and functional asymmetry was observed in the ventral and nasal directions. Given the distinct properties and preference of each subtype, it was expected that they could be selectively labeled by molecular markers. The neurons that were preferentially responsive to vertical motion were indeed shown to be selectively labeled by a specific molecular marker. However, molecular markers for the other three subtypes have not yet been found.

Neural mechanism: starburst amacrine cells
The direction selective (DS) ganglion cells receive inputs from bipolar cells and starburst amacrine cells. The DS ganglion cells respond to their preferred direction with a large excitatory postsynaptic potential followed by a small inhibitory response. In contrast, they respond to their null direction with a simultaneous small excitatory postsynaptic potential and a large inhibitory postsynaptic potential. Starburst amacrine cells have been viewed as a strong candidate for the source of direction selectivity in ganglion cells because they can release both GABA and ACh. Their dendrites branch out radially from the soma, and there is significant dendritic overlap. Optical measurements of Ca2+ concentration showed that they respond strongly to centrifugal motion (outward motion from the soma to the dendrites), while they do not respond well to centripetal motion (inward motion from the dendritic tips to the soma). When the starburst cells were ablated with toxins, direction selectivity was eliminated. Moreover, their release of neurotransmitters, which depends on calcium ions, itself reflects direction selectivity, which may be attributed to the synaptic pattern. The branching pattern is organized such that certain presynaptic inputs have more influence on a given dendrite than others, creating a polarity in excitation and inhibition. Further evidence suggests that starburst cells release the inhibitory neurotransmitter GABA onto each other in a delayed and prolonged manner. This accounts for the temporal property of inhibition.

In addition to spatial offset due to GABAergic synapses, the important role of chloride transporters has started to be discussed. The popular hypothesis is that starburst amacrine cells differentially express chloride transporters along the dendrites. Given this assumption, some areas along the dendrite will have a positive chloride-ion equilibrium potential relative to the resting potential while others have a negative equilibrium potential. This means that GABA at one area will be depolarizing and at another area hyperpolarizing, accounting for the spatial offset present between excitation and inhibition.

Research published in March 2011 relying on serial block-face electron microscopy (SBEM) has led to identification of the circuitry that influences directional selectivity. This technique provides detailed images of calcium flow and of the anatomy of the dendrites of both starburst amacrine cells (SACs) and DS ganglion cells. By comparing the preferred directions of ganglion cells with their synapses on SACs, Briggman et al. provide evidence for a mechanism primarily based on inhibitory signals from SACs. Based on an oversampled SBEM study of one sampled retina, they propose that retinal ganglion cells may receive asymmetrical inhibitory inputs directly from starburst amacrine cells, and that computation of directional selectivity therefore also occurs postsynaptically. Such postsynaptic models are unparsimonious: if any given starburst amacrine cell conveys motion information to retinal ganglion cells, then any postsynaptic computation of 'local' direction selectivity by retinal ganglion cells is redundant and dysfunctional. An acetylcholine (ACh) transmission model of directionally selective starburst amacrine cells provides a robust topological underpinning of motion sensing in the retina.

Labs specialising in motion research

 * Visual Neuroscience, University of Nottingham
 * McGill Vision Research, McGill University
 * Purves Lab, Duke University
 * Center for Vision Research, York University
 * iLab, University of Southern California
 * Vision and Cognition, University of Tübingen