Auditory spatial attention

Auditory spatial attention is a specific form of attention in which auditory perception is focused on a location in space.

Although the properties of visuospatial attention have been studied in detail, relatively little work has been done to elucidate the mechanisms of audiospatial attention. Spence and Driver note that while early researchers investigating auditory spatial attention failed to find the types of effects seen in other modalities such as vision, these null effects may reflect the adaptation of visual paradigms to the auditory domain, which has lower spatial acuity than vision.

Recent neuroimaging research has provided insight into the processes behind audiospatial attention, suggesting functional overlap with portions of the brain previously shown to be responsible for visual attention.

Behavioral evidence
Several studies have explored the properties of audiospatial attention using the behavioral tools of cognitive science, either in isolation or as part of a larger neuroimaging study.

Rhodes sought to identify whether audiospatial attention was represented analogically, that is, whether the mental representation of auditory space was arranged in the same fashion as physical space. If this is the case, then the time to move the focus of auditory attention should be related to the distance to be moved in physical space. Rhodes notes that previous work by Posner, among others, had not found behavioral differences in an auditory attention task that merely required stimulus detection, possibly because low-level auditory receptors are mapped tonotopically rather than spatially, as visual receptors are. For this reason, Rhodes utilized an auditory localization task, finding that the time to shift attention increased with greater angular separation between the attended location and the target, although this effect reached asymptote at locations more than 90° from the forward direction.

Spence and Driver, noting that previous findings of audiospatial attentional effects, including the aforementioned study by Rhodes, could be confounded with response priming, instead utilized several cuing paradigms, both exogenous and endogenous, over the course of 8 experiments. Both endogenous (informative) and exogenous (uninformative) cues improved performance in an auditory spatial localization task, consistent with the results previously found by Rhodes. However, only endogenous spatial cues improved performance on an auditory pitch discrimination task; exogenous spatial cues had no effect on the performance of this non-spatial pitch judgement. In light of these findings, Spence and Driver suggest that exogenous and endogenous audiospatial orienting may involve different mechanisms, with the colliculus possibly playing a role in both auditory and visual exogenous orienting, and the frontal and parietal cortex playing a similar part for endogenous orienting. The lack of orienting effects on pitch judgements under exogenous spatial cuing may be due to the connectivity of these structures: Spence and Driver note that while frontal and parietal cortical areas receive inputs from cells coding both pitch and sound location, the colliculus is only thought to be sensitive to pitches above 10 kHz, well above the ~350 Hz tones used in their study.

Diaconescu et al. found that participants in their cross-modal cuing experiment responded faster to the spatial properties of target stimuli (the location of a visual or auditory stimulus) than to the non-spatial properties (shape or pitch). While this occurred for both visual and auditory targets, the effect was greater for targets in the visual domain, which the researchers suggest may reflect a subordination of the audiospatial to the visuospatial attentional system.

Neural basis
Neuroimaging tools of modern cognitive neuroscience such as functional magnetic resonance imaging (fMRI) and event-related potential (ERP) techniques have provided further insight beyond behavioral research into the functional form of audiospatial attention. Current research suggests that auditory spatial attention overlaps functionally with many areas previously shown to be associated with visual attention.

Although there is substantial neuroimaging research on attention in the visual domain, comparatively few studies have investigated attentional processes in the auditory domain. For audition research utilizing fMRI, extra steps must be taken to reduce or avoid interference from scanner noise with the auditory stimuli. Often, a sparse temporal sampling scanning pattern is used to reduce the impact of scanner noise, taking advantage of the haemodynamic delay and scanning only after stimuli have been presented.
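The timing logic of sparse temporal sampling can be illustrated with a minimal sketch. The function name and every timing value below (repetition time, acquisition time, stimulus duration, delay before scanning) are hypothetical placeholders chosen for illustration and are not taken from any of the studies discussed here.

```python
from dataclasses import dataclass


@dataclass
class SparseTrial:
    stimulus_onset: float      # seconds from the start of the run
    stimulus_offset: float
    acquisition_onset: float   # scanner begins acquiring (and making noise)
    acquisition_offset: float


def sparse_schedule(n_trials, tr=10.0, acquisition_time=2.0,
                    stimulus_duration=1.0, stimulus_to_scan_delay=4.0):
    """Build an illustrative sparse-sampling timeline.

    All default values are hypothetical. Each repetition of length `tr`
    ends with a single volume acquisition. The stimulus is placed in the
    preceding silent gap so that it ends `stimulus_to_scan_delay` seconds
    before acquisition begins, letting the delayed haemodynamic response
    be sampled after the sound has finished.
    """
    silent_gap = tr - acquisition_time
    if stimulus_duration + stimulus_to_scan_delay > silent_gap:
        raise ValueError("stimulus and delay do not fit in the silent gap")

    trials = []
    for i in range(n_trials):
        repetition_start = i * tr
        acq_on = repetition_start + silent_gap
        stim_off = acq_on - stimulus_to_scan_delay
        stim_on = stim_off - stimulus_duration
        trials.append(
            SparseTrial(stim_on, stim_off, acq_on, acq_on + acquisition_time)
        )
    return trials


if __name__ == "__main__":
    for trial in sparse_schedule(3):
        print(trial)
```

In this sketch each stimulus falls entirely within a silent period, so scanner noise never overlaps the sound; the cost is that only one volume is collected per repetition.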

What and where pathways in audition
Analogous to the 'what' (ventral) and 'where' (dorsal) streams of visual processing (see the two-streams hypothesis), there is evidence to suggest that audition is also split into identification and localization pathways.

Alain et al. utilized a delayed match-to-sample task in which participants held an initial tone in memory, comparing it to a second tone presented 500 ms later. Although the set of stimulus tones remained the same throughout the experiment, task blocks alternated between pitch and spatial comparisons. For example, during pitch comparison blocks, participants were instructed to report whether the second stimulus was higher, lower, or equal in pitch relative to the first, regardless of the two tones' spatial locations. Conversely, during spatial comparison blocks, participants were instructed to report whether the second tone was leftward, rightward, or equal in space relative to the first tone, regardless of tone pitch. This task was used in two experiments, one utilizing fMRI and one ERP, to gauge the spatial and temporal properties, respectively, of 'what' and 'where' auditory processing. Comparing the pitch and spatial judgements revealed increased activation in primary auditory cortices and right inferior frontal gyrus during the pitch task, and increased activation in bilateral posterior temporal areas and inferior and superior parietal cortices during the spatial task. The ERP results revealed divergence between the pitch and spatial tasks at 300-500 ms following the onset of the first stimulus, in the form of increased positivity in inferior frontotemporal regions during the pitch task, and increased positivity over centroparietal regions during the spatial task. This suggested that, similar to what is thought to occur in vision, elements of an auditory scene are split into separate 'what' (ventral) and 'where' (dorsal) pathways. However, it was unclear whether this similarity is the result of a supramodal division of feature and spatial processes.

Further evidence as to the modality specificity of the 'what' and 'where' pathways has been provided in a recent study by Diaconescu et al., who suggest that while 'what' processes have discrete pathways for vision and audition, the 'where' pathway may be supra-modal, shared by both modalities. Participants were asked in randomly alternating trials to respond to either the feature or spatial elements of stimuli, which varied between the auditory and visual domain in set blocks. Between two experiments, the modality of the cue was also varied; the first experiment contained auditory cues as to which element (feature or spatial) of the stimuli to respond to, while the second experiment utilized visual cues. During the period between cue and target, when participants were presumably attending to the cued element of the upcoming stimulus, both auditory and visual spatial attention conditions elicited greater positivity in source space from a centro-medial location at 600-1200 ms following cue onset, which the authors of the study propose may be the result of a supra-modal pathway for spatial information. Conversely, source space activity for feature attention was not consistent between modalities, with auditory feature attention associated with greater positivity at the right auditory radial dipole around 300-600 ms, and visual feature attention associated with greater negativity at the left-visual central-inferior dipole at 700-1050 ms, suggested as evidence for separate feature or 'what' pathways for vision and audition.

Audiospatial attentional network
Several studies investigating the functional structures of audiospatial attention have revealed functional areas which overlap with visuospatial attention, suggesting the existence of a supra-modal spatial attentional network.

Smith et al. contrasted the cortical activation during audiospatial attention with both visuospatial attention and auditory feature attention in two separate experiments.

The first experiment used an endogenous, or top-down, orthogonal cuing paradigm to investigate the cortical regions involved in audiospatial attention vs. visuospatial attention. The term orthogonal refers to the relationship between the information provided by the cue and the required response: participants were asked to make a spatial up/down elevation judgement to stimuli that could appear either centrally or laterally, to the left or right side. While cues provided information about the lateralization of the upcoming target, they contained no information as to the correct elevation judgement. Such a procedure was used to dissociate the functional effects of spatial attention from those of motor-response priming. The same task was used for visual and auditory targets, in alternating blocks. Crucially, the primary focus of analysis was on “catch trials,” in which the cued target was not presented. This allowed for investigation of functional activation related to attending to a specific location, free of contamination from target-stimulus related activity. In the auditory domain, comparing activation following peripheral right and left cues to central cues revealed significant activation in the posterior parietal cortex (PPC), frontal eye fields (FEF), and supplementary motor area (SMA). These areas overlap those that were significantly active during the visuospatial attention condition; a comparison of the activation during the auditory and visual spatial attention conditions found no significant difference between the two.
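The structure of an orthogonal cuing design with catch trials can be sketched as a trial list. This is an illustrative reconstruction only: the function name, the trial counts, and the assumption that every cue validly indicates the target's lateral position are hypothetical and are not drawn from Smith et al.

```python
import itertools
import random

CUES = ["left", "right", "central"]   # the cue signals target lateralization
ELEVATIONS = ["up", "down"]           # the judgement participants report


def build_trials(targets_per_cell=4, catches_per_cue=2, seed=0):
    """Return a shuffled list of trials for one block.

    Target trials fully cross cue side with elevation, so the cue carries
    no information about the correct up/down response (the "orthogonal"
    property). Catch trials present a cue but no target, which is marked
    here by elevation=None. All counts are hypothetical.
    """
    rng = random.Random(seed)
    trials = []
    for cue, elevation in itertools.product(CUES, ELEVATIONS):
        trials += [{"cue": cue, "elevation": elevation}
                   for _ in range(targets_per_cell)]
    for cue in CUES:
        trials += [{"cue": cue, "elevation": None}
                   for _ in range(catches_per_cue)]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    block = build_trials()
    for cue in CUES:
        ups = sum(t["cue"] == cue and t["elevation"] == "up" for t in block)
        downs = sum(t["cue"] == cue and t["elevation"] == "down" for t in block)
        print(cue, "up:", ups, "down:", downs)
```

Because each cue is followed equally often by "up" and "down" targets, knowing the cue cannot prime either response, which is the property the paradigm relies on to separate spatial attention from motor-response priming.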

During the second experiment, participants were presented with a pair of distinguishable auditory stimuli. Although the pair of stimuli was identical throughout the experiment, different blocks of the task required participants to respond to either the temporal order (which sound came first) or spatial location (which sound was farther from midline) of the stimuli. Participants were instructed which feature to attend to at the onset of each block, allowing for comparisons of activation due to auditory spatial attention and auditory non-spatial attention to the same set of stimuli. The comparison of the spatial location task to the temporal order task showed greater activation in areas previously found to be associated with attention in the visual domain, including the bilateral temporoparietal junction, bilateral superior frontal areas near FEF, bilateral intraparietal sulcus, and bilateral occipitotemporal junction, suggesting an attentional network that operates supra-modally across vision and audition.

Executive control
Wu et al. used fMRI to investigate the anatomical locus of the executive control of endogenous audiospatial attention. Participants received auditory cues to attend to either their left or right, in anticipation of an auditory stimulus. A third cue, instructing participants to attend to neither left nor right, served as a non-spatial control condition. Comparing activation in the spatial vs. non-spatial attentional conditions showed increased activation in several areas implicated in the executive control of visual attention, including the prefrontal cortex, FEF, anterior cingulate cortex (ACC), and superior parietal lobe, again supporting the notion of these structures as supra-modal attentional areas. The spatial attention vs. control comparison further revealed increased activity in auditory cortex contralateral to the side of audiospatial attention, which may reflect top-down biasing of early sensory areas, as has been seen with visual attention.

Wu et al. additionally observed that audiospatial attention was associated with increased activation in areas thought to process visual information, namely the cuneus and lingual gyrus, despite participants having completed the task with their eyes closed. As this activity was not contralateral to the locus of attention, the authors contend that the effect is likely not spatially specific, suggesting that it may instead reflect a general spread of attentional activity, possibly playing a role in multimodal sensory integration.

Future directions
Although less research exists on the functional underpinnings of audiospatial attention than on those of visuospatial attention, current evidence suggests that many of the anatomical structures implicated in visuospatial attention function supramodally and are involved in audiospatial attention as well. The cognitive consequences of this connection, which may relate to multimodal processing, have yet to be fully explored.