Musical syntax

When analysing the regularities and structure of music, as well as the processing of music in the brain, certain findings raise the question of whether music is based on a syntax that can be compared with linguistic syntax. To approach this question it is necessary to look at the basic aspects of syntax in language, as language unquestionably presents a complex syntactic system. If music has a comparable syntax, noteworthy equivalents to the basic aspects of linguistic syntax have to be found in musical structure. By implication, comparing the processing of music with that of language could also give information about the structure of music.

Comparison to linguistic syntax
Syntax in general can refer either to the study of the principles and rules needed for the construction of a language, or to those principles and rules themselves as they apply to a particular language.

Linguistic syntax – three principles
Linguistic syntax is especially marked by its structural richness, which becomes apparent in its multilayered organization as well as in the strong relationship between syntax and meaning. That is, there are specific syntactic principles that define how language is formed out of different subunits: words out of morphemes, phrases out of words and sentences out of phrases. Furthermore, linguistic syntax is characterized by the fact that a word can take on abstract grammatical functions that are defined less by properties of the word itself and more by context and structural relations. For example, every noun can be used as a subject, object or indirect object, but without a sentence as the normal context of a word, no statement about its grammatical function can be made. Finally, linguistic syntax is marked by abstractness: only conventional structural relations, and not psychoacoustic relationships, form the basis of linguistic syntax.

Musical syntax
If one wants to claim that music has a comparable syntax, these three aspects of richness in linguistic syntax, as well as its abstractness, should be found in music too. It should be noted that most studies dealing with musical syntax are confined to Western European tonal music. This article therefore also focuses only on Western tonal music.

Multilayered organization
Considering the multilayered organization of music, three levels of pitch organization can be distinguished.

Scale degrees
The lowest level consists of musical scales, which comprise seven tones or "scale degrees" per octave with an asymmetric pattern of intervals between them (for example the C-major scale). They are built from the 12 possible pitch classes per octave (A, A♯, B, C, C♯, D, D♯, E, F, F♯, G, G♯), and the different scale tones are not equal in their structural stability. Empirical evidence indicates that there is a hierarchy concerning the stability of the individual tones. The most stable one is called the "tonic" and embodies the tonal centre of the scale. The most unstable tones are the ones closest to the tonic (scale degrees 2 and 7), which are called the "supertonic" and the "leading tone". In studies, scale degrees 1, 3 and 5 have been judged as closely related. It has also been shown that implicit knowledge of scale structure has to be learned and developed in childhood and is not inborn.
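The construction of a scale from the 12 pitch classes can be sketched in a few lines of Python. This is only a minimal illustration: the step pattern 2-2-1-2-2-2-1 (in semitones) is the usual major-scale pattern, while the names `PITCH_CLASSES`, `MAJOR_STEPS` and `major_scale` are chosen here for illustration, and `#` stands in for ♯.

```python
# The 12 pitch classes per octave, in the order listed in the text.
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Semitone steps between successive degrees of a major scale.
# The mix of 2s and 1s is the asymmetric interval pattern.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_scale(tonic):
    """Return the seven scale degrees of the major scale built on `tonic`."""
    index = PITCH_CLASSES.index(tonic)
    scale = []
    for step in MAJOR_STEPS:
        scale.append(PITCH_CLASSES[index % 12])
        index += step
    return scale


print(major_scale("C"))
```

Called with "C", the function walks through C, D, E, F, G, A and B; the first element is the tonic, the second the supertonic and the seventh the leading tone.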

Chord structure
The next superordinate level of pitch organization is chord structure: three scale tones, each two scale steps apart, are played simultaneously and thereby combined into a chord. Building chords on the basis of a musical scale results in three different kinds of triads, namely "major" (e.g. C-E-G), "minor" (e.g. D-F-A) and "diminished" (e.g. B-D-F). This is due to the asymmetric intervals between the scale tones. Because of these asymmetric intervals, a distance of two scale steps can comprise either three or four semitones and is therefore either a minor third (three semitones) or a major third (four semitones). A major triad consists of a major third followed by a minor third and is built on scale degrees 1, 3 and 5 (or 4, 6 and 1, for the subdominant, and 5, 7 and 2, for the dominant, the other two major triads that can be formed from the major scale). A minor triad consists of a minor third followed by a major third and is built on scale degrees 2, 4 and 6 (or 3, 5 and 7, for the mediant, and 6, 1 and 3, for the submediant). Only on scale degree 7 does the triad consist of two minor thirds; it is therefore defined as a diminished triad. Chordal syntax mainly concerns four basic aspects. The first is that the lowest note of each triad functions as the root of the chord and therefore as its structurally most important pitch. The chord is named after this note, and the chord's harmonic label is based on it as well. The second aspect is that chord syntax provides norms for altering chords by adding tones. One example is the addition of a fourth tone to a triad, lying a seventh above the chord's root (e.g. in the C-major scale, adding F to the triad G-B-D yields a so-called "dominant seventh chord"). The third aspect concerns norms for the progression of chords in time and thus focuses on the relationship between chords.
The patterning of chords in a cadence, for example, indicates a movement from a V chord to a I chord. The fact that the I chord is perceived as a resting point in a musical phrase implies that the chords built on the notes of a scale are not equal in their stability but show the same differences in stability as the notes of the scale do. This describes the fourth basic aspect of chordal syntax. The tonic chord (the one built on the tonic, C-E-G in C-major, for example) is the most stable and central chord, followed by the dominant chord (built on the 5th scale degree) and the subdominant chord (built on the 4th scale degree).
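The way the three triad types fall out of the scale's asymmetric intervals can be checked with a short Python sketch. It stacks two scale steps twice on every scale degree and classifies the result by counting semitones. The helper names (`major_scale`, `triad_quality`, `triads_of_major_key`) are illustrative assumptions, and `#` stands in for ♯.

```python
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # semitone steps of a major scale

# (lower third, upper third) in semitones -> triad type
QUALITIES = {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished"}


def major_scale(tonic):
    index = PITCH_CLASSES.index(tonic)
    scale = []
    for step in MAJOR_STEPS:
        scale.append(PITCH_CLASSES[index % 12])
        index += step
    return scale


def triad_quality(root, third, fifth):
    """Classify a triad by the size of its two stacked thirds."""
    lower = (PITCH_CLASSES.index(third) - PITCH_CLASSES.index(root)) % 12
    upper = (PITCH_CLASSES.index(fifth) - PITCH_CLASSES.index(third)) % 12
    return QUALITIES[(lower, upper)]


def triads_of_major_key(tonic):
    """Build one triad on each scale degree by skipping every other scale tone."""
    scale = major_scale(tonic)
    result = []
    for degree in range(7):
        notes = [scale[(degree + offset) % 7] for offset in (0, 2, 4)]
        result.append((degree + 1, notes, triad_quality(*notes)))
    return result


for degree, notes, quality in triads_of_major_key("C"):
    print(degree, "-".join(notes), quality)
```

For C-major this gives major triads on degrees 1, 4 and 5, minor triads on degrees 2, 3 and 6, and a diminished triad on degree 7, matching the description above.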

Key structure
The highest level of pitch organization can be seen in key structure. In Western European tonal music a key is based on a scale with its associated chords and chord relations. Major or minor scales (which differ in the succession of intervals between the scale tones) can be built on each of the 12 pitch classes, so there are 24 possible keys in tonal music. Analysing key structure in the context of musical syntax means examining the relationships between keys in a piece of music. Usually a composition is not built on a single key; instead, so-called key "modulations" (in other words, changes of key) are used. In these modulations a certain recurring pattern can be perceived: switches from one key to another are often found between related keys. Three general principles for the relationship between keys can be postulated on the basis of perceptual experiments and of neural evidence for implicit knowledge of key structure. Taking the C-major key as an example, there are three closely related keys: G-major, A-minor and C-minor. C-major and G-major are keys whose 1st scale degrees are separated by a musical fifth (this pattern of relations is represented in the "circle of fifths" for major keys). A-minor and C-major share the same scale notes but have a different tonic (A-minor is the so-called relative minor key of C-major). And C-major and C-minor have the same tonic in their scales. All in all, it can be said that music, like human language, has a considerable multilayered organization.
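The three kinds of key relationship named above can be expressed as simple arithmetic on pitch classes. The following Python sketch is an illustration under stated assumptions: the minor scale is taken to be the natural minor (step pattern 2-1-2-2-1-2-2), the function names are hypothetical, and `#` stands in for ♯.

```python
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Semitone step patterns; the natural minor is an assumption of this sketch.
STEPS = {"major": [2, 2, 1, 2, 2, 2, 1], "minor": [2, 1, 2, 2, 1, 2, 2]}

# One major and one minor key per pitch class: 12 * 2 = 24 keys.
ALL_KEYS = [(tonic, mode) for tonic in PITCH_CLASSES for mode in STEPS]


def scale(tonic, mode):
    index = PITCH_CLASSES.index(tonic)
    notes = []
    for step in STEPS[mode]:
        notes.append(PITCH_CLASSES[index % 12])
        index += step
    return notes


def related_keys(major_tonic):
    """Return the three closely related keys of a major key."""
    index = PITCH_CLASSES.index(major_tonic)
    return {
        # tonic a fifth (7 semitones) higher, the neighbour on the circle of fifths
        "fifth above": (PITCH_CLASSES[(index + 7) % 12], "major"),
        # same scale notes, tonic 3 semitones lower (equivalently 9 higher)
        "relative minor": (PITCH_CLASSES[(index + 9) % 12], "minor"),
        # same tonic, minor mode
        "parallel minor": (major_tonic, "minor"),
    }


print(len(ALL_KEYS))
print(related_keys("C"))
```

For C-major the sketch returns G-major, A-minor and C-minor, and comparing `set(scale("A", "minor"))` with `set(scale("C", "major"))` confirms that the relative minor uses the same seven notes.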

Hierarchical structure
Considering the last two basic aspects of linguistic syntax, namely the considerable significance of the order of subunits for the meaning of a sentence and the fact that words take on abstract grammatical functions defined through context and structural relations, it is useful to analyse the hierarchical structure of music in order to find musical correlates.

Ornamentation
One aspect of the hierarchical structure of music is ornamentation. The word "ornamentation" points to the fact that some events in a musical context are less important than others for forming an idea of the general gist of a sequence. The decision on the importance of events involves not only harmonic considerations, but also rhythmic and motivic information. However, simply classifying events as either ornamental or structural would be too superficial. In fact, the most common hypothesis is that music is organized into structural levels, which can be pictured as the branches of a tree. A pitch that is structural at a higher level may be ornamental at a deeper level. This can be compared with the hierarchical syntactic structure of a sentence, in which some structural elements, such as the noun phrase and the verb phrase, are necessary to build a sentence, while at a deeper level these structural elements themselves contain additional or ornamental constituents.

Tension and resolution
In the search for other aspects of the hierarchical structure of music, there is a controversial discussion about whether the organization of tension and resolution in music can be described as a hierarchical structure or only as a purely sequential structure. According to Patel, research in this area has produced apparently contradictory evidence, and more research is needed to answer this question. The question of what kind of structure characterizes tension and resolution in music is very closely linked to the relationship between order and meaning in music. If tension and resolution are considered one possible kind of meaning in music, a hierarchical structure would imply that changing the order of musical elements would influence the meaning of the music.

Abstractness
The last aspect to examine is the abstractness of linguistic syntax and its correlate in music. There are two contradicting points of view. The first claims that the foundation for musical scales and for the existence of a tonal centre in music can be seen in the physical basis of the overtone series or in the psychoacoustic properties of chords in tonal music, respectively. More recently, however, strong evidence has supported the second point of view, namely that syntax reflects abstract cognitive relationships.

All in all, the consideration of syntax in music and language shows that music has a syntax comparable to linguistic syntax, especially with regard to its great complexity and hierarchical organization. Nevertheless, it has to be emphasized that musical syntax is not a simple variant of linguistic syntax, but a similarly complex system with its own substance. This means that it would be a mistake simply to search for musical analogies of linguistic syntactic entities such as nouns or verbs.

Neuronal processing of musical and linguistic syntax
Investigating the neuronal processing of musical syntax can serve two purposes. The first is to learn more about the processing of music in general, that is, which areas of the brain are involved and whether there are specific markers of brain activity associated with the processing of music and musical syntax. The second is to compare the processing of musical and linguistic syntax in order to find out whether they affect each other or whether there is even a significant overlap. The verification of an overlap would support the thesis that syntactic operations (musical as well as linguistic) are modular. "Modular" means that the complex system of processing is decomposed into subsystems with modular functions. Concerning the processing of syntax, this would mean that the domains of music and language each have specific syntactic representations, but that they share neural resources for activating and integrating these representations during syntactic processing.

Requirements
Processing of music and musical syntax comprises several aspects concerning melodic, rhythmic, metric, timbral and harmonic structure. For the processing of chord functions, four steps can be described.

(1) First, a tonal centre has to be detected from the first chords of a sequence. Often the first chord is interpreted as the tonal centre of a sequence, and a reevaluation is necessary if the first chord turns out to have another harmonic function.

(2) Successive chords are related to this tonal centre in terms of their harmonic distance from it.

(3) As described above, music has a hierarchical structure in terms of pitch organization and the organization of tension and release. Pitch organization concerning chords means that in a musical phrase the tonic is the most stable chord and is experienced as the resting point. The dominant and subdominant, in turn, are more stable than the submediant and the supertonic. The progression of chords in time forms a tonal structure based on pitch organization, in which moving away from the tonic is perceived as building tension and moving towards the tonic is experienced as release. Therefore, hierarchical relations may convey organized patterns of meaning.

(4) Concerning harmonic aspects of major-minor tonal music, musical syntax can be characterized by statistical regularities in the succession of chord functions in time, that is, by probabilities of chord transitions. As these regularities are stored in long-term memory, predictions about following chords are made automatically when listening to a musical phrase.
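The idea of "probabilities of chord transitions" in step (4) can be made concrete with a small Python sketch that counts which chord function follows which. The progressions in `TOY_PROGRESSIONS` are invented purely for illustration and are not empirical data; the function name is likewise an assumption of this sketch.

```python
from collections import Counter, defaultdict

# Invented toy corpus of chord-function sequences (NOT empirical data).
TOY_PROGRESSIONS = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "vi", "IV", "V", "I"],
    ["I", "IV", "V", "vi"],
]


def transition_probabilities(progressions):
    """Estimate P(next chord | current chord) from relative frequencies."""
    counts = defaultdict(Counter)
    for progression in progressions:
        # Pair every chord with its immediate successor.
        for current, following in zip(progression, progression[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


probabilities = transition_probabilities(TOY_PROGRESSIONS)
print(probabilities["V"])
```

In this toy corpus the V chord is followed by I in three of four cases, so a listener model built on these counts would expect I after V and would treat any other continuation as less predictable.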

MMN and ERAN

The violation of these automatically made predictions leads to the observation of so-called ERPs (event-related potentials, stereotyped electrophysiological responses to an internal or external stimulus). Two forms of ERPs can be detected in the context of processing music. One is the MMN (mismatch negativity), which was first investigated only with physical deviants such as frequency, sound intensity or timbre deviants (referred to as phMMN) and has since also been shown for changes of abstract auditory features like tone pitches (referred to as afMMN). The other is the so-called ERAN (early right anterior negativity), which can be elicited by syntactic irregularities in music. Both the ERAN and the MMN are ERPs indicating a mismatch between predictions based on regularities and the acoustic information actually experienced. As the ERAN was for a long time taken to be a special variant of the MMN, the question arises why the two are distinguished today. Several differences between the MMN and the ERAN have been found in recent years:

Differences – occurrence
Even though music-syntactically regular events are often also acoustically similar to their context, and music-syntactically irregular events are often also acoustically different from it, an ERAN but not an MMN can be elicited when a chord represents not a physical but a syntactic deviance. To demonstrate this, so-called "Neapolitan sixth chords" are used. These are consonant chords when played in isolation, but they are inserted into a musical phrase in which they are only distantly related to the harmonic context. In a sequence of five chords, a Neapolitan sixth chord at the third or at the fifth position evokes ERANs of different amplitudes in the EEG, with a higher amplitude at the fifth position. Nevertheless, when a chord sequence is created in which the Neapolitan chord at the fifth position is music-syntactically less irregular than a Neapolitan chord at the third position, the amplitude is higher at the third position (see figure 4...). In contrast to the MMN, a clear ERAN is also elicited by syntactically irregular chords that are acoustically more similar to the preceding harmonic context than syntactically regular chords. Therefore, the MMN seems to be based on an on-line establishment of regularities; that is, the regularities are extracted on-line from the acoustic environment. The ERAN, in contrast, rests upon representations of music-syntactic regularities which exist in a long-term memory format and which are learned during early childhood.

Differences – development
This is reflected in the development of the ERAN and MMN. The ERAN cannot be verified in newborn babies, whereas the MMN can be demonstrated even in fetuses. In two-year-old children the ERAN is very small; in five-year-old children a clear ERAN is found, but with a longer latency than in adults. By the age of 11 years, children show an ERAN similar to that of adults. From these observations the thesis can be derived that the MMN is essential for the establishment and maintenance of representations of the acoustic environment and for processes of auditory scene analysis. But only the ERAN is completely based on learning to build up a structural model, which is established with reference to representations of syntactic regularities already existing in a long-term memory format. Both the ERAN and the MMN can be modulated by training.

Differences – neural sources
Differences between the ERAN and the MMN also exist in the neural sources of the main contributions to the ERPs. The sources of the ERAN are located in the pars opercularis of the inferior fronto-lateral cortex (inferior Brodmann's area), with contributions from the ventrolateral premotor cortex and the anterior superior temporal gyrus, whereas the MMN receives its main contributions from within and around the primary auditory cortex, with additional sources in frontal cortical areas. Therefore, the sources of the ERAN lie mainly in the frontal cortex, whereas the sources of the MMN are located mainly in the temporal lobe. Further support for this thesis comes from the fact that under propofol sedation, which mainly affects the frontal cortex, the ERAN is abolished while the MMN is only reduced. Finally, the amplitude of the ERAN is reduced under ignore conditions, whereas the MMN is largely unaffected by attentional modulations.

Processes to elicit the MMN or ERAN
(1) First, sound sources have to be separated, sound features extracted and representations of auditory objects established from the incoming acoustic input. The same processes are required for the MMN and the ERAN.

(2) For the MMN, regularities are filtered on-line out of the input to create a model of the acoustic environment. At this point there is a difference from the ERAN: for the ERAN, representations of regularities already exist in a long-term memory format, and the incoming sound is integrated into a pre-existing model of musical structure.

(3) According to the model of musical structure, predictions concerning forthcoming auditory events are formed. This process is similar for the ERAN and for the MMN.

(4) Finally, the actually incoming sound is compared with the predictions based on the model. This process is also partly the same for the MMN and the ERAN.
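The difference in step (2) can be illustrated with a deliberately simplified Python analogy. It is not a model of neural processing: the "mismatch" score (one minus the predicted probability), the function names and the numbers in `LONG_TERM_MODEL` are all hypothetical and serve only to contrast a model extracted on-line from the recent input with a model that is already stored.

```python
from collections import Counter

# Hypothetical stored expectations for the chord following a V chord.
# The numbers are invented for illustration only.
LONG_TERM_MODEL = {"I": 0.8, "vi": 0.15, "IV": 0.05}


def online_model(recent_events):
    """MMN-like case: regularities are extracted from the recent input itself."""
    counts = Counter(recent_events)
    return {event: n / len(recent_events) for event, n in counts.items()}


def mismatch(model, event):
    """Steps (3) and (4): predict from the model, then compare with the input.

    Returns 1 minus the predicted probability, a toy score chosen for this sketch.
    """
    return 1.0 - model.get(event, 0.0)


# A repeated tone establishes a regularity on-line; a new tone violates it.
context = ["A"] * 9
print(mismatch(online_model(context), "A"))
print(mismatch(online_model(context), "B"))

# An unexpected chord violates stored knowledge without any local repetition.
print(mismatch(LONG_TERM_MODEL, "I"))
print(mismatch(LONG_TERM_MODEL, "bII6"))
```

In both cases the comparison step is the same function; only the origin of the model differs, which mirrors the statement that steps (3) and (4) are shared while step (2) is not.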

Comparison of the processing of musical and linguistic syntax
As the ERAN is similar to an ERP called ELAN, which can be elicited by violations of linguistic syntax, it appears that the ERAN does represent syntactic processing. Following this thought, an interaction between music-syntactic and language-syntactic processing would be very likely. Neuroscience offers different approaches to the question of whether the neuronal processing of linguistic and musical syntax overlaps.

Neuropsychological approach
This method deals with the question of how the structure and function of the brain relate to outcomes in behaviour and other psychological processes. This area of research has provided evidence for a dissociation between musical and linguistic syntactic abilities. Case reports have shown that amusia (a deficiency in fine-grained perception of pitch which leads to musical tone-deafness and can be congenital or acquired later in life, for example from brain damage) is not necessarily linked to aphasia (severe language impairments following brain damage) and vice versa. This means that individuals with normal speech and language abilities showed musical tone-deafness, while individuals with language impairments retained sufficient musical syntactic abilities. The problem with neuropsychological research is that no case report has so far shown that aphasia does not necessarily entail amusia in non-musicians; on the contrary, newer findings suggest that amusia is almost always linked to aphasia.

Neuroimaging
Results from neuroimaging led to the "shared syntactic integration resource hypothesis" (SSIRH), which supports the presumption that there is an overlap between the processing of musical and linguistic syntax and that syntactic operations are modular. Furthermore, research using electroencephalography has shown that difficulties or irregularities in musical as well as in linguistic syntax elicit ERPs which are similar to each other.

How can the discrepancy between neuropsychology and neuroimaging be explained?

Modularity
In fact, the concept of modularity itself can help to explain the different and apparently contradictory findings of neuropsychological research and neuroimaging. The concept of a dual system distinguishes between syntactic representation and syntactic processing, that is, between long-term structural knowledge in a domain (representation) and operations conducted on that knowledge (syntactic processing). Damage to an area representing long-term musical knowledge would lead to amusia without aphasia, whereas damage to an area supporting syntactic processing would impair both musical and linguistic syntactic processing.

Comparison of syntactic processing – three theories
The comparison of the syntactic processing of language and music is based on three theories, which are mentioned here but not explained in detail. The first two, the "dependency locality theory" and the "expectancy theory", refer to syntactic processing in language, whereas the third, the "tonal pitch space theory", relates to syntactic processing in music.

The language theories contribute the concept that resources are consumed in order to conceive the structure of a sentence. If conceiving this structure is difficult because distant words belong together or because an expected sentence structure is violated, more resources, namely those for activating low-activation items, are consumed.

Violating an anticipated structure in music could mean a harmonically unexpected note or chord in a musical sequence. As in language, this is associated with a "processing cost due to the tonal distance" (Patel, 2008) and therefore means that more resources are needed for activating low-activation items.

SSIRH – the leading concept
Overall, these theories lead to the "shared syntactic integration resource hypothesis", as the areas from which low-activation items are activated could be the correlate of the overlap between linguistic and musical syntax. Strong evidence for the existence of this overlap comes from studies in which music-syntactic and linguistic-syntactic irregularities were presented simultaneously. They showed an interaction between the ERAN and the LAN (left anterior negativity; an ERP which is elicited by linguistic-syntactic irregularities). The LAN elicited was reduced when an irregular word was presented simultaneously with an irregular chord, compared to the condition in which an irregular word was presented with a regular chord. In contrast to this finding, the phMMN elicited by frequency deviants did not interact with the LAN.

From these facts it can be reasoned that the ERAN relies on neural resources related to syntactic processing (Koelsch 2008). Furthermore, they provide strong evidence for the thesis that there is an overlap between the processing of musical and linguistic syntax and therefore that syntactic operations (musical as well as linguistic) are modular.