Generative theory of tonal music

The generative theory of tonal music (GTTM) is a system of music analysis developed by music theorist Fred Lerdahl and linguist Ray Jackendoff. First presented in their 1983 book of the same title, it constitutes a "formal description of the musical intuitions of a listener who is experienced in a musical idiom" with the aim of illuminating the unique human capacity for musical understanding.

The musical collaboration between Lerdahl and Jackendoff was inspired by Leonard Bernstein's 1973 Charles Eliot Norton Lectures at Harvard University, wherein he called for researchers to uncover a musical grammar that could explain the human musical mind in a scientific manner comparable to Noam Chomsky's revolutionary transformational or generative grammar.

Unlike the major methodologies of music analysis that preceded it, GTTM models the mental procedures by which a listener constructs an unconscious understanding of music, and uses that model to illuminate the structure of individual compositions. The theory has been influential, spurring further work by its authors and other researchers in the fields of music theory, music cognition and cognitive musicology.

Theory
GTTM focuses on four hierarchical systems that shape our musical intuitions. Each of these systems is expressed in a strict hierarchical structure where dominant regions contain smaller subordinate elements and equal elements exist contiguously within a particular and explicit hierarchical level. In GTTM any level can be small-scale or large-scale depending on the size of its elements.

I. Grouping structure
GTTM considers grouping analysis to be the most basic component of musical understanding. It expresses a hierarchical segmentation of a piece into motives, phrases, periods, and still larger sections.

II. Metrical structure
Metrical structure expresses the intuition that the events of a piece are related to a regular alternation of strong and weak beats at a number of hierarchical levels. It is a crucial basis for all the structures and reductions of GTTM.

III. Time-span reduction
Time-span reductions (TSRs) are based on information gleaned from metrical and grouping structures. They establish tree structure-style hierarchical organizations uniting time-spans at all temporal levels of a work. The TSR analysis begins at the smallest levels, where metrical structure marks off the music into beats of equal length (or more precisely into attack points separated by uniform time-spans), and moves through all larger levels where grouping structure divides the music into motives, phrases, periods, theme groups, and still greater divisions. It further specifies a “head” (or most structurally important event) for each time-span at all hierarchical levels of the analysis. A completed TSR analysis is often called a time-span tree.

IV. Prolongational reduction
Prolongational reduction (PR) expresses our "psychological" awareness of tensing and relaxing patterns in a given piece in precise structural terms. In time-span reduction, the hierarchy of less and more important events is established according to rhythmic stability. In prolongational reduction, hierarchy is concerned with relative stability expressed in terms of continuity and progression, the movement toward tension or relaxation, and the degree of closure or non-closure. A PR analysis also produces a tree-structure style hierarchical analysis, but this information is often conveyed in a visually condensed modified "slur" notation.

The need for prolongational reduction mainly arises from two limitations of time-span reductions. The first is that time-span reduction fails to express the sense of continuity produced by harmonic rhythm. The second is that time-span reduction—even though it establishes that particular pitch-events are heard in relation to a particular beat, within a particular group—fails to say anything about how music flows across these segments.

More on TSR vs PR
It is helpful to note some basic differences between a time-span tree produced by TSR and a prolongational tree produced by PR. First, though the basic branching divisions produced by the two trees are often the same or similar at high structural levels, branching variations between the two trees often occur as one travels further down towards the musical surface.

A second and equally important differentiation is that a prolongational tree carries three types of branching: strong prolongation (represented by an open node at the branching point), weak prolongation (a filled node at the branching point) and progression (simple branching, with no node). Time-span trees do not make this distinction. All time-span tree branches are simple branches without nodes (though time-span tree branches are often annotated with other helpful comments).

Rules
Each of the four major hierarchical organizations (grouping structure, metrical structure, time-span reduction and prolongational reduction) is established through rules, which are in three categories:


 * 1) The well-formedness rules, which specify possible structural descriptions.
 * 2) The preference rules, which select from the possible structural descriptions those that correspond to experienced listeners’ hearings of any particular piece.
 * 3) The transformational rules, which provide a means of associating distorted structures with well-formed descriptions.

Grouping well-formedness rules (G~WFRs)
"Any contiguous sequence of pitch-events, drum beats, or the like can constitute a group, and only contiguous sequences can constitute a group." "A piece constitutes a group." "A group may contain smaller groups." "If a group G1 contains part of a group G2, it must contain all of G2." 'If a group G1 contains a smaller group G2, then G1 must be exhaustively partitioned into smaller groups." 

Grouping preference rules (G~PRs)
 * 1) (Alternative form) "Avoid analyses with very small groups – the smaller, the less preferable."
 * 2) (Proximity) Consider a sequence of four notes, n1–n4. The transition n2–n3 may be heard as a group boundary if:
    * a) (slur/rest) the interval of time from the end of n2 to the beginning of n3 is greater than that from the end of n1 to the beginning of n2 and that from the end of n3 to the beginning of n4, or if
    * b) (attack-point) the interval of time between the attack points of n2 and n3 is greater than between those of n1 and n2 and between those of n3 and n4.
 * 3) (Change) Consider a sequence of four notes, n1–n4. The transition n2–n3 may be heard as a group boundary if:
    * a) (Register) the transition n2–n3 involves a greater intervallic distance than both n1–n2 and n3–n4, or if
    * b) (Dynamics) the transition n2–n3 involves a change in dynamics and n1–n2 and n3–n4 do not, or if
    * c) (Articulation) the transition n2–n3 involves a change in articulation and n1–n2 and n3–n4 do not, or if
    * d) (Length) n2 and n3 are of different length and both pairs n1, n2 and n3, n4 do not differ in length.
 * 4) (Intensification) A larger-level group may be placed where the effects picked out by GPRs 2 and 3 are more pronounced.
 * 5) (Symmetry) "Prefer grouping analyses that most closely approach the ideal subdivision of groups into two parts of equal length."
 * 6) (Parallelism) "Where two or more segments of music can be construed as parallel, they preferably form parallel parts of groups."
 * 7) (Time-span and prolongational stability) "Prefer a grouping structure that results in more stable time-span and/or prolongational reductions."
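
The attack-point condition of the proximity rule and the register condition of the change rule share one shape: the transition n2–n3 is a candidate boundary when it exceeds both neighbouring transitions. A minimal sketch follows, assuming onsets are given in arbitrary time units and pitches as MIDI note numbers. The function names are hypothetical, and the sketch only flags candidates; it does not weigh the rules against one another.

```python
def exceeds_both_neighbours(values):
    """Indices k where values[k] is greater than values[k-1] and values[k+1]."""
    return [k for k in range(1, len(values) - 1)
            if values[k] > values[k - 1] and values[k] > values[k + 1]]


def attack_point_boundaries(onsets):
    """Attack-point proximity: a boundary may fall after note k (0-based)."""
    # Inter-onset interval k is the time between the attacks of notes k and k+1.
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    return exceeds_both_neighbours(intervals)


def register_boundaries(pitches):
    """Register change: a boundary may fall after note k (0-based)."""
    # Leap k is the absolute pitch distance between notes k and k+1.
    leaps = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return exceeds_both_neighbours(leaps)


print(attack_point_boundaries([0, 1, 2, 4, 5, 6]))   # [2]
print(register_boundaries([60, 62, 64, 72, 74, 76]))  # [2]
```

In both calls the third note (index 2) is followed by a longer gap or a wider leap than its neighbours, so a group boundary may be heard after it.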

Transformational grouping rules
 * Grouping overlap (p. 60). Given a well-formed underlying grouping structure G as described by GWFRs 1–5, containing two adjacent groups g1 and g2 such that
    * g1 ends with event e1,
    * g2 begins with event e2, and
    * e1 = e2,
   a well-formed surface grouping structure G' may be formed that is identical to G except that
    * it contains one event e' where G had the sequence e1e2,
    * e' = e1 = e2,
    * all groups ending with e1 in G end with e' in G', and
    * all groups beginning with e2 in G begin with e' in G'.
 * Grouping elision (p. 61). Given a well-formed underlying grouping structure G as described by GWFRs 1–5, containing two adjacent groups g1 and g2 such that
    * g1 ends with event e1,
    * g2 begins with event e2, and
    * either (for left elision) e1 is harmonically identical to e2 and less than e2 in dynamics and pitch range, or (for right elision) e2 is harmonically identical to e1 and less than e1 in dynamics and pitch range,
   a well-formed surface grouping structure G' may be formed that is identical to G except that
    * it contains one event e' where G had the sequence e1e2,
    * e' = e2 (for left elision) or e' = e1 (for right elision),
    * all groups ending with e1 in G end with e' in G', and
    * all groups beginning with e2 in G begin with e' in G'.
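
Grouping overlap can be pictured as an operation on index spans. The sketch below is an illustration under simplifying assumptions rather than a full implementation: events are plain comparable values, groups are half-open (start, end) spans over the underlying event list, and `boundary` is the index of e2. The name `apply_overlap` is hypothetical.

```python
def apply_overlap(events, groups, boundary):
    """Merge events[boundary - 1] (e1) and events[boundary] (e2) into one event e'.

    Returns the surface event list and the surface groups, in which groups
    that ended with e1 and groups that began with e2 now share e'.
    """
    if events[boundary - 1] != events[boundary]:
        raise ValueError("grouping overlap requires e1 = e2")
    # The surface keeps a single copy of the shared event.
    surface_events = events[:boundary] + events[boundary + 1:]
    surface_groups = []
    for start, end in groups:
        # Events from e2 onwards move one position to the left.
        new_start = start if start < boundary else start - 1
        new_end = end if end <= boundary else end - 1
        surface_groups.append((new_start, new_end))
    return surface_events, surface_groups


events = ["C", "E", "G", "G", "E", "C"]
groups = [(0, 6), (0, 3), (3, 6)]
print(apply_overlap(events, groups, 3))
# (['C', 'E', 'G', 'E', 'C'], [(0, 5), (0, 3), (2, 5)])
```

In the result the spans (0, 3) and (2, 5) share index 2, the merged event e'. The surface structure is no longer well-formed by GWFR 4, which is why the rule is stated as a transformation of a well-formed underlying structure.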

Metrical well-formedness rules (M~WFRs)
 * 1) "Every attack point must be associated with a beat at the smallest metrical level present at that point in the piece."
 * 2) "Every beat at a given level must also be a beat at all smaller levels present at that point in that piece."
 * 3) "At each metrical level, strong beats are spaced either two or three beats apart."
 * 4) "The tactus and immediately larger metrical levels must consist of beats equally spaced throughout the piece. At subtactus metrical levels, weak beats must be equally spaced between the surrounding strong beats."
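
As a rough illustration, the metrical well-formedness rules can be tested against a metrical grid. The sketch below makes several assumptions that are not part of the theory: each level is a sorted list of beat positions counted in units of the smallest level, the smallest level spans the whole piece, and the equal-spacing clause of rule 4 is applied to the tactus and every larger level supplied. The subtactus clause of rule 4 is not checked, and the function name is hypothetical.

```python
def check_metrical_structure(levels, attacks, tactus=0):
    """Check MWFRs 1-3 and the equal-spacing clause of MWFR 4.

    levels[0] is the smallest metrical level; tactus is the index of the
    tactus level within levels.
    """
    levels = [list(level) for level in levels]
    # MWFR 1: every attack point falls on a beat of the smallest level.
    if not set(attacks) <= set(levels[0]):
        return False
    for lower, upper in zip(levels, levels[1:]):
        # MWFR 2: every beat of a level is also a beat of the next smaller level.
        if not set(upper) <= set(lower):
            return False
        # MWFR 3: strong beats lie two or three beats apart at the level below.
        positions = [lower.index(beat) for beat in upper]
        if any(b - a not in (2, 3) for a, b in zip(positions, positions[1:])):
            return False
    # MWFR 4 (first clause): the tactus and larger levels are evenly spaced.
    for level in levels[tactus:]:
        if len({b - a for a, b in zip(level, level[1:])}) > 1:
            return False
    return True


# Two bars of 4/4 on an eighth-note grid: eighths, quarters, half bars, bars.
grid = [range(0, 16), range(0, 16, 2), range(0, 16, 4), [0, 8]]
print(check_metrical_structure(grid, attacks=[0, 2, 3, 4, 8], tactus=1))  # True
```

A grid that places bar-level beats directly above the eighth-note level, with nothing in between, fails rule 3, because its strong beats would lie eight beats apart.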

Metrical preference rules (M~PRs)
 * 1) (Parallelism) "Where two or more groups or parts of groups can be construed as parallel, they preferably receive parallel metrical structure."
 * 2) (Strong beat early) "Weakly prefer a metrical structure in which the strongest beat in a group appears relatively early in the group."
 * 3) (Event) "Prefer a metrical structure in which beats of level Li that coincide with the inception of pitch-events are strong beats of Li."
 * 4) (Stress) "Prefer a metrical structure in which beats of level Li that are stressed are strong beats of Li."
 * 5) (Length) Prefer a metrical structure in which a relatively strong beat occurs at the inception of either
    * a) a relatively long pitch-event;
    * b) a relatively long duration of a dynamic;
    * c) a relatively long slur;
    * d) a relatively long pattern of articulation;
    * e) a relatively long duration of a pitch in the relevant levels of the time-span reduction;
    * f) a relatively long duration of a harmony in the relevant levels of the time-span reduction (harmonic rhythm).
 * 6) (Bass) "Prefer a metrically stable bass."
 * 7) (Cadence) "Strongly prefer a metrical structure in which cadences are metrically stable; that is, strongly avoid violations of local preference rules within cadences."
 * 8) (Suspension) "Strongly prefer a metrical structure in which a suspension is on a stronger beat than its resolution."
 * 9) (Time-span interaction) "Prefer a metrical analysis that minimizes conflict in the time-span reduction."
 * 10) (Binary regularity) "Prefer metrical structures in which at each level every other beat is strong."

Transformational metrical rule
 * Metrical deletion (p. 101). Given a well-formed metrical structure M in which
    * 1) B1, B2 and B3 are adjacent beats of M at level Li, and B2 is also a beat at level Li+1,
    * 2) T1 is the time-span from B1 to B2 and T2 is the time-span from B2 to B3, and
    * 3) M is associated with an underlying grouping structure G in such a way that both T1 and T2 are related to a surface time-span T' by the grouping transformation performed on G of (a) left elision or (b) overlap,
   then a well-formed metrical structure M' can be formed from M and associated with the surface grouping structure by
    * a) deleting B1 and all beats at all levels between B1 and B2 and associating B2 with the onset of T', or
    * b) deleting B2 and all beats at all levels between B2 and B3 and associating B1 with the onset of T'.

III. Time-span reduction rules
Time-span reduction rules begin with two segmentation rules and proceed to the standard WFRs, PRs and TRs.

Time-span segmentation rules

 * 1) "Every group in a piece is a time-span in the time-span segmentation of the piece."
 * 2) "In underlying grouping structure: a. each beat B of the smallest metrical level determines a time-span TB extending from B up to but not including the next beat of the smallest level; b. each beat B of metrical level Li determines a regular time-span of all beats of level Li-1 from B up to but not including (i) the next beat B’ of level Li or (ii) a group boundary, whichever comes sooner; and c. if a group boundary G intervenes between B and the preceding beat of the same level, B determines an augmented time-span T’B, which is the interval from G to the end of the regular time-span TB."

Time-span reduction well-formedness rules (TSR~WFRs)

 * 1) "For every time-span T there is an event e (or a sequence of events e1 – e2) that is the head of T."
 * 2) "If T does not contain any other time-span (that is, if T is the smallest level of time-spans), then e is whatever event occurs in T."
 * 3) If T contains other time-spans, let T1,...,Tn be the (regular or augmented) time-spans immediately contained in T and let e1,...,en be their respective heads. Then the head is defined depending on: a. ordinary reduction; b. fusion; c. transformation; d. cadential retention (p. 159).
 * 4) "If a two-element cadence is directly subordinate to the head e of a time-span T, the final is directly subordinate to e and the penult is directly subordinate to the final."

Time-span reduction preference rules (TSR~PRs)

 * 1) (Metrical position) "Of the possible choices for head of time-span T, prefer a choice that is in a relatively strong metrical position."
 * 2) (Local harmony) "Of the possible choices for head of time-span T, prefer a choice that is: a. relatively intrinsically consonant, b. relatively closely related to the local tonic."
 * 3) (Registral extremes) "Of the possible choices for head of time-span T, weakly prefer a choice that has: a. a higher melodic pitch; b. a lower bass pitch."
 * 4) (Parallelism) "If two or more time-spans can be construed as motivically and/or rhythmically parallel, preferably assign them parallel heads."
 * 5) (Metrical stability) "In choosing the head of a time-span T, prefer a choice that results in more stable choice of metrical structure."
 * 6) (Prolongational stability) "In choosing the head of a time-span T, prefer a choice that results in more stable choice of prolongational structure."
 * 7) (Cadential retention) (p. 170).
 * 8) (Structural beginning) "If for a time-span T there is a larger group G containing T for which the head of T can function as the structural beginning, then prefer as head of T an event relatively close to the beginning of T (and hence to the beginning of G as well)."
 * 9) "In choosing the head of a piece, prefer the structural ending to the structural beginning."

Prolongational reduction well-formedness rules (PR~WFRs)

 * 1) "There is a single event in the underlying grouping structure of every piece that functions as prolongational head."
 * 2) "An event ei can be a direct elaboration of another pitch ej in any of the following ways: a. ei is a strong prolongation of ej if the roots, bass notes, and melodic notes of the two events are identical; b. ei is a weak prolongation of ej if the roots of the two events are identical but the bass and/or melodic notes differ; c. ei is a progression to or from ej if the harmonic roots of the two events are different."
 * 3) "Every event in the underlying grouping structure is either the prolongational head or a recursive elaboration of the prolongational head."
 * 4) (No crossing branches) "If an event ei is a direct elaboration of an event ej, every event between ei and ej must be a direct elaboration of either ei, ej, or some event between them."
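
Rules 2 and 4 above can be illustrated in a few lines. The sketch assumes, for illustration only, that an event is reduced to three labels (harmonic root, bass note, melodic note) and that a prolongational tree is stored as a list in which `parent[i]` is the index of the event that event i directly elaborates (`None` for the prolongational head). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    root: str    # harmonic root, e.g. "C"
    bass: str    # bass note
    melody: str  # melodic (top) note


def elaboration_type(ei, ej):
    """Classify ei as a direct elaboration of ej, following rule 2."""
    if ei.root != ej.root:
        return "progression"
    if ei.bass == ej.bass and ei.melody == ej.melody:
        return "strong prolongation"
    return "weak prolongation"


def has_no_crossing_branches(parent):
    """Rule 4: events between i and parent[i] attach within that span."""
    for i, p in enumerate(parent):
        if p is None:
            continue
        lo, hi = sorted((i, p))
        for k in range(lo + 1, hi):
            if parent[k] is None or not lo <= parent[k] <= hi:
                return False
    return True


tonic = Event("C", "C", "E")
print(elaboration_type(Event("C", "C", "E"), tonic))  # strong prolongation
print(elaboration_type(Event("C", "E", "G"), tonic))  # weak prolongation
print(elaboration_type(Event("G", "G", "D"), tonic))  # progression
print(has_no_crossing_branches([None, 0, 3, 0]))      # True
print(has_no_crossing_branches([None, 3, 0, 0]))      # False
```

The three return values of `elaboration_type` correspond to the three branching types that a prolongational tree distinguishes.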

Prolongational reduction preference rules (PR~PRs)

 * 1) (Time-span importance) "In choosing the prolongationally most important event ek of a prolongational region (ei – ej), strongly prefer a choice in which ek is relatively time-span important."
 * 2) (Time-span segmentation) "Let ek be the prolongationally most important event in a prolongational region (ei – ej). If there is a time-span that contains ei and ek but not ej, prefer a prolongational reduction in which ek is an elaboration of ei; similarly with the roles of ei and ej reversed."
 * 3) (Prolongational connection) "In choosing the prolongationally most important event ek in a prolongational region (ei – ej), prefer an ek that attaches so as to form a maximally stable prolongational connection with one of the endpoints of the region."
 * 4) (Prolongational importance) "Let ek be the prolongationally most important event in a prolongational region (ei – ej). Prefer a prolongational reduction in which ek is an elaboration of the prolongationally more important of the endpoints."
 * 5) (Parallelism) "Prefer a prolongational reduction in which parallel passages receive parallel analyses."
 * 6) (Normative prolongational structure) "A cadenced group preferably contains four (five) elements in its prolongational structure: a. a prolongational beginning; b. a prolongational ending consisting of one element of the cadence; (c. a right-branching prolongation as the most important direct elaboration of the prolongational beginning); d. a right-branching progression as the (next) most important direct elaboration of the prolongational beginning; e. a left-branching ‘subdominant’ progression as the most important elaboration of the first element of the cadence."

Prolongational reduction transformational rules

 * 1) Stability conditions for prolongational connection (p. 224): a. Branching condition; b. Pitch-collection condition; c. Melodic condition; d. Harmonic condition.
 * 2) Interaction principle: "to make a sufficiently stable prolongational connection ek must be chosen from the events in the two most important levels of time-span reduction represented in (ei – ej)."
