Ernst Terhardt

Ernst Terhardt (born 11 December 1934) is a German engineer and psychoacoustician who made significant contributions in diverse areas of audio communication, including pitch perception, music cognition, and Fourier transformation. He was a professor in the area of acoustic communication at the Institute of Electroacoustics, Technical University of Munich, Germany.

Education
Terhardt studied electrical engineering at the University of Stuttgart. His Master's thesis (Diplomarbeit) was entitled "Ein Funktionsmodell des Gehörs" (A functional model of hearing). His doctoral dissertation was entitled "Beitrag zur Ermittlung der informationstragenden Merkmale von Schallen mit Hilfe der Hörempfindungen" (literally, "Contribution to the determination of the information-carrying characteristics of sounds with the help of auditory sensations"). Both projects were supervised by Eberhard Zwicker, with whom he founded the Institute of Electroacoustics, Technical University of Munich, in 1967. Terhardt's Habilitation thesis (1972) was entitled "Ein Funktionsschema der Tonhöhenwahrnehmung von Klängen" (A model of pitch perception in complex sounds).

Pitch perception
According to Terhardt's theory, pitch perception can be divided into two separate stages: auditory spectral analysis and harmonic pitch pattern recognition. In the first stage, the inner ear (cochlea and basilar membrane) performs a running spectral analysis of the incoming signal. The parameters of this analysis (e.g. the effective length and shape of the analysis window) depend directly on physiology and indirectly on the co-evolution of ear and voice as our human and prehuman ancestors interacted with their social and physical environments. The output of this first stage is called a spectral pitch pattern. It is determined by psychoacoustic experiments in which listeners make subjective judgments, matching the perceived pitch of a pure reference tone to that of a successively presented complex tone. The spectral pitches differ in perceptual salience for several reasons: their sound pressure levels differ physically, they lie at different distances above the threshold of hearing, they mask each other (and therefore lie at different distances above the masked threshold), and they may or may not lie in a region to which the ear is particularly sensitive (a dominance region of pitch perception). A cornerstone of Terhardt's approach is the idea that because spectral pitches are subjective, we must not jump to conclusions about the relationship between them and their physiological (physical) foundations in the ear and brain.
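The idea of a running spectral analysis can be illustrated with a short-time Fourier analysis, in which a window slides along the signal and a spectrum is computed for each position. The following is only a minimal sketch and not a model of the cochlea: the Hann window, the fixed frame length, the hop size, and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def hann(n_samples):
    """Hann window of the given length (an arbitrary choice of window shape)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n_samples - 1))
            for i in range(n_samples)]


def frame_spectrum(frame, sample_rate):
    """Return (frequency in Hz, magnitude) pairs for one windowed frame, via a direct DFT."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append((k * sample_rate / n, abs(acc)))
    return spectrum


def running_spectral_peaks(signal, sample_rate, frame_len=256, hop=128):
    """Slide the window along the signal; report each frame's strongest frequency."""
    peaks = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = frame_spectrum(signal[start:start + frame_len], sample_rate)
        peaks.append(max(spec, key=lambda pair: pair[1])[0])
    return peaks


# Usage: a 1000 Hz pure tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(512)]
print(running_spectral_peaks(tone, rate))
```

With a frame of 256 samples at 8000 Hz, the analysis bins are 31.25 Hz apart. A shorter frame follows changes in the signal more closely but spaces the bins further apart, which is the trade-off between time and frequency resolution.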

In the second stage of pitch perception, harmonic patterns among the spectral pitches are spontaneously recognized by the auditory system, in a process analogous to pattern recognition in vision. The output of this stage is a set of virtual pitches that correspond approximately to the fundamentals of approximately harmonic series of pitches. In this process, the auditory system tolerates a certain degree of mistuning, for two main reasons. First, the partials of complex tones in the environment may be physically mistuned relative to a harmonic series (e.g. piano tones). Second, the frequencies of partials may be known only approximately due to the uncertainty principle: the shorter the effective time window, the less accurately the frequency can be known. The auditory system is physically unable to determine frequencies accurately in very short sound presentations, or in tones whose fundamental frequency is changing quickly, for example in speech.
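The pattern-recognition step can be sketched as a search for candidate fundamentals whose integer multiples coincide, within a tolerance, with the spectral pitches. The code below is a simplified illustration and not Terhardt's published algorithm: the function name, the 3% tolerance, the limit of eight subharmonics, and the 1/sqrt(n) weighting of harmonic number n are all assumptions made for this example.

```python
import math


def virtual_pitch_candidates(partials_hz, max_subharmonic=8, tolerance=0.03):
    """Rank candidate fundamentals by how well they explain the given partials.

    Each partial f proposes the subharmonics f/1, f/2, ..., f/max_subharmonic.
    A candidate gains weight for every partial lying within `tolerance`
    (relative deviation) of one of its integer multiples. Lower harmonic
    numbers count more (hypothetical 1/sqrt(n) weighting).
    Returns (frequency in Hz, score) pairs, best first.
    """
    candidates = sorted({round(f / m, 3)
                         for f in partials_hz
                         for m in range(1, max_subharmonic + 1)})
    scored = []
    for cand in candidates:
        score = 0.0
        for f in partials_hz:
            n = max(1, round(f / cand))          # nearest harmonic number
            deviation = abs(f - n * cand) / f    # relative mistuning
            if deviation <= tolerance:
                score += 1.0 / math.sqrt(n)
        scored.append((cand, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Usage: partials 3, 4 and 5 of a 200 Hz tone whose fundamental is absent.
for freq, score in virtual_pitch_candidates([600.0, 800.0, 1000.0])[:3]:
    print(f"{freq:8.2f} Hz  score {score:.3f}")
```

For the partials 600, 800 and 1000 Hz, the top-ranked candidate is 200 Hz, even though no partial is present at that frequency. Because of the tolerance, slightly mistuned partials such as 602, 797 and 1003 Hz still yield a top candidate close to 200 Hz. The lower-ranked candidates correspond to the competing, less salient pitches.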

If only one virtual pitch is perceived in a sound, it is generally the one with the highest salience. The output of Terhardt's algorithm for pitch perception is a series of virtual pitches of differing salience, of which the most salient is the prediction for “the” pitch of the sound. The existence of several competing virtual pitches can explain the ambiguity of the pitch of many sounds. Bells with non-harmonic spectra are an obvious example (it is often possible to hear the main virtual pitch as the strike tone at the start of the sound, and the main spectral pitch as a hum tone which becomes directly audible as the sound dies away). But Terhardt and his colleagues also demonstrated that regular harmonic complex tones in speech and music are slightly ambiguous in pitch, which may be the ultimate origin of octave equivalence in music and the perceived tonal affinity of successive tones at octave or fifth intervals. Terhardt claimed that the root of a chord in Western music typically corresponds to its most salient virtual pitch, and that the virtual pitch phenomenon is the ultimate origin of the root effect. He also investigated the perception of roughness in music and claimed that musical consonance and dissonance have two main psychoacoustic components, roughness and harmony, harmony being related to the perception of virtual pitch.

Acoustic communication
Terhardt's approach to acoustic communication is based on Karl Popper's theory of three worlds, according to which reality is either physical, experiential (perception, sensations, emotions) or abstract (thoughts, knowledge, information, culture). Terhardt maintains that these three aspects of acoustic communication must be carefully separated before we empirically explore the relationships between them. In the physical world, we consider the physics of sound sources such as the voice and musical instruments; auditory environments including reflectors; electroacoustic systems such as microphones and loudspeakers; and the ear and brain, considered as a purely physical system. Sound is a signal that is analysed by the ear; to understand this process, we need the foundations of signal processing. To understand auditory perception, we perform psychoacoustic experiments, which are generally about relationships between and among Popper's three worlds.