Talk:Ambisonic decoding

Draft for new article
Ok, since this is probably waaay over my head, I'm using this talk page to flesh out a new structure for this article, rather than zapping the old one outright. Comments welcome, Nettings (talk) 17:39, 2 January 2014 (UTC)

Ambisonic Decoding
Before Ambisonic surround sound material can be listened to, it must be decoded. Using knowledge about the number and location of the available speakers, optimized speaker feeds are generated from the B-format signal. This is a special feature of Ambisonics which helps to decouple the mixing intent (in terms of the desired directions of sounds) from the speaker system used for reproduction, and distinguishes it from other approaches such as 5.1 surround sound, which deliver speaker signals to the consumer that mandate a pre-defined speaker layout.

Ambisonic decoder design has undergone dramatic development since the method was originally developed in the 1970s, and the quality of the decoders available today varies accordingly. This page tries to summarize the fundamental prerequisites for correct Ambisonic decoding as understood today.

Fundamentals
The decoding matrix is obtained by inverting the matrix of speaker encoding coefficients with the Moore-Penrose pseudoinverse, very well described in BLaH3
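As a minimal sketch of the pseudoinverse step, the following builds a first-order, horizontal-only decoder for a square layout. Everything here is an illustrative assumption rather than something taken from BLaH3: the channel set (W, X, Y only), the unscaled W channel (classic B-format attenuates W by 1/sqrt(2)), the helper names and the speaker angles.

```python
import numpy as np

def encode_2d(azimuth):
    """First-order horizontal encoding vector [W, X, Y] for a plane wave.

    Assumption: W is left unscaled (no 1/sqrt(2) factor).
    """
    return np.array([1.0, np.cos(azimuth), np.sin(azimuth)])

def pinv_decoder(speaker_azimuths):
    """Return (D, C): the decoding matrix D = pinv(C) and the matrix C itself."""
    # C holds one encoding vector per speaker as a column (channels x speakers).
    C = np.stack([encode_2d(az) for az in speaker_azimuths], axis=1)
    # D maps B-format channels to speaker feeds (speakers x channels).
    return np.linalg.pinv(C), C

speakers = np.radians([45.0, 135.0, 225.0, 315.0])  # hypothetical square layout
D, C = pinv_decoder(speakers)

# Speaker gains for a source panned exactly onto the first speaker.
feeds = D @ encode_2d(np.radians(45.0))
print(np.round(feeds, 3))
```

Re-encoding the speaker feeds (C @ D) gives back the identity for this layout, which is the sense in which the decoder "inverts" the encoding. Note the negative gain on the speaker opposite the source.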

Goals:
 * Uniform energy over all directions (no loudness jumps as a source is panned)
 * congruence of rV and rE (the velocity and energy localisation vectors point in the same direction)
 * rV = 1 below 700 Hz
 * max rE between 700 Hz and 4 kHz, as per VIENNA paper
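To make the rV and rE goals concrete, here is a small sketch that evaluates both vectors for a given set of speaker gains, using the usual definitions: rV is the gain-weighted mean of the speaker direction vectors, rE the same with squared gains. The function name, the square layout and the example gains are assumptions made for illustration.

```python
import math

def localisation_vectors(gains, speaker_azimuths):
    """Return (r_v, r_e) as (x, y) tuples for horizontal speaker gains."""
    ux = [math.cos(a) for a in speaker_azimuths]
    uy = [math.sin(a) for a in speaker_azimuths]
    pressure = sum(gains)                 # normaliser for the velocity vector
    energy = sum(g * g for g in gains)    # normaliser for the energy vector
    r_v = (sum(g * x for g, x in zip(gains, ux)) / pressure,
           sum(g * y for g, y in zip(gains, uy)) / pressure)
    r_e = (sum(g * g * x for g, x in zip(gains, ux)) / energy,
           sum(g * g * y for g, y in zip(gains, uy)) / energy)
    return r_v, r_e

speakers = [math.radians(a) for a in (45, 135, 225, 315)]
# Hypothetical gains: an unweighted first-order decode of a source at 45 degrees.
gains = [0.75, 0.25, -0.25, 0.25]

r_v, r_e = localisation_vectors(gains, speakers)
rv_len, re_len = math.hypot(*r_v), math.hypot(*r_e)
rv_dir = math.degrees(math.atan2(r_v[1], r_v[0]))
re_dir = math.degrees(math.atan2(r_e[1], r_e[0]))
print(rv_len, re_len, rv_dir, re_dir)
```

For these gains both vectors point at 45 degrees (they are congruent), rV has unit length, and rE has length 2/3. The shortfall in rE is what the mid- and high-frequency optimisation tries to reduce.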

Decoding approaches

 * in-phase decoder for large auditoria (no out-of-phase signals from the speakers opposite the source, aka "overlap", cf. Gerzon Tetrahedral experiment)
 * max rV decoder to satisfy LF ITD localisation (Makita, Gerzon Gen. Meta)
 * max rE decoder to satisfy HF ILD localisation (Gerzon Gen. Meta)
 * explain "mode matching" decoder (Zotter et al, Technicolor, Hannemann) (good explanation in Zotter and Frank, "All-Round Ambisonic Panning and Decoding" JAES Vol. 60, No. 10, 2012 October)
 * "energy-preserving" decoder, Zotter/Frank
 * browse literature for other jargon terms and explain.
 * explain shelf filters as per Lee and equivalence to dual-band decoders
 * explain phase matching between LF and HF (again, BLaH3)
 * near-field compensation and distance coding (Daniel 2009)
 * hemispherical decodes, avoidance of "pull-up"
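The difference between the basic (max rV), max rE and in-phase flavours can be sketched as a single weight applied to the first-order components. The sketch below assumes a first-order, horizontal-only decode to a regular square with an unscaled W channel; the weight values (1, cos(pi/4) and 1/2) are the ones that follow for that 2D first-order case and would differ for higher orders or 3D. All names are hypothetical.

```python
import math

def panning_gains(source_az, speaker_az, g1):
    """Speaker gains for a regular 2D layout with order-1 weight g1."""
    n = len(speaker_az)
    return [(1.0 + 2.0 * g1 * math.cos(az - source_az)) / n for az in speaker_az]

def re_magnitude(gains, speaker_az):
    """Length of the energy vector rE for the given gains."""
    energy = sum(g * g for g in gains)
    x = sum(g * g * math.cos(az) for g, az in zip(gains, speaker_az))
    y = sum(g * g * math.sin(az) for g, az in zip(gains, speaker_az))
    return math.hypot(x, y) / energy

speakers = [math.radians(a) for a in (45, 135, 225, 315)]
source = math.radians(20.0)  # arbitrary test direction

weights = {
    "basic": 1.0,                       # unweighted decode, rV = 1
    "max_rE": math.cos(math.pi / 4.0),  # maximises |rE| at first order in 2D
    "in_phase": 0.5,                    # largest weight with no negative gains
}

results = {}
for name, g1 in weights.items():
    gains = panning_gains(source, speakers, g1)
    results[name] = (re_magnitude(gains, speakers), min(gains))
    print(name, results[name])
```

The max rE weight raises the length of rE from 2/3 to about 0.707, while the in-phase weight trades that away for gains that never go negative. A dual-band decoder would apply the basic weight at low frequencies and the max rE weight above the crossover.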

Parametric decoders
The idea behind parametric decoding is to treat the sound's direction of incidence as a parameter that can be estimated through time–frequency analysis. A large body of research into human spatial hearing suggests that our auditory cortex applies similar techniques in its auditory scene analysis, which explains why these methods work.

The major benefits of parametric decoding are a greatly increased angular resolution and the separation of analysis and synthesis into separate processing steps. This separation allows B-format recordings to be rendered using any panning technique, including delay panning, VBAP and HRTF-based synthesis.
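As a rough illustration of the analysis step only, the sketch below estimates the direction of incidence of a single plane wave from horizontal B-format by averaging the products W·X and W·Y (an intensity-style estimate). It is not the algorithm of any particular published decoder: it works on one broadband frame instead of individual time-frequency bins, assumes an unscaled W channel, and all function names are made up for the example.

```python
import math

def encode(signal, azimuth):
    """Encode a mono signal as a horizontal first-order plane wave (W, X, Y)."""
    w = list(signal)
    x = [s * math.cos(azimuth) for s in signal]
    y = [s * math.sin(azimuth) for s in signal]
    return w, x, y

def estimate_azimuth(w, x, y):
    """Estimate the direction of incidence from the averaged W*X and W*Y."""
    ix = sum(a * b for a, b in zip(w, x))
    iy = sum(a * b for a, b in zip(w, y))
    return math.atan2(iy, ix)

# Hypothetical test frame: 64 samples of a sine wave arriving from 60 degrees.
signal = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
w, x, y = encode(signal, math.radians(60.0))

estimate = math.degrees(estimate_azimuth(w, x, y))
print(estimate)
```

Once a direction has been estimated per frame (or per bin), the synthesis stage is free to pan the W signal to that direction with whatever technique suits the playback system.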

Parametric decoding was pioneered by Lake DSP in the late 1990s and independently suggested by Farina and Ugolotti in 1999. Later work in this domain includes the DirAC method and the Harpex method. FIXME: cite the Harpex paper rather than the website

Mathematical challenges

 * closed-form solutions for regular polyhedra
 * numerical solutions for irregular layouts, BLaH4+6
 * hemispherical decoding, Musil (assume non-existent loudspeakers, distribute their energy over existing ones), Zotter et al./Keiler & Batke: T-Designs with "virtual" VBAP
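To show why irregular layouts need more than a plain inversion, the following sketch compares a regular hexagon, where the pseudoinverse coincides with a simple closed form, against a five-speaker layout at 0, ±30 and ±110 degrees. It is the 2D, first-order analogue of the polyhedron case, with an unscaled W channel; the helper names and the energy-spread measure are assumptions made for this illustration.

```python
import numpy as np

def encoding_matrix(azimuths_deg):
    """Stack [W, X, Y] encoding vectors as columns (3 x number of directions)."""
    az = np.radians(np.asarray(azimuths_deg, dtype=float))
    return np.vstack([np.ones_like(az), np.cos(az), np.sin(az)])

def closed_form_regular(azimuths_deg):
    """Closed-form decoder for a regular polygon: rows [1, 2cos, 2sin] / L."""
    C = encoding_matrix(azimuths_deg)
    n_speakers = C.shape[1]
    return (C * np.array([[1.0], [2.0], [2.0]])).T / n_speakers

def energy_spread(D, steps=360):
    """Ratio of largest to smallest total speaker energy over all pan angles."""
    test_dirs = encoding_matrix(np.arange(steps) * 360.0 / steps)
    energy = np.sum((D @ test_dirs) ** 2, axis=0)
    return energy.max() / energy.min()

hexagon = [0, 60, 120, 180, 240, 300]
irregular = [0, 30, 110, 250, 330]  # example irregular layout

D_hex = np.linalg.pinv(encoding_matrix(hexagon))
D_irr = np.linalg.pinv(encoding_matrix(irregular))

print(energy_spread(D_hex), energy_spread(D_irr))
```

On the hexagon the energy is the same for every pan angle, whereas the plain pseudoinverse on the irregular layout is several times louder towards the rear than towards the front. That loudness variation is the kind of problem the numerical optimisation approaches listed above are meant to address.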