Wave field synthesis



Wave field synthesis (WFS) is a spatial audio rendering technique that creates virtual acoustic environments. It produces artificial wavefronts, synthesized from elementary waves by a large number of individually driven loudspeakers. Such wavefronts seem to originate from a virtual starting point, the virtual sound source. Unlike traditional phantom sound sources, the localization of virtual sound sources established by WFS does not depend on the listener's position. Like a genuine sound source, the virtual source remains at a fixed starting point.

Physical fundamentals
WFS is based on the Huygens–Fresnel principle, which states that any wavefront can be regarded as a superposition of spherical elementary waves. Therefore, any wavefront can be synthesized from such elementary waves. In practice, a computer controls a large array of individual loudspeakers and actuates each one at exactly the time and level at which the desired virtual wavefront would pass through its position. In this way, a genuine wavefront of a sound source can be restored from a mono source signal.
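The timing and level rule described above can be illustrated with a minimal sketch. It assumes a virtual point source behind a straight loudspeaker row, a speed of sound of 343 m/s, and a simple 1/r amplitude law; the function name, the layout and these simplifications are illustrative assumptions, and practical WFS driving functions include further filtering and amplitude corrections.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def driving_parameters(source, speakers, c=SPEED_OF_SOUND):
    """Return a (delay_seconds, gain) pair for each loudspeaker position.

    The delay is the travel time of the virtual wavefront from the virtual
    source to the loudspeaker; the gain follows a plain 1/r law (assumption).
    """
    params = []
    for x, y in speakers:
        r = math.hypot(x - source[0], y - source[1])
        r = max(r, 1e-6)  # avoid division by zero if the source sits on a speaker
        params.append((r / c, 1.0 / r))
    return params


# Hypothetical layout: 8 loudspeakers spaced 0.15 m apart along the x axis,
# virtual source 2 m behind the centre of the row.
speakers = [(0.15 * i, 0.0) for i in range(8)]
source = (0.525, -2.0)

for (x, _), (delay, gain) in zip(speakers, driving_parameters(source, speakers)):
    print(f"x={x:.2f} m  delay={delay * 1000:.3f} ms  gain={gain:.3f}")
```

Loudspeakers closer to the virtual source fire earlier and louder, so the superposed elementary waves approximate the wavefront the virtual source would have produced.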

The basic procedure was developed in 1988 by Professor A.J. Berkhout at the Delft University of Technology. Its mathematical basis is the Kirchhoff–Helmholtz integral, which states that the sound pressure within a source-free volume is completely determined if the sound pressure and velocity are known at all points on its surface.


 * $$\boldsymbol{P}(w,z)=\iint_{dA} \left(G(w,z \vert z') \frac{\partial}{\partial n} P(w,z')- P(w,z') \frac{\partial}{\partial n} G(w,z \vert z') \right)dz'$$

Therefore, any sound field can be reconstructed if the sound pressure and acoustic velocity are restored at all points on the surface of its volume. This approach is the underlying principle of holophony.

For reproduction, the entire surface of the volume would have to be covered with closely spaced loudspeakers, each individually driven with its own signal. Moreover, the listening area would have to be anechoic in order to avoid sound reflections that would violate the source-free volume assumption. In practice, this is hardly feasible. Because our acoustic perception is most exact in the horizontal plane, practical approaches generally reduce the array to a horizontal loudspeaker line, circle or rectangle around the listener. The origin of the synthesized wavefront is therefore restricted to points on the horizontal plane of the loudspeakers. Real 3D audio is not possible with such loudspeaker rows. For sources behind the loudspeakers, the array produces convex wavefronts. Sources in front of the loudspeakers can be rendered by concave wavefronts that focus at the virtual source inside the playback area and diverge again as a convex wave. Hence the reproduction inside the volume is incomplete: it breaks down if the listener is situated between the loudspeakers and the virtual source.
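The focusing of a concave wavefront at a virtual source in front of the array can be sketched with delays alone. The example below assumes a simple time-reversal rule: the loudspeaker farthest from the focus fires first, so that all elementary waves arrive at the focus simultaneously. The function name and the layout are hypothetical, and amplitude weighting is omitted.

```python
import math


def focusing_delays(focus, speakers, c=343.0):
    """Delays (seconds) that make all elementary waves meet at `focus`.

    The farthest loudspeaker gets zero delay; nearer ones wait longer, so
    delay + travel time is the same for every loudspeaker.
    """
    distances = [math.hypot(x - focus[0], y - focus[1]) for x, y in speakers]
    r_max = max(distances)
    return [(r_max - r) / c for r in distances]


# Hypothetical layout: 8 loudspeakers on the x axis, focus 1 m in front.
speakers = [(0.15 * i, 0.0) for i in range(8)]
focus = (0.525, 1.0)
delays = focusing_delays(focus, speakers)

for (x, _), d in zip(speakers, delays):
    print(f"x={x:.2f} m  delay={d * 1000:.3f} ms")
```

Beyond the focus the wave diverges as if emitted there, which is why a listener between the loudspeakers and the focus hears a converging rather than a diverging wave.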

Procedural advantages
If the restriction to the horizontal plane is overcome, it becomes possible to establish a virtual copy of a genuine sound field that is indistinguishable from the real sound field. Changes of the listener position in the rendition area produce the same impression as a corresponding change of location in the recording room. Two-dimensional arrays can establish plane wavefronts, which are no louder directly at the loudspeakers than at a distance of several meters. Horizontal arrays can only produce cylindrical waves, which lose 3 dB in level with each doubling of distance. Even with that restriction, listeners of wave field synthesis are no longer confined to a sweet spot within the room.
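The level figures above follow from geometric spreading. As a small worked sketch, assume intensity falls as 1/r^n with n = 0 for a plane wave, n = 1 for a cylindrical wave and n = 2 for a spherical wave (the spherical case is added here only for comparison); the function name is illustrative.

```python
import math


def level_drop_db(r1, r2, geometry):
    """Level drop in dB when moving from distance r1 to distance r2."""
    exponent = {"plane": 0.0, "cylindrical": 1.0, "spherical": 2.0}[geometry]
    # Intensity scales as 1/r**exponent, so the drop is 10*log10 of the ratio.
    return 10.0 * exponent * math.log10(r2 / r1)


for geometry in ("plane", "cylindrical", "spherical"):
    print(geometry, round(level_drop_db(1.0, 2.0, geometry), 2))
```

Doubling the distance gives no drop for the plane wave, about 3 dB for the cylindrical wave and about 6 dB for the spherical wave.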

The Moving Picture Experts Group standardized the object-oriented transmission standard MPEG-4, which allows separate transmission of content (the dry recorded audio signal) and form (the impulse response or the acoustic model). Each virtual acoustic source needs its own (mono) audio channel. The spatial sound field in the recording room consists of the direct wave of the acoustic source and a spatially distributed pattern of mirror acoustic sources caused by reflections from the room surfaces. Reducing that spatial mirror source distribution to a few transmission channels causes a significant loss of spatial information. This spatial distribution can be synthesized much more accurately on the rendition side.

Compared to conventional channel-oriented rendition procedures, WFS provides a clear advantage: virtual acoustic sources driven by the signal content of the associated channels can be positioned far beyond the conventional physical rendition area. This reduces the influence of the listener position, because the relative changes in angles and levels are clearly smaller than with conventional loudspeakers located within the rendition area. This extends the sweet spot considerably; it can now cover nearly the entire rendition area. WFS is thus not only compatible with conventional channel-oriented methods, but can potentially improve their reproduction.

Sensitivity to room acoustics
Since WFS attempts to simulate the acoustic characteristics of the recording space, the acoustics of the rendition area must be suppressed. One possible solution is to use acoustic damping or otherwise arrange the walls in an absorbing, non-reflective configuration. A second possibility is playback within the near field. For this to work effectively, the loudspeakers must couple very closely to the hearing zone or the diaphragm surface must be very large.

In some cases, the most perceptible difference compared to the original sound field is the reduction of the sound field to two dimensions, along the horizontal plane of the loudspeaker lines. This is particularly noticeable for the reproduction of ambience. The suppression of acoustics in the rendition area does not complement playback of natural acoustic ambient sources.

Aliasing
Spatial aliasing causes undesirable distortions in the form of position-dependent, narrow-band breakdowns in the frequency response within the rendition range. Their frequency depends on the angle of the virtual acoustic source and on the angle of the listener to the loudspeaker arrangement:
 * $$f_{\text{alias}}=\frac{c}{\Delta x \left| \sin\Theta^{\text{sec}} - \sin\Theta^{\text{v}} \right|}$$

For aliasing-free rendition over the entire audio range, an emitter spacing below 2 cm would be necessary. Fortunately, the human ear is not particularly sensitive to spatial aliasing, and an emitter spacing of 10–15 cm is generally sufficient.
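The aliasing formula can be evaluated directly for a given emitter spacing. The sketch below assumes a speed of sound of 343 m/s and takes both angles in degrees; the function and parameter names are illustrative, with `theta_sec_deg` and `theta_v_deg` standing for the two angles in the formula.

```python
import math


def aliasing_frequency(spacing_m, theta_sec_deg, theta_v_deg, c=343.0):
    """Spatial aliasing frequency in Hz for emitter spacing `spacing_m`."""
    denom = abs(
        math.sin(math.radians(theta_sec_deg)) - math.sin(math.radians(theta_v_deg))
    )
    if denom == 0.0:
        return math.inf  # equal angles: the formula gives no finite limit
    return c / (spacing_m * denom)


# Compare a very fine spacing with typical practical spacings.
for spacing in (0.02, 0.10, 0.15):
    f = aliasing_frequency(spacing, 90.0, 0.0)
    print(f"spacing={spacing * 100:.0f} cm  f_alias={f:.0f} Hz")
```

The aliasing frequency is inversely proportional to the spacing, so widening the spacing from 2 cm to 10 cm lowers it by a factor of five for the same pair of angles.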

Truncation effect
Another cause of disturbance of the spherical wavefront is the truncation effect. Because the resulting wavefront is a composite of elementary waves, a sudden change of pressure can occur where the loudspeaker row ends and no further loudspeakers deliver elementary waves. This causes a 'shadow-wave' effect. For virtual acoustic sources placed in front of the loudspeaker arrangement, this pressure change runs ahead of the actual wavefront and thereby becomes clearly audible.

In signal processing terms, this is spectral leakage in the spatial domain, caused by applying a rectangular function as a window function to what would otherwise be an infinite array of loudspeakers. The shadow wave can be reduced by lowering the volume of the outer loudspeakers; this corresponds to using a different window function that tapers off instead of being truncated.
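Such a tapering window can be sketched as a set of per-loudspeaker gains. The example assumes a raised-cosine fade applied to a fraction of the loudspeakers at each end of the row, similar in spirit to a Tukey window; the function name and the 20 % taper fraction are illustrative choices, not values from any particular system.

```python
import math


def tapered_gains(num_speakers, taper_fraction=0.2):
    """Gains that are flat in the middle and fade out toward both ends."""
    taper_len = int(num_speakers * taper_fraction)
    gains = [1.0] * num_speakers
    for i in range(taper_len):
        # Raised-cosine ramp: small at the edge, approaching 1 further inward.
        w = 0.5 * (1.0 - math.cos(math.pi * (i + 1) / (taper_len + 1)))
        gains[i] = w
        gains[num_speakers - 1 - i] = w
    return gains


gains = tapered_gains(20)
print([round(g, 3) for g in gains])
```

Multiplying each loudspeaker signal by its gain replaces the abrupt end of the array with a gradual fade, at the price of a slightly smaller effective aperture.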

High cost
A further problem that results from these requirements is high cost. A large number of individual transducers must be placed very close together. Reducing the number of transducers by increasing their spacing introduces spatial aliasing artifacts. Reducing the number of transducers at a given spacing reduces the size of the emitter field and limits the representation range; outside its borders, no virtual acoustic sources can be produced.

Research and market maturity


Early development of WFS began in 1988 at Delft University. Further work was carried out from January 2001 to June 2003 in the context of the European Union's CARROUSO project, which included ten institutes. The WFS sound system IOSONO was developed by the Fraunhofer Institute for Digital Media Technology (IDMT) at the Technical University of Ilmenau in 2004.

The first live WFS transmission took place in July 2008, recreating an organ recital at Cologne Cathedral in lecture hall 104 of the Technical University of Berlin. The room contains the world's largest speaker system with 2700 loudspeakers on 832 independent channels.

Research trends in wave field synthesis include the consideration of psychoacoustics to reduce the necessary number of loudspeakers, and the implementation of complicated sound radiation properties so that a virtual grand piano sounds as grand as in real life.

A practical breakthrough of WFS technology only came with the X1 modules from the Berlin-based technology company Holoplot. The startup eschewed the usual restriction to a horizontal plane and installed 96 individually controlled speaker drivers in a modular system. Optimized according to WFS principles, the beams are able to deliver sound very evenly to large, arbitrarily shaped audience areas, even simultaneously with beams of different content. Because reflective surfaces are not hit unintentionally, there is hardly any reverberation even in highly reflective environments. The company's largest project to date is the Sphere in the Las Vegas Valley. The venue's sound system is made of 1,586 permanently installed X1 Matrix Arrays comprising 167,000 speaker drivers, and it combines elementary waves into common wavefronts.