Codec listening test

A codec listening test is a scientific study designed to compare two or more lossy audio codecs, usually with respect to perceived fidelity or compression efficiency.

Most tests take the form of a double-blind comparison. Commonly used methods include ABX, ABC/HR, and MUSHRA. Various software packages allow individuals to run this type of test themselves with minimal assistance.

ABX test
In an ABX test, the listener has to identify an unknown sample X as being either A or B, with A (usually the original) and B (usually the encoded version) available for reference. Because the identity of X is hidden, the listener is not biased by their expectations. The outcome must also be statistically significant, so that it is unlikely to be the result of chance. If the listener cannot identify X reliably, that is, with a low p-value over a predetermined number of trials, then the null hypothesis cannot be rejected and a perceptible difference between samples A and B has not been demonstrated. This is usually taken to indicate that the encoded version is transparent to that listener, although failing to reject the null hypothesis does not prove that no difference exists.
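The significance check described above can be sketched as a one-sided binomial test: under the null hypothesis the listener is guessing, so each trial is correct with probability 0.5. The function name `abx_p_value` and the 0.05 threshold mentioned in the comments are illustrative assumptions, not part of any particular ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (one-sided binomial test, p = 0.5)."""
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Count the outcomes with `correct` or more successes among 2**trials
    # equally likely guess sequences.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session: 12 correct identifications in 16 trials.
    p = abx_p_value(12, 16)
    # A commonly used (but here assumed) significance threshold is 0.05.
    print(f"p-value: {p:.4f}")
```

The number of trials should be fixed before the session starts, as the text notes; stopping as soon as the p-value happens to look good would inflate the false-positive rate.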

ABC/HR test
In an ABC/HR test, C is the original, which is always available for reference. A and B are the original and the encoded version in randomized order. The listener must first distinguish the encoded version from the original (the hidden reference that the "HR" in ABC/HR stands for) before assigning a score as a subjective judgment of its quality. Different encoded versions can then be compared against each other using these scores.
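A single ABC/HR trial as described above might be modelled as follows. This is only a sketch: the names `make_trial` and `record_score` are hypothetical, and discarding a trial in which the listener mistakes the hidden reference for the encoded version is one possible handling, assumed here for illustration.

```python
import random


def make_trial(rng: random.Random) -> dict:
    """Hide the reference and the encoded version behind labels A and B
    in random order. C (the known original) is not modelled here."""
    samples = ["reference", "encoded"]
    rng.shuffle(samples)
    return {"A": samples[0], "B": samples[1]}


def record_score(trial: dict, picked_as_encoded: str, score: float):
    """Return the listener's score if they correctly picked the encoded
    sample, or None if they picked the hidden reference instead
    (assumption: such a trial is not used for quality comparison)."""
    if picked_as_encoded not in trial:
        raise ValueError("label must be 'A' or 'B'")
    if trial[picked_as_encoded] != "encoded":
        return None
    return score


if __name__ == "__main__":
    trial = make_trial(random.Random())
    # Hypothetical listener response: "B is the encoded one, quality 3.5".
    print(record_score(trial, "B", 3.5))
```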

MUSHRA
In MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor), the listener is presented with the reference (labeled as such), a certain number of test samples, a hidden version of the reference, and one or more anchors. The purpose of the anchor(s) is to bring the scale closer to an "absolute scale", ensuring that minor artifacts are not rated as very bad quality.
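One way to summarize MUSHRA ratings is to average each condition's scores across listeners and to flag listeners who rated the hidden reference low. The sketch below assumes a 0 to 100 rating scale, a hypothetical `hidden_reference` condition key, and an illustrative threshold of 90; none of these come from the text above.

```python
from statistics import mean


def mean_scores(ratings: dict) -> dict:
    """ratings maps listener -> {condition: score}.
    Returns the mean score per condition across all listeners."""
    by_condition = {}
    for per_listener in ratings.values():
        for condition, score in per_listener.items():
            by_condition.setdefault(condition, []).append(score)
    return {c: mean(scores) for c, scores in by_condition.items()}


def low_reference_listeners(ratings: dict, threshold: float = 90) -> list:
    """Listeners who scored the hidden reference below `threshold`
    (assumed value), which may suggest unreliable ratings."""
    return [
        name
        for name, scores in ratings.items()
        if scores["hidden_reference"] < threshold
    ]


if __name__ == "__main__":
    # Made-up ratings for illustration only.
    ratings = {
        "listener_1": {"hidden_reference": 100, "codec_x": 70, "anchor": 20},
        "listener_2": {"hidden_reference": 80, "codec_x": 60, "anchor": 30},
    }
    print(mean_scores(ratings))
    print(low_reference_listeners(ratings))
```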

Results
Many double-blind music listening tests have been carried out. The following table lists the results of several listening tests that have been published online. To obtain meaningful results, listening tests must compare codecs' performance at similar or identical bitrates, since the audio quality produced by a lossy encoder can generally be improved simply by increasing the bitrate. If listeners cannot consistently distinguish a lossy encoder's output from the uncompressed original audio, then it may be concluded that the codec has achieved transparency at that bitrate.

Popular formats compared in these tests include MP3, AAC (and extensions), Vorbis, Musepack, and WMA. The RealAudio Gecko, ATRAC3, QDesign, and mp3PRO formats appear in some tests, despite much lower adoption. Many encoder and decoder implementations (both proprietary and open source) exist for some formats, such as MP3, which is the oldest and best-known format still in widespread use today.