Moving image formats

This article discusses moving image capture, transmission and presentation from today's technical and creative points of view, concentrating on aspects of frame rates.

Essential parameters
The essential parameters of any moving image sequence as a visual presentation are: presence or absence of colour, aspect ratio, resolution and image change rate.

Image change rate
There are several standard image-change rates (or frame rates) used today: 24 Hz, 25 Hz, 30 Hz, 50 Hz, and 60 Hz. Technical details related to the backward-compatible addition of colour to the NTSC signal caused other variants to appear: 24000/1001 Hz, 30000/1001 Hz, and 60000/1001 Hz.
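As a small illustration of the arithmetic, the NTSC-derived variants can be obtained by scaling each nominal rate by 1000/1001. The sketch below uses exact fractions; the names `NTSC_FACTOR` and `ntsc_variant` are chosen here for illustration only and do not come from any standard.

```python
from fractions import Fraction

# Scaling factor relating a nominal rate to its NTSC-compatible variant.
NTSC_FACTOR = Fraction(1000, 1001)


def ntsc_variant(nominal_hz: int) -> Fraction:
    """Return the exact NTSC-compatible rate for a nominal integer rate."""
    return nominal_hz * NTSC_FACTOR


if __name__ == "__main__":
    for nominal in (24, 30, 60):
        exact = ntsc_variant(nominal)
        # Print the exact fraction and a rounded decimal approximation.
        print(f"{nominal} Hz -> {exact} Hz (about {float(exact):.3f} Hz)")
```

Keeping the rates as fractions avoids the rounding that creeps in when 59.94 or 29.97 is used as if it were the exact value.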

The image change rate fundamentally affects how "fluid" the captured motion will look on the screen. On this basis, moving image material is sometimes divided into two groups: film-based material, where the image of the scene is captured by the camera 24 times a second (24 Hz), and video-based material, where the image is captured roughly 50 or 60 times a second.

Material at roughly 50 or 60 Hz captures motion very well and looks very fluid on the screen. In principle, 24 Hz material conveys motion satisfactorily; but because it is usually displayed at twice the capture rate or more in cinema and on CRT TV (to avoid flicker), it is not considered capable of transmitting "fluid" motion. Nevertheless, it is still used to film movies because of the unique artistic impression that arises precisely from the slow image-change rate.

25 Hz material, for all practical purposes, looks and feels the same as 24 Hz material. 30 Hz material sits between 24 Hz and 50 Hz material in terms of the "fluidity" of the motion it captures; but, in TV systems, it is handled similarly to 24 Hz material (i.e. displayed at twice the capture rate or more).

Capture
The capture process fixes the "natural" frame rate of the image sequence. A moving image sequence can be captured at a rate different from the presentation rate; however, this is usually done only for artistic effect, or for studying very fast or very slow processes. In order to faithfully reproduce the familiar movements of persons, animals, or natural processes, and to faithfully reproduce the accompanying sound, the capture rate must be equal to, or at least very close to, the presentation rate.

All modern moving image capture systems use either a mechanical or an electronic shutter. The shutter allows the image for a single frame to be integrated over a shorter period of time than the image change period. Another important function of the shutter in raster-based systems is to make sure that the part of the frame scanned first (e.g. the topmost part) contains an image of the scene integrated over exactly the same period of time as the part of the frame scanned last.

Early TV cameras, such as those using a video camera tube, did not have a shutter. Not using a shutter in raster systems may alter the shape of moving objects on the screen. On the other hand, the video from such a camera looks shockingly "live" when displayed on a CRT display in its native format.

Transmission
Analog broadcasting systems (PAL/SECAM and NTSC) were historically limited in the set of moving image formats they could transmit and present. PAL/SECAM can transmit 25 Hz and 50 Hz material, and NTSC can only transmit 30 Hz and 60 Hz material (later replaced by 30/1.001 and 60/1.001 Hz). Both systems were also limited to an aspect ratio of 4:3 and a fixed resolution (limited by the available bandwidth). While wider aspect ratios were relatively straightforward to adapt to a 4:3 frame (for instance by letterboxing), frame rate conversion is not straightforward. In many cases it degrades the "fluidity" of motion or the quality of individual frames, especially when either the source or the target of the conversion is interlaced, or when inter-frame mixing is involved.

50 Hz television systems
Material for local TV markets is usually captured at 25 Hz or 50 Hz. Many broadcasters have film archives of 24 frame/s (film speed) content related to news gathering or television production.

Live broadcasts (news, sports, important events) are usually captured at 50 Hz. Using 25 Hz (de-interlacing essentially) for live broadcasts makes them look like they are taken from an archive, so the practice is usually avoided unless there is a motion processor in the transmission chain.

24 Hz material from film is usually sped up by 4% when it is of feature film origin. The sound is also raised slightly in pitch as a result of the 4% speedup, but pitch correction circuits are typically used.
 * Older technology allows an alternative option where every 12th film frame is held for three video fields instead of two, mostly fixing the problem.
 * More modern film playback technology allows for every 25th frame to be interpolated, with less objectionable results and no need for pitch modification.
 * Each of these film-oriented content transmission techniques has its own drawbacks. However, modern motion compensation processors are considered to produce the least objectionable output.
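The figures behind the speedup and the held-frame alternative can be checked with a few lines of arithmetic. This is only a sketch: the function names are illustrative, and the pitch shift is computed under the assumption that the sound is simply played faster with no correction applied.

```python
import math


def speedup_percent(src_hz: float, dst_hz: float) -> float:
    """Percentage by which playback is accelerated when src is run at dst."""
    return (dst_hz / src_hz - 1) * 100


def pitch_shift_semitones(src_hz: float, dst_hz: float) -> float:
    """Uncorrected pitch rise, in equal-tempered semitones, from the speedup."""
    return 12 * math.log2(dst_hz / src_hz)


def fields_with_held_frame(film_hz: int = 24, hold_every: int = 12) -> int:
    """Fields produced per second when every `hold_every`-th film frame
    is held for three fields and all other frames for two."""
    return sum(3 if (i + 1) % hold_every == 0 else 2 for i in range(film_hz))


if __name__ == "__main__":
    print(f"speedup: {speedup_percent(24, 25):.2f} %")
    print(f"pitch rise: {pitch_shift_semitones(24, 25):.2f} semitones")
    print(f"fields per second with held frames: {fields_with_held_frame()}")
```

The commonly quoted "4%" is therefore a rounding of 25/24, and holding every 12th frame for an extra field yields exactly the 50 fields per second that a 50 Hz system expects, without changing the running time.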

Roughly 30 or 60 Hz material imported from 60 Hz systems is usually adapted for presentation at 50 Hz by adding duplicate frames or dropping excess frames, sometimes also blending consecutive frames. Nowadays, digital motion analysis, although complex and expensive, can produce a superior-looking (though not perfect) conversion.
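The simplest form of such a conversion, dropping or duplicating whole frames, can be sketched as an index mapping from output frames to source frames. The function name `resample_indices` is hypothetical, and the sketch ignores interlacing, frame blending and motion analysis entirely.

```python
def resample_indices(src_hz: int, dst_hz: int, n_out: int) -> list[int]:
    """For each of the first n_out output frames, return the index of the
    source frame that is shown (latest source frame not after the output
    frame's timestamp)."""
    return [(i * src_hz) // dst_hz for i in range(n_out)]


if __name__ == "__main__":
    # 60 Hz -> 50 Hz: one source frame in every six is skipped.
    print(resample_indices(60, 50, 10))
    # 30 Hz -> 50 Hz: some source frames are shown twice.
    print(resample_indices(30, 50, 10))
```

The gaps and repeats in the resulting index lists are what the viewer perceives as uneven motion, which is why blending or motion-compensated interpolation is preferred when it is available.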

60 Hz television systems
Because of higher television production budgets in the US, and a preference for the look of film, many prerecorded TV shows were, in fact, captured onto film at 24 Hz.

Source material filmed at 24 Hz is converted to roughly 60 Hz using a technique called 3:2 pulldown, which involves inserting a variable number of duplicate frames, with an additional slowdown by a factor of 1.001 if needed. Occasionally, inter-frame mixing is used to smooth the judder.
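The cadence can be sketched as follows, assuming the usual description of 3:2 pulldown in terms of interlaced fields: film frames are alternately held for two and three fields, so four film frames become ten fields, or five interlaced video frames. The function names are illustrative, and the 1.001 slowdown is not modelled.

```python
def pulldown_32(frames):
    """Expand film frames into (frame, parity) fields, holding frames
    alternately for 2 and 3 fields."""
    fields = []
    for i, frame in enumerate(frames):
        repeat = 2 if i % 2 == 0 else 3
        for _ in range(repeat):
            # Field parity simply alternates along the output sequence.
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields


def pair_fields(fields):
    """Group consecutive fields into interlaced video frames."""
    return [
        (fields[i][0], fields[i + 1][0])
        for i in range(0, len(fields) - 1, 2)
    ]


if __name__ == "__main__":
    # Four film frames A, B, C, D become five video frames.
    print(pair_fields(pulldown_32("ABCD")))
```

In this sketch two of the five video frames combine fields from different film frames, which is one reason the reverse process and any later deinterlacing need to know the cadence.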

Live programs are captured at roughly 60 Hz. In the last 15 years, 30 Hz has also become a feasible capture rate when a more "film-like" look is desired but ordinary video cameras are used. Capture on video at the film rate of 24 Hz is an even more recent development, and mostly accompanies HDTV production. Unlike 30 Hz capture, 24 Hz cannot be simulated in post-production; the camera must be natively capable of capturing at 24 Hz during recording. Because ~30 Hz material is more "fluid" than 24 Hz material, the choice between the ~30 and ~60 Hz rates is not as obvious as that between 25 Hz and 50 Hz. When printing 60 Hz video to film, it has always been necessary to convert it to 24 Hz using reverse 3:2 pulldown. The look of the finished product can resemble that of film; however, it is not as smooth (particularly if the result is returned to video), and badly done deinterlacing causes the image to shake noticeably in the vertical direction and lose detail.

References to "60 Hz" and "30 Hz" in this context are shorthand, and always refer to the 60 × 1000/1001 (about 59.94) Hz and 30 × 1000/1001 (about 29.97) Hz rates. Only black and white video and certain HDTV prototypes ever ran at true 60.000 Hz. The US HDTV standard supports both true 60 Hz and 59.94 Hz; the latter is almost always used for better compatibility with NTSC.

25 or 50 Hz material, imported from 50 Hz systems, can be adapted to 60 Hz similarly, by dropping or adding frames and intermixing consecutive frames. The best quality for 50 Hz material is provided by digital motion analysis.

Modern digital systems
Digital video is free of many of the limitations of analog transmission formats and presentation mechanisms (e.g. CRT display) because it decouples the behavior of the capture process from the presentation process. As a result, digital video provides the means to capture, convey and present moving images in their original format, as intended by directors (see article about purists), regardless of variations in video standards.

Frame grabbers that employ MPEG or other compression formats are able to encode moving image sequences in their original aspect ratios, resolution and frame capture rates (24/1.001, 24, 25, 30/1.001, 30, 50, 60/1.001, 60 Hz). MPEG—and other compressed video formats that employ motion analysis—help to mitigate the incompatibilities among the various video formats used around the world.

At the receiving end, a digital display is free to independently present the image sequence at a multiple of its capture rate, thus reducing visible flicker. Most modern displays are "multisync," meaning that they can refresh the image display at a rate most suitable for the image sequence being presented. For example, a multisync display may support a range of vertical refresh rates from 50 to 72 Hz, or from 96 to 120 Hz, so that it can display all standard capture rates by means of an integer rate conversion.
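The claim about the example refresh ranges can be checked by listing, for each standard capture rate, the integer multiples that fall inside a given range. This is a sketch only; `CAPTURE_RATES` and `integer_refresh_options` are names introduced here, and real displays negotiate their modes in other ways.

```python
from fractions import Fraction

# The standard capture rates listed above, kept exact.
CAPTURE_RATES = [
    Fraction(24000, 1001), Fraction(24), Fraction(25),
    Fraction(30000, 1001), Fraction(30), Fraction(50),
    Fraction(60000, 1001), Fraction(60),
]


def integer_refresh_options(capture_hz, lo_hz, hi_hz):
    """Return (multiplier, refresh rate) pairs for every integer multiple
    of capture_hz that lies within [lo_hz, hi_hz]."""
    options = []
    k = 1
    while k * capture_hz <= hi_hz:
        if k * capture_hz >= lo_hz:
            options.append((k, k * capture_hz))
        k += 1
    return options


if __name__ == "__main__":
    for lo, hi in ((50, 72), (96, 120)):
        print(f"range {lo}-{hi} Hz")
        for rate in CAPTURE_RATES:
            found = integer_refresh_options(rate, lo, hi)
            shown = ", ".join(f"{k}x = {float(r):.2f}" for k, r in found)
            print(f"  {float(rate):6.3f} Hz: {shown}")
```

For both example ranges every listed capture rate has at least one integer multiple available, which is what allows such a display to avoid non-integer rate conversion.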

Presentation
There are two kinds of displays on the market today: those which "flash" a picture for a short part of the refresh period (CRT, cinema projector), and those which display an essentially static image between the moments of refreshing it (LCD, DLP).

The "flashing" displays must be driven at 48 Hz or more, although today a rate significantly below 85 Hz is not considered ergonomic.

For these displays, 24–30 Hz material is usually displayed at 2x, 3x, or 4x the capture rate. 50 and ~60 Hz material is usually displayed at its native rate, where it delivers very accurate motion without any smearing. It can also be displayed at twice the capture rate, although moving objects will then look smeared or trailed unless the intermediate frames are calculated using motion analysis rather than simply duplicated.

The "continuous" display can be driven at any integer multiple of the capture rate; the viewer cannot see the difference. In general, however, "continuous" displays show noticeable smear on quickly moving objects in 50 and ~60 Hz video material (even if their response time is instant). There are two emerging techniques to combat smearing of video-based material on LCD displays: the panel can effectively be converted into a "flashing" display by appropriately modulating its backlight, and/or it can be driven at double the capture rate while intermediate frames are calculated using motion analysis (see LCD television).

When the presentation rate is not an integer multiple of the capture rate, the "fluidity" of the motion on the screen suffers to a varying degree (terribly for video-based, unpleasantly for film-based material). This is usually the case with computer-based DVD players and PAL PC TVs, where the user does not switch the refresh rate, either out of ignorance or because of technical constraints. Some of these constraints are, in fact, artificial, imposed by manufacturers who count on the user's ignorance. For instance, some laptop LCD panels cannot (easily) be switched to anything but a 60 Hz refresh rate, and some LCD displays with DVI input refuse to accept a digital input signal if its vertical refresh rate does not fall between 58 and 62 Hz.

Most software DVD players do not assist with switching display modes, and even if the mode is switched manually, they rarely synchronize frame updates with the display's vertical retrace periods. (There is only soft synchronization using hardware double buffering, which is not enough to match hardware players in the stability of playback.)

50 vs. 60 Hz
60 Hz material captures motion a bit more smoothly than 50 Hz material. The drawback is that it takes approximately 1/5 more bandwidth to transmit, if all other parameters of the image (resolution, aspect ratio) are equal. The figure is only approximate because interframe compression techniques, such as MPEG, are a bit more efficient at higher frame rates, since consecutive frames become a bit more similar.

There are, however, technical and political obstacles to adopting a single worldwide video format. The most important technical problem is that the lighting of the scene is quite often provided by lamps which flicker at a rate related to the local mains frequency. The mercury lighting used in stadia, for instance, flickers at twice the mains frequency. Capturing video under such conditions must be done at a matching rate, or the colours will flicker badly on the screen. Even an AC incandescent light may be a problem for a camera if it is underpowered or near the end of its useful life.
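The mismatch can be estimated with a simple aliasing model. The sketch below assumes that lamp brightness pulses at exactly twice the mains frequency and that each frame samples it instantaneously; real cameras integrate light over the shutter time, so this indicates only the beat frequency, not its visibility. The function name is illustrative.

```python
def flicker_beat_hz(mains_hz: float, capture_hz: float) -> float:
    """Apparent flicker frequency in the captured video, taken as the
    distance from the lamp's pulse rate (2 x mains) to the nearest
    integer multiple of the capture rate."""
    lamp_hz = 2 * mains_hz
    nearest_multiple = round(lamp_hz / capture_hz) * capture_hz
    return abs(lamp_hz - nearest_multiple)


if __name__ == "__main__":
    for mains, capture in ((50, 50), (50, 60), (60, 50), (60, 60)):
        beat = flicker_beat_hz(mains, capture)
        print(f"{mains} Hz mains, {capture} Hz capture: beat {beat} Hz")
```

Under this model a matching capture rate gives no beat at all, while crossing 50 Hz and 60 Hz regions gives a 20 Hz beat; the 60000/1001 Hz rate under 60 Hz mains gives a very slow drift of roughly 0.12 Hz.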

The need to select a single universal video format (for the sake of global material interchange) should, in any case, become irrelevant in the digital age. The director of a video production would then be free to select the most appropriate format for the job, and a video camera would become a global instrument (currently the market is very fragmented).