Arbitrary slice ordering

Arbitrary slice ordering (ASO) in digital video is an algorithm for loss prevention. It restructures the order in which the fundamental regions (macroblocks) of a picture are represented. This type of algorithm avoids the need to wait for a full set of scenes to get all sources. It is typically considered an error/loss robustness feature.

ASO is included as a tool in the Baseline profile of H.264/MPEG-4 AVC, alongside I slices, P slices, context-adaptive variable-length coding (CAVLC), slice groups and redundant slices.

Applications
This profile is intended primarily for lower-cost applications with limited computing resources, and is widely used in videoconferencing, mobile applications and security applications.

Arbitrary slice ordering relaxes the constraint that all macroblocks must be sequenced in decoding order, and thus enhances flexibility for the low-delay performance that is important in teleconferencing and interactive Internet applications.

Problems
If ASO across pictures is supported in AVC, serious issues arise because slices from different pictures can be interleaved. One possible way to solve these issues is to limit ASO to within a picture, i.e. slices from different pictures are not interleaved.

However, even if ASO is limited to within a picture, decoder complexity increases significantly. Because flexible macroblock ordering (FMO) extends the concept of slices by allowing non-consecutive macroblocks to belong to the same slice, this section also addresses the decoder complexity introduced by FMO.

Association of macroblocks to slices

 * Impact of ASO on AVC decoder complexity

An example of how macroblocks can be associated with different slices is shown in Figure 1. When ASO is supported, the four slices of this example can be received by the decoder in any order. Figure 2 shows the following receiving order: slice #4, slice #3, slice #1, and slice #2. The same figure presents the AVC decoder blocks required to support ASO decoding.



Figure 1: An example of macroblock assignment to four slices. Each slice is represented by a different texture.



Figure 2: The AVC decoder blocks needed to support ASO decoding.

For each slice, the slice parser (Figure 2) extracts the slice length and the macroblock address (i.e. the index with respect to the raster scan order) of the first macroblock (MB) of the slice. This information, together with the slice itself, is stored in memory (shown as DRAM). In addition, a list of pointers should be generated (Figure 2), with one pointer per slice, each pointing to the memory location where that slice is stored. The list of pointers, together with the address of the first macroblock of each slice, is used to navigate through the out-of-order slices. The slice length is used to transfer the slice data from the DRAM to the decoder's internal memory.
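The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration rather than a decoder implementation: the names `SliceEntry` and `SliceStore`, the use of a byte offset as the "pointer", and the macroblock addresses and payloads in the usage example are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class SliceEntry:
    first_mb: int  # address of the slice's first macroblock (raster scan order)
    length: int    # slice length in bytes
    offset: int    # "pointer": where the slice starts in the DRAM buffer


class SliceStore:
    """Toy model of the DRAM slice buffer and the list of pointers."""

    def __init__(self):
        self.dram = bytearray()
        self.pointers = []  # one entry per slice, in arrival order

    def store(self, first_mb, payload):
        # Record what the slice parser extracted, then append the slice itself.
        entry = SliceEntry(first_mb, len(payload), len(self.dram))
        self.dram += payload
        self.pointers.append(entry)
        return entry

    def fetch(self, entry):
        # The slice length bounds the transfer to the decoder's internal memory.
        return bytes(self.dram[entry.offset:entry.offset + entry.length])

    def in_decoding_order(self):
        # Navigate the out-of-order slices by first macroblock address.
        return sorted(self.pointers, key=lambda e: e.first_mb)


# Hypothetical arrival order: slice #4, #3, #1, #2 (addresses are made up).
store = SliceStore()
for first_mb, payload in [(72, b"slice4"), (48, b"slice3"),
                          (0, b"slice1"), (24, b"slice2")]:
    store.store(first_mb, payload)

ordered_payloads = [store.fetch(e) for e in store.in_decoding_order()]
```

Here `store.pointers` keeps the arrival order, while `in_decoding_order()` recovers the raster order from the first macroblock addresses alone, so the slice data never has to be moved within the buffer.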

Faced with the necessity to decode out of order slices, a decoder may:


 * 1) wait for all the slices of each picture to arrive before starting to decode and de-block the picture.
 * 2) decode the slices in the order in which they come to the decoder.

The first method increases latency, but allows decoding and de-blocking to be performed in parallel. However, managing a large number of pointers (in the worst case, one pointer for each MB) and increasing the intelligence of the DRAM access unit both increase the decoder complexity.

The second method significantly hurts decoder performance. In addition, because de-blocking is performed in a second pass, the bandwidth required between the DRAM and the processor's memory increases.
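The difference between the two methods can be made concrete with a small sketch that only records the order of operations. It is an illustration under stated assumptions: the slice dictionaries, the `log` list and both function names are hypothetical, and the per-slice "decode then de-block" step in the first function stands in for a decoder that runs the two stages in parallel.

```python
def decode_after_all_slices(arrived, log):
    """Method 1: start only once every slice of the picture has arrived."""
    for s in sorted(arrived, key=lambda s: s["first_mb"]):
        log.append(("decode", s["id"]))
        # Earlier slices in raster order are already decoded, so
        # de-blocking can follow directly (single pass).
        log.append(("deblock", s["id"]))


def decode_on_arrival(arrived, log):
    """Method 2: decode each slice as it arrives, de-block in a second pass."""
    for s in arrived:
        log.append(("decode", s["id"]))
    for s in sorted(arrived, key=lambda s: s["first_mb"]):
        log.append(("deblock", s["id"]))


# Hypothetical arrival order: slice #4, #3, #1, #2 (addresses are made up).
arrived = [{"id": 4, "first_mb": 72}, {"id": 3, "first_mb": 48},
           {"id": 1, "first_mb": 0}, {"id": 2, "first_mb": 24}]

log_method_1, log_method_2 = [], []
decode_after_all_slices(arrived, log_method_1)
decode_on_arrival(arrived, log_method_2)
```

In `log_method_2` every slice is visited twice, once per pass, which is the extra traffic between the DRAM and the processor's memory that the text refers to.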

Decoding slices in the order in which they are received can result in additional memory consumption, or can impose higher throughput requirements that force the decoder and local memory to run at a higher clock speed. Consider an application in which the display operation reads the pictures to be displayed right from the section of memory where the decoder stored the pictures.

Association of macroblocks to slices and slices to slice groups

 * Impact of ASO and FMO on AVC decoder complexity

An example of how slices can be associated with different slice groups is shown in Figure 3. When ASO and FMO are supported, the four slices of this example can be received by the decoder in any order. Figure 4 shows the following order: slice #4, slice #2, slice #1, and slice #3. The same figure presents the AVC decoder blocks required to support ASO and FMO decoding.



Figure 3: An example of macroblock assignment to four slices and to two slice groups (SG in the figure). Each slice is represented by a different texture, and each slice group is represented by a different color.



Figure 4: The AVC decoder blocks needed to support ASO and FMO decoding.

In addition to the slice length and the macroblock address of the first macroblock (MB) of the slice, the slice parser (Figure 4) needs to extract the slice group (SG) of each slice. This information, together with the slice itself, is stored in DRAM. As in the ASO case, the list of pointers (Figure 4) should be generated.

The list of pointers, together with the address of the first MB of the slice, the SG, and the mb_allocation_map (stored in the processor's local memory), will be used to navigate through the slices. The slice length will be used to transfer the slice data from the DRAM to the processor's local memory.
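As a rough sketch of this navigation step, the function below works out which slice each macroblock belongs to, given only the macroblock-to-slice-group map and the (slice, slice group, first MB) triples extracted by the parser. The function name, the tuple layout and the 4x2 example picture are assumptions made for illustration. The sketch also assumes that, within a slice group, a macroblock belongs to the slice that started most recently at or before it in raster scan order.

```python
def map_mbs_to_slices(mb_allocation_map, slices):
    """Return, for every macroblock in raster scan order, the id of its slice.

    mb_allocation_map: slice group of each macroblock, indexed by MB address.
    slices: list of (slice_id, slice_group, first_mb), in any arrival order.
    """
    owners = []
    for mb, group in enumerate(mb_allocation_map):
        # Slices of the same group that start at or before this macroblock.
        candidates = [s for s in slices if s[1] == group and s[2] <= mb]
        if not candidates:
            # No received slice of this group starts at or before this MB.
            owners.append(None)
            continue
        # Assumption: the most recently started slice owns the macroblock.
        owners.append(max(candidates, key=lambda s: s[2])[0])
    return owners


# Hypothetical 4x2 picture with two slice groups in a checkerboard pattern.
mb_allocation_map = [0, 1, 0, 1,
                     1, 0, 1, 0]
# (slice_id, slice_group, first_mb), listed in arrival order #4, #2, #1, #3.
slices = [(4, 1, 4), (2, 0, 5), (1, 0, 0), (3, 1, 1)]

owners = map_mbs_to_slices(mb_allocation_map, slices)
```

With these made-up inputs, consecutive macroblocks in raster order rarely share a slice, which is why the decoder has to keep switching between slices and slice groups.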

Similarly to the ASO case, in the combined ASO and FMO case the decoder may:


 * 1) wait for all the slices of each picture to arrive before starting to decode and de-block the picture.
 * 2) decode the slices in the order in which they come to the decoder.

The first approach is still the preferred one. Because of FMO, decoding macroblocks in raster scan order may require switching between different slices and/or slice groups. To speed up the DRAM access, one buffer for each slice group must be used (Figure 4). This additional intelligence of the DRAM access unit further increases the decoder complexity. Moreover, switching between different slices and/or slice groups requires swapping the entropy decoder (ED) status information. In the worst case, swapping occurs after decoding each macroblock. If the entire ED status information is too large to be stored in the processor's local memory, each ED status needs to be loaded from and stored into DRAM, further increasing the bandwidth required between the DRAM and the processor's memory (Figure 4).
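The worst case mentioned above can be illustrated by counting how often the owning slice or slice group changes while walking the macroblocks in raster scan order. The following is a minimal sketch: the function names, the checkerboard layout, and the 64-byte ED status size are hypothetical values chosen for the example, and the traffic estimate assumes one status is stored and one is loaded on every swap.

```python
def checkerboard_owners(width, height):
    """Owner (0 or 1) of each macroblock for a dispersed two-group layout."""
    return [(x + y) % 2 for y in range(height) for x in range(width)]


def count_ed_swaps(owners):
    """Number of times raster-scan decoding changes slice or slice group."""
    return sum(1 for prev, cur in zip(owners, owners[1:]) if cur != prev)


def ed_swap_traffic(swaps, status_bytes):
    """Bytes moved if every swap stores one ED status and loads another."""
    return swaps * 2 * status_bytes


contiguous = [0] * 5 + [1] * 5          # two slices, one after the other
dispersed = checkerboard_owners(5, 2)   # owner changes after every macroblock

print(count_ed_swaps(contiguous))   # 1
print(count_ed_swaps(dispersed))    # 9
print(ed_swap_traffic(count_ed_swaps(dispersed), status_bytes=64))  # 1152
```

For the same ten macroblocks, the dispersed layout needs nine swaps instead of one, and the extra DRAM traffic grows in proportion to the size of the ED status.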