Chapter 2 Overview of H.264/AVC standard
2.2 INTRA PREDICTION
Intra prediction exploits the high spatial correlation between neighboring samples to predict the samples currently being encoded. For the luma samples, a prediction block may be formed for each 4x4 block (denoted as I4MB) or for an entire MB (denoted as I16MB). With Intra_4x4 prediction, each 4x4 block selects the best of nine prediction modes: one DC mode plus eight directional prediction modes, as shown in Fig 3 (a). Intra_16x16 prediction of the luma component is typically chosen for smooth image areas, and thus only four prediction modes are specified; Fig 3 (b) shows all of them except the DC mode. The chroma samples of an MB are predicted with Intra_8x8, whose prediction pattern is similar to that of the luma Intra_16x16 prediction.
Fig 3 Intra prediction modes for (a) Intra_4x4 and (b) Intra_16x16.
2.2.2 Fast algorithms
Fast intra prediction algorithms can be classified into several types. The first approach is “early termination”, which ends the mode search when the calculated distortion is smaller than a pre-chosen threshold. The choice of a proper measure for deciding termination is critical to the performance. It may be derived from the macroblock smoothness [6][7] or from the most probable mode [8]. Early termination based on macroblock smoothness calculates a smoothness measure of a macroblock to determine the block type. For example, a large block type such as Intra_16x16 is often chosen for flat image areas [6][7]. “Smooth” means that all the pixel values in an MB are similar; that is, their variance is small. The smoothness measure should be cheap to compute, so the Mean Absolute Difference (MAD) [6] or the AC/DC ratio [7] is often used in place of the variance. If the measure is smaller than a pre-selected threshold value, the Intra_16x16 mode is chosen and thus the costly Intra_4x4 search can be skipped.
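The smoothness test can be illustrated with a minimal sketch. It assumes the MAD is taken around the macroblock mean; the function names and the threshold value of 2.0 are hypothetical and are not taken from [6].

```python
def mean_absolute_difference(mb):
    """MAD of the pixel values around their mean; mb is a list of pixel rows."""
    pixels = [p for row in mb for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)


def skip_intra4x4(mb, threshold=2.0):
    """True when the MB is smooth enough to go straight to Intra_16x16.

    The default threshold is an illustrative assumption, not a tuned value.
    """
    return mean_absolute_difference(mb) < threshold


# A flat 16x16 macroblock and a strongly textured synthetic one.
flat_mb = [[128] * 16 for _ in range(16)]
textured_mb = [[(37 * x + 91 * y) % 256 for x in range(16)] for y in range(16)]
```

In practice the threshold would probably depend on the quantization parameter, since a coarser quantizer tolerates more texture before Intra_4x4 pays off.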
Another kind of early termination examines the most probable mode first. For example, in searching for the best Intra_4x4 mode, if the residual of the most probable mode is smaller than a threshold, the other eight Intra_4x4 modes are skipped. Otherwise, all nine modes have to be tested. Another threshold then decides whether to go on to check the Intra_16x16 prediction. It was reported that, in one case, this method together with 2:1 downsampling and rate-distortion optimization (RDO) reduces the total computation time by 68.8% with only a 1.35% bit rate increase compared with the reference software [8]. The major issue in this type of algorithm is how to determine the threshold. The threshold value can, for instance, be adjusted according to the quantization parameter.
To construct a more efficient scheme, we propose a mixed fast intra prediction algorithm. It first examines both the most probable mode and the DC mode to determine whether the early termination criterion is met. The threshold value is the average SATD (sum of absolute transformed differences) of all the previous Intra_4x4 blocks in the frame. Once the 16 Intra_4x4 blocks of an MB are done, their total cost is used as the threshold for deciding on the Intra_16x16 mode. These threshold values appear to adapt to the local characteristics of the video and provide good results. Even when RDO is turned off, we achieve around 30% computational savings for the intra prediction module.
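The Intra_4x4 part of the mixed scheme can be sketched as follows. This is a sketch under stated assumptions, not the exact implementation: the names `satd4x4`, `RunningMeanThreshold` and `select_intra4x4_mode` are hypothetical, the SATD is left unnormalised, the running average is taken over the best cost of each previous block, and a block with no history falls back to the full search. The DC mode index 2 follows the Intra_4x4 mode numbering of the standard.

```python
H4 = ((1, 1, 1, 1),
      (1, 1, -1, -1),
      (1, -1, -1, 1),
      (1, -1, 1, -1))  # 4x4 Hadamard matrix (symmetric)


def satd4x4(residual):
    """Sum of absolute coefficients of H4 * R * H4 for a 4x4 residual R."""
    tmp = [[sum(H4[i][k] * residual[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    out = [[sum(tmp[i][k] * H4[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return sum(abs(c) for row in out for c in row)


class RunningMeanThreshold:
    """Average cost of the Intra_4x4 blocks coded so far in the frame."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def update(self, cost):
        self.total += cost
        self.count += 1

    def value(self):
        # None means "no history yet": the caller must run the full search.
        return self.total / self.count if self.count else None


def select_intra4x4_mode(cost_of, most_probable_mode, threshold, dc_mode=2):
    """Return (best_mode, best_cost, number_of_modes_tested).

    cost_of is a callable mapping a mode index 0..8 to its SATD cost.
    """
    tested = {}
    for mode in (most_probable_mode, dc_mode):
        if mode not in tested:
            tested[mode] = cost_of(mode)
    best = min(tested, key=tested.get)
    if threshold is not None and tested[best] < threshold:
        return best, tested[best], len(tested)  # early termination
    for mode in range(9):  # otherwise test the remaining modes
        if mode not in tested:
            tested[mode] = cost_of(mode)
    best = min(tested, key=tested.get)
    return best, tested[best], len(tested)


# Illustrative costs for modes 0..8 of one block (made-up numbers).
example_costs = [50, 40, 30, 60, 70, 10, 80, 90, 55]
```

After each block is coded, the caller would pass the returned cost to `update` so that the threshold follows the statistics of the current frame.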
The second approach uses edge analysis to quickly identify the edge direction, since intra prediction is basically a directional prediction [9][10]. Often the Sobel operators or first-order derivatives are used as the edge analysis tool to find the most probable edge direction, which is used as one of the final mode candidates. The final mode candidate list includes the one selected by the edge detector together with the other highly probable modes. In the Intra_4x4 case, these are the two modes of the neighboring blocks and the DC mode; in the Intra_16x16 and Intra_8x8 cases, only the DC mode is considered highly probable. Therefore, only four candidate modes (for Intra_4x4) or two candidate modes (for the other types) need to be examined. The reported result is a 60% reduction in intra-only computation time with RDO, at a bit rate increase of around 2 to 3% [9]. The bit rate increase may be due to irregular edges within a block. On the other hand, the extra computation needed for edge analysis can be a burden and significantly reduce the overall saving.
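A simplified sketch of the candidate list construction for Intra_4x4 is given below. It is not the algorithm of [9]: as an assumption made for brevity, the Sobel responses are computed only at the interior pixels of the block and are quantized to four of the eight directions (vertical, horizontal and the two 45-degree diagonals). The function names are hypothetical; the mode indices follow the Intra_4x4 numbering of the standard (0 vertical, 1 horizontal, 2 DC, 3 diagonal down-left, 4 diagonal down-right).

```python
import math

# Gradient angle bins 0, 45, 90, 135 degrees mapped to prediction modes.
# A horizontal gradient means a vertical edge, hence vertical prediction.
BIN_TO_MODE = (0, 3, 1, 4)


def dominant_edge_mode(block):
    """Mode whose direction matches the strongest edge, or None if flat."""
    hist = [0.0] * 4
    n = len(block)
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            if gx == 0 and gy == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Accumulate the edge strength in the nearest direction bin.
            hist[int(round(angle / 45.0)) % 4] += abs(gx) + abs(gy)
    if not any(hist):
        return None
    return BIN_TO_MODE[hist.index(max(hist))]


def candidate_modes(block, left_mode, up_mode, dc_mode=2):
    """Edge mode, the two neighbouring block modes and DC, without repeats."""
    candidates = []
    for mode in (dominant_edge_mode(block), left_mode, up_mode, dc_mode):
        if mode is not None and mode not in candidates:
            candidates.append(mode)
    return candidates


vertical_edge = [[10, 10, 200, 200] for _ in range(4)]
horizontal_edge = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
```

The list never exceeds four entries, which matches the candidate count stated above for Intra_4x4; when the edge mode coincides with a neighbouring mode, fewer modes are tested.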
The third approach is the so-called three-step approach [11]. It first tests the horizontal and vertical directions; it then tests the neighboring 22.5-degree modes close to the better one from the previous step; finally, the best mode so far is checked against the DC mode for the final winner. This approach has the advantage that a fixed number of modes is examined in all cases. However, the computation time reduction is only around 33%, with about a 1% bit rate increase.
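The three steps can be sketched as below for Intra_4x4. The neighbour table is an assumption based on the Intra_4x4 mode numbering of the standard (vertical-right 5 and vertical-left 7 flank the vertical mode; horizontal-down 6 and horizontal-up 8 flank the horizontal mode); [11] may define the neighbourhood differently. The function name and the example costs are hypothetical.

```python
VERTICAL, HORIZONTAL, DC = 0, 1, 2

# Modes lying next to the vertical and horizontal directions (assumed).
NEIGHBOURS = {VERTICAL: (5, 7), HORIZONTAL: (6, 8)}


def three_step_mode(cost_of):
    """Return (best_mode, number_of_modes_tested); cost_of maps mode to cost."""
    # Step 1: horizontal versus vertical.
    tested = {mode: cost_of(mode) for mode in (VERTICAL, HORIZONTAL)}
    best = min(tested, key=tested.get)
    # Step 2: the two modes adjacent to the better direction.
    for mode in NEIGHBOURS[best]:
        tested[mode] = cost_of(mode)
    # Step 3: compare the best directional mode so far against DC.
    tested[DC] = cost_of(DC)
    best = min(tested, key=tested.get)
    return best, len(tested)


# Made-up costs for modes 0..8; mode 5 is the true optimum here.
example_costs = [50, 40, 45, 60, 70, 10, 30, 90, 35]
```

With these example costs the search settles on mode 6 after five tests and never visits mode 5, which illustrates both the fixed complexity and the source of the bit rate increase.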
The last approach makes use of the correlation in the temporal domain [12], since the best prediction mode of the current macroblock is likely to be similar to that of the reference macroblock in the previously coded frame(s). Thus, the primary intra prediction mode is taken from the mode of the most overlapped block found in motion estimation. The computational overhead is nearly zero since all the information is obtained during the inter-prediction operation. It is reported that the coding performance is nearly unchanged while the computational saving is about 50%, assuming an intra-frame period of 10 [12].
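One possible reading of "the most overlapped block" is sketched below: the current block is displaced by its motion vector, and the grid-aligned block of the reference frame that covers the largest part of the displaced block supplies the primary mode. This interpretation, the integer-pel motion vector, the omission of frame boundary handling and the function names are all assumptions and are not taken from [12].

```python
def most_overlapped_block(x, y, mvx, mvy, size=4):
    """Top-left corner of the grid-aligned reference block that overlaps most
    with the block at (x, y) displaced by the motion vector (mvx, mvy)."""
    rx, ry = x + mvx, y + mvy
    bx0, by0 = (rx // size) * size, (ry // size) * size
    best = None
    # The displaced block can touch at most four grid-aligned blocks.
    for by in (by0, by0 + size):
        for bx in (bx0, bx0 + size):
            w = min(rx + size, bx + size) - max(rx, bx)
            h = min(ry + size, by + size) - max(ry, by)
            area = max(w, 0) * max(h, 0)
            if best is None or area > best[0]:
                best = (area, bx, by)
    return best[1], best[2]


def primary_mode(ref_modes, x, y, mvx, mvy, size=4):
    """Look up the mode stored for the most overlapped reference block.

    ref_modes is a dict keyed by (bx, by); None is returned when the
    reference block has no stored intra mode.
    """
    return ref_modes.get(most_overlapped_block(x, y, mvx, mvy, size))
```

Since the motion vector is already available from inter prediction, the lookup adds only a few integer operations per block.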
In summarizing the various fast intra-prediction algorithms, we cite the experimental results from the original papers, but a fair comparison among all methods is difficult because their simulation environments are quite different. One important element affecting computation is whether RDO is enabled in the reference software. This is particularly true for the early termination methods with thresholds.
The algorithms described above can be combined to achieve further speed-up. For example, the first step could be the decision between Intra_4x4 and Intra_16x16. The second step could be early termination for the chosen intra type. Finally, the remaining mode tests could use a fast algorithm to select one from the nine or four candidate modes.
2.3 INTER PREDICTION