Chapter 3 Content-Aware Fast Motion Estimation Algorithm
3.4 Early Termination Algorithm (ETA)
In this section, we present our proposed Early Termination Algorithm (ETA) in detail. In [29], Siou-Shen Lin et al. introduce the variance of motion vectors. They show that the probability is about 79% on average when the variance of the current block and its neighbor blocks is smaller than 3. They argue that, when the variance of the motion vectors in the neighbor blocks is small, the current block and the neighbor blocks very likely belong to the same object.
¹ In [4], Viet Anh Nguyen and Yap-Pen Tan proposed a fast approach to calculate block sums by exploiting the adjacency of the blocks.
We exploit and modify the variance of motion vectors proposed in [29] to classify the motion activity of the current block and its neighbor blocks into simple motion and complex motion. The variance of motion vectors is defined in equation (3.3).
MVvar = ( … ) / 4 (3.3)
If any of the neighbor blocks is unavailable, MVvar is set to a large value (999999), so the motion is classified as complex. For accuracy, we compare MVvar with 5 instead of 3 to classify the motion activity, as shown in equation (3.4).
If (MVvar ≦ 5)
Mactivity = simple_motion (3.4)
Else
Mactivity = complex_motion
If the motion activity is simple motion, we consider the current block and its neighbor blocks to be in the same object. Otherwise, the current block and its neighbor blocks are considered not to be in the same object. The SAD values of blocks within the same object should be similar, whereas the SAD values of blocks in different objects should differ considerably. Based on this idea, the lower bound used as the termination condition is determined by equation (3.5).
If (Mactivity == simple_motion)
SAD_threshold = SAD_prediction (3.5)
Else
SAD_threshold = SAD_prediction - SAD_standard_deviation
SAD_prediction and SAD_standard_deviation denote the predicted SAD of the current block and the standard deviation of the SADs of all blocks in the previous frame, respectively. They are defined in equations (3.6) and (3.8):
SAD_prediction = ( … ) / 4 (3.6)
SAD_t is the SAD value of the t-th block in a frame, and Number_MB is the total number of macroblocks (MBs) in a frame. If the current block has no neighbor block, SAD_prediction is set to a very small value (-999999), so the termination condition can never be met. Note that SAD_prediction and SAD_standard_deviation are calculated for 16x16 macroblocks. The H.264/AVC standard uses seven block sizes in motion estimation.
We determine SAD_prediction and SAD_standard_deviation for the other block sizes according to the area occupied by the block. The calculations follow the rules below.
Adjustment of SAD_prediction and SAD_standard_deviation for H.264/AVC standard
If (block size == 16x8 or 8x16)
SAD_prediction = SAD_prediction / 2
SAD_standard_deviation = SAD_standard_deviation / 2
Else if (block size == 8x8)
SAD_prediction = SAD_prediction / 4
SAD_standard_deviation = SAD_standard_deviation / 4
Else if (block size == 8x4 or 4x8)
SAD_prediction = SAD_prediction / 8
SAD_standard_deviation = SAD_standard_deviation / 8
Else if (block size == 4x4)
SAD_prediction = SAD_prediction / 16
SAD_standard_deviation = SAD_standard_deviation / 16
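The classification in equation (3.4), the threshold in equation (3.5), and the block-size adjustment can be combined in a few lines. The following is only a minimal sketch under stated assumptions: the function and constant names are illustrative and do not come from JM 9.4, MVvar is taken as an already computed input, and the 16x16 statistics are assumed to be available from the previous frame.

```python
# Sketch of the ETA threshold selection: equations (3.4) and (3.5) plus the
# block-size adjustment. All names are illustrative, not taken from JM 9.4.

MV_VAR_LIMIT = 5  # comparison value used in equation (3.4)

# Divisor applied to the 16x16 statistics, keyed by block size (width, height).
AREA_DIVISOR = {
    (16, 16): 1,
    (16, 8): 2, (8, 16): 2,
    (8, 8): 4,
    (8, 4): 8, (4, 8): 8,
    (4, 4): 16,
}


def classify_motion(mv_var):
    """Equation (3.4): simple motion if MVvar <= 5, otherwise complex motion."""
    return "simple_motion" if mv_var <= MV_VAR_LIMIT else "complex_motion"


def sad_threshold(mv_var, sad_prediction, sad_std, block_size=(16, 16)):
    """Equation (3.5), after scaling both 16x16 statistics to block_size."""
    divisor = AREA_DIVISOR[block_size]
    prediction = sad_prediction / divisor
    std = sad_std / divisor
    if classify_motion(mv_var) == "simple_motion":
        return prediction
    return prediction - std
```

For instance, with hypothetical 16x16 statistics SAD_prediction = 1600 and SAD_standard_deviation = 400, a complex-motion 8x8 block gets the threshold 1600 / 4 - 400 / 4 = 300.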
Finally, the termination condition is tested whenever a new best-matched block is found. If the SAD value of this block is equal to or smaller than SAD_threshold, the motion estimation is terminated.
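The termination test can be sketched as a loop over the candidates in search order. This is a minimal illustration, assuming the candidates arrive as (motion vector, SAD) pairs, which is a hypothetical input format chosen for brevity; it is not the JM 9.4 search loop.

```python
def search_with_early_termination(candidate_sads, threshold):
    """Scan candidates in search order; stop once a new best SAD <= threshold.

    candidate_sads: iterable of (motion_vector, sad) pairs in search order
    (hypothetical input format).
    Returns (best_mv, best_sad, points_examined).
    """
    best_mv, best_sad = None, float("inf")
    examined = 0
    for mv, sad in candidate_sads:
        examined += 1
        if sad < best_sad:
            best_mv, best_sad = mv, sad
            # The termination test runs only when the best match is updated.
            if best_sad <= threshold:
                break
    return best_mv, best_sad, examined
```

With a threshold of -999999, the value used when no neighbor block exists, the test never succeeds and every candidate is examined.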
Chapter 4
Experimental Results and Discussions
In this chapter, we present the experimental results of the proposed approaches, including the simple dynamic search range algorithm, the successive elimination algorithm with integral frame, and the early termination algorithm. Finally, we present the experimental results of the integrated algorithm, called the Content-Aware Fast Motion Estimation Algorithm (CAFME).
We modify the H.264/AVC reference software JM 9.4 and implement the proposed algorithms on it. In the experiments, we compare the proposed algorithms with Full Search (FS). We use the number of search points per block to measure the performance of the proposed algorithms.
We also measure the coding efficiency by comparing the bitrates of sequences encoded with the same quantization parameter and with rate control disabled.
In addition, we use the SAD value as a criterion to judge whether the determined search range is large enough. Finally, we compare the total encoding time to measure the improvement in a practical setting.
4.1 Experimental Environment
In this section, we present the experimental environment. The descriptions and snapshots of the test video sequences are listed in Table 4-1 and Table 4-2, respectively. Unless otherwise stated, the following parameters are applied to all experiments. Note that the maximum search range is set to 24.
Platform: H.264/AVC reference software JM 9.4 [32]
Machine: Athlon XP 1700+ with 512 MB memory
Profile: baseline
Level: 3.0
Block match algorithm (BMA): Full Search
Group of picture (GOP): 15
Quantization parameter (QP): 36
Frame rate (FPS): 30
Max search range: 24
Frame structure: IPPP
Number of reference frames: 1
Hadamard transform: enabled
All block sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4): enabled
Rate-distortion optimization (RDO): enabled
Fast ME (UMHexagonS) [33]: disabled
Fast mode selection [34]: disabled
Rate control (RC): disabled
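As a point of reference for the search-point counts reported later, and assuming the search window spans R integer-pel positions in each direction around the search center, a full search examines the following number of candidates per block with the maximum search range R = 24. The result agrees with the Fast Full Pel Search entries in Table 4-11.

```latex
N_{\mathrm{FS}} = (2R + 1)^2 = (2 \cdot 24 + 1)^2 = 49^2 = 2401
```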
Table 4-1 Descriptions of test video sequences
ID Name Resolution # of Frames Motion activity
A Foreman QCIF 150 Medium
B Mobile QCIF 150 Slow
C Coastguard QCIF 150 Medium
D Foreman CIF 150 Medium
E Tempete CIF 150 Slow, Zooming
F Flower CIF 90 Slow
G Stefan SIF 150 High
H Football CIF 90 Very High
I Table tennis SIF 90 Medium, Scene change, Zooming
Table 4-2 Snapshots of test video sequences
Foreman QCIF Mobile QCIF Coastguard QCIF
Foreman CIF Tempete CIF Flower CIF
Stefan SIF Football CIF Table tennis SIF
4.2 Opponent: Fast Full Pel Search
In our experiments, we compare our proposed algorithms with the Fast Full Pel Search². The Fast Full Pel Search reuses the SAD values of the smallest (4x4) blocks. Before motion estimation starts for a new macroblock, it computes the SAD values of all 4x4 blocks at all search points within the search window. It then merges these SAD values to obtain the SAD values of the larger blocks. In this way, the SAD computation for a macroblock with all block sizes enabled is about equal to the SAD computation for a single 16x16 block.
Note that the Fast Full Pel Search and a conventional Full Search achieve the same coding performance, but the Fast Full Pel Search is faster in H.264/AVC. In the following experiments, we denote the Fast Full Pel Search as FS.
² The Fast Full Pel Search is implemented in the H.264/AVC reference software JM 9.4.
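The SAD reuse described in this section can be illustrated for a single search point. The sketch below is not the JM 9.4 code; the function names and the list-of-rows block layout are assumptions made for the example. It computes the sixteen 4x4 SADs of a 16x16 macroblock once and then obtains the SAD of any larger partition by summation.

```python
def sad_4x4_grid(cur_mb, ref_mb):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock.

    cur_mb and ref_mb are 16x16 lists of pixel rows; ref_mb is the candidate
    block at one search point.
    """
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur_mb[y][x] - ref_mb[y][x])
    return grid


def merged_sad(grid, x0, y0, width, height):
    """SAD of the width x height partition at pixel offset (x0, y0) in the
    macroblock, obtained by summing the covered 4x4 SADs.

    All four arguments are assumed to be multiples of 4.
    """
    return sum(
        grid[gy][gx]
        for gy in range(y0 // 4, (y0 + height) // 4)
        for gx in range(x0 // 4, (x0 + width) // 4)
    )
```

Once the grid exists, the SAD of, say, the lower 16x8 partition is `merged_sad(grid, 0, 8, 16, 8)`, with no further pixel accesses.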
4.3 Simple Dynamic Search Range
In this section, we evaluate our proposed Simple Dynamic Search Range (SDSR) on nine sequences that cover various motion activities. As Table 4-3 shows, the proposed SDSR greatly outperforms the Fast Full Pel Search (FS). For low and medium motion, SDSR reduces the number of search points by about 77% to 98%. For high motion, SDSR reasonably reduces the search window much less. For example, the reduction is 41% for the Football sequence, which represents high motion activity. Table 4-4 shows that the bitrate increases only slightly, except for the Football sequence. In Table 4-5, the total encoding time is reduced by about 40% to 50%, except for the Stefan and Football sequences, whose motion activity is higher than that of the others. Figure 4-1 shows two successive frames of the Football sequence for illustration.
To check whether the search ranges determined by SDSR are large enough, we plot Figure 4-2 and Figure 4-3. Figure 4-2 shows the average SAD values, frame by frame, of the Foreman QCIF sequence with SDSR and FS. The SAD values of SDSR and FS are very similar, which shows that SDSR finds the true MVs in most motion estimations. Figure 4-3 shows the average SAD values, frame by frame, of the Football CIF sequence with SDSR and FS. The differences between the SAD values of SDSR and FS are larger in some frames because the motion activity is much higher.
On average, the number of search points is reduced by about 80%, the bitrate increases by about 0.11%, and the total encoding time is reduced by about 43%. Hence, we claim that the proposed SDSR maintains almost the same coding efficiency.
Table 4-3 Search Points of FS and SDSR (Number of Search Points)
Sequence Name       Fast Full Pel Search   SDSR   Improvement
Table 4-4 Bitrates of FS and SDSR (Kbps)
Sequence Name       Fast Full Pel Search   SDSR   Improvement
Table 4-5 Total Encoding Time of Fast Full Pel Search and SDSR (Seconds)
Sequence Name       Fast Full Pel Search   SDSR   Improvement
Foreman QCIF        156                    74     -53%
Mobile QCIF         151                    75     -50%
Coastguard QCIF     151                    70     -54%
Foreman CIF         602                    319    -47%
Tempete CIF         583                    324    -44%
Flower CIF          374                    221    -41%
Stefan SIF          508                    340    -33%
Football CIF        363                    280    -23%
Table tennis SIF    298                    169    -43%
Average                                           -43%
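The Improvement column of Table 4-5 follows from the two time columns as a relative reduction against FS. The short sketch below recomputes it from the tabulated times; the variable and function names are illustrative only.

```python
# Encoding times in seconds as (FS, SDSR), copied from Table 4-5.
TIMES = {
    "Foreman QCIF": (156, 74),
    "Mobile QCIF": (151, 75),
    "Coastguard QCIF": (151, 70),
    "Foreman CIF": (602, 319),
    "Tempete CIF": (583, 324),
    "Flower CIF": (374, 221),
    "Stefan SIF": (508, 340),
    "Football CIF": (363, 280),
    "Table tennis SIF": (298, 169),
}


def reduction_percent(baseline, proposed):
    """Relative reduction of proposed against baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline


reductions = {name: reduction_percent(fs, sdsr) for name, (fs, sdsr) in TIMES.items()}
average = sum(reductions.values()) / len(reductions)
```

Rounded to whole percent, the per-sequence values and their mean match the table entries, including the 43% average.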
Figure 4-1 24th and 25th frames of the Football CIF sequence
Figure 4-2 SAD and SR of SDSR frame by frame in Foreman QCIF
Figure 4-3 SAD and SR of SDSR frame by frame in Football CIF
4.4 Successive Elimination Algorithm with Integral Frame
In Table 4-6 and Table 4-7, SEAIF reduces the number of search points by about 76%, while the total encoding time increases by about 12% on average. The reason is the overlapped blocks used in the H.264/AVC standard. When SEAIF is applied to overlapped blocks, the condition for computing SAD is tested more than once for the same area. Hence, the probability of computing SAD rises, and the computational cost of SAD cannot be reduced much. Together with the overhead of SEAIF itself, this makes the encoding time slightly longer. In Table 4-8 and Table 4-9, only the 16x16 block size is enabled for motion estimation, so there are no overlapped blocks. Therefore, SEAIF performs better and the encoding time is shorter.
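For readers unfamiliar with the underlying test, the sketch below shows the basic successive elimination check of [16] with block sums taken from an integral frame as in [4]. It is an illustration under stated assumptions, not the SEAIF variant of this thesis, whose details (overlapped blocks and reuse) are described in Chapter 3. All function names are hypothetical.

```python
def integral_frame(frame):
    """Return the integral frame with an extra zero row and column.

    integral[y][x] holds the sum of all pixels frame[j][i] with j < y, i < x.
    """
    height, width = len(frame), len(frame[0])
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += frame[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    return integral


def block_sum(integral, x, y, width, height):
    """Sum of the width x height block whose top-left pixel is (x, y),
    using four lookups in the integral frame."""
    return (integral[y + height][x + width] - integral[y][x + width]
            - integral[y + height][x] + integral[y][x])


def can_skip(cur_sum, integral_ref, x, y, width, height, best_sad):
    """Successive elimination test.

    Since SAD >= |cur_sum - candidate_sum|, a candidate whose bound already
    reaches best_sad cannot yield a smaller SAD, so its SAD need not be computed.
    """
    cand_sum = block_sum(integral_ref, x, y, width, height)
    return abs(cur_sum - cand_sum) >= best_sad
```

The test itself costs only four lookups and a comparison per candidate, but it must be repeated for every block size that covers the same area, which is consistent with the overhead observed when all block sizes are enabled.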
Table 4-6 Search Points of FS and SEAIF (all block sizes enabled) (Number of Search Points)
Sequence Name       Fast Full Pel Search   SEAIF   Improvement
Table 4-7 Total Encoding Time of Fast Full Pel Search and SEAIF (all block sizes enabled) (Seconds)
Sequence Name       Fast Full Pel Search   SEAIF   Improvement
Foreman QCIF        156                    157     +0.64%
Mobile QCIF         151                    172     +14%
Tempete CIF         583                    690     +18%
Stefan SIF          508                    579     +14%
Average                                            +12%
Table 4-8 Search Points of FS and SEAIF (16x16 block size only) (Number of Search Points)
Sequence Name       Fast Full Pel Search   SEAIF   Improvement
Table 4-9 Total Encoding Time of Fast Full Pel Search and SEAIF (16x16 block size only) (Seconds)
Sequence Name       Fast Full Pel Search   SEAIF   Improvement
In Section 3.3.3, we modified the spiral search pattern of JM 9.4. The experimental result is presented in Table 4-10. Although the search patterns differ, all search points are examined. Because the search order differs, the best-matched blocks with the same RD cost may not be the same. Therefore, the bitrates differ slightly.
The effects of both patterns are almost the same, so we use the spiral search pattern in JM 9.4 with our proposed SEAIF.
Table 4-10 SEAIF with different spiral search patterns
                 Bitrate (Kbps)                   Total Encoding Time (Sec)
Sequence Name    JM 9.4   Ours     Improvement    JM 9.4   Ours   Improvement
Foreman QCIF     69.203   68.981   -0.3%          157      155    -1.3%
4.5 Early Termination Algorithm
In Table 4-11, Table 4-12, and Table 4-13, our proposed Early Termination Algorithm (ETA) reduces the number of search points (SP) by about 44.5%, and the bitrate is nearly the same as that of FS. However, the encoding time is not reduced as expected. In motion estimation, each search point is evaluated with a matching criterion, usually SAD. The proposed ETA terminates the search process early to reduce the number of SAD computations. In this experiment, however, our ETA is used with the Fast Full Pel Search algorithm³, which calculates all SAD values in advance. Although our ETA skips a large number of search points, it cannot save any SAD computation, so the encoding time is not reduced in this experiment. The proposed ETA should therefore be used with algorithms that do not compute SAD in advance.
Table 4-11 Search Points of FS and ETA (Number of Search Points)
Sequence Name       Fast Full Pel Search   ETA    Improvement
Tempete CIF         2401                   1306   -46%
Stefan SIF          2401                   1350   -44%
Average                                           -44.5%
³ The algorithm is provided in the H.264/AVC reference software JM 9.4.
Table 4-12 Bitrates of FS and ETA (Kbps)
Table 4-13 Total Encoding Time of Fast Full Pel Search and ETA (Seconds)
4.6 Content-Aware Fast Motion Estimation Algorithm (CAFME)
In this section, we integrate the simple dynamic search range (SDSR), successive elimination algorithm with integral frame (SEAIF), and early termination algorithm (ETA) to form the Content-Aware Fast Motion Estimation Algorithm (CAFME).
In Table 4-14, Table 4-15, and Table 4-16, the number of search points is reduced by more than 90% in most of the sequences. In particular, for slow and medium motion, the reduction of search points is about 99%. For high motion, the reduction is lower: 73.8% for the Football sequence. On average, the bitrate increase of CAFME is very small, about 0.26%. The total encoding time is reduced by about 41.9%, and the number of search points is reduced by about 93.1%.
Table 4-14 Search Points of FS and CAFME (Number of Search Points)
Sequence Name       Fast Full Pel Search   CAFME   Improvement
Table 4-15 Bitrates of FS and CAFME (Kbps)
Sequence Name       Fast Full Pel Search   CAFME     Improvement
Foreman QCIF        69.203                 69.118    -0.12%
Mobile QCIF         173.016                173.285   +0.16%
Coastguard QCIF     76.134                 75.862    -0.36%
Foreman CIF         188.773                188.784   +0.005%
Tempete CIF         425.392                425.955   +0.13%
Table 4-16 Total Encoding Time of Fast Full Pel Search and CAFME (Seconds)
Sequence Name       Fast Full Pel Search   CAFME   Improvement
The proposed Simple Dynamic Search Range (SDSR) reduces the number of search points by about 80% while sustaining the coding efficiency (the bitrate increases by 0.11% on average). We also integrate the Successive Elimination Algorithm with Integral Frame (SEAIF) and the Early Termination Algorithm (ETA) with SDSR to form the Content-Aware Fast Motion Estimation Algorithm (CAFME). CAFME improves on SDSR: the number of search points is reduced by 93.1% while the bitrate increases only slightly (0.26%). The overall encoding time is reduced by about 41.9% in our implementation.
Chapter 5
Conclusions and Future Works
Motion estimation plays an important role in video compression. However, the motion estimation module is usually the most computationally intensive part of a typical video encoder.
Hence, an efficient motion estimation algorithm is needed. We proposed a fast algorithm called the Content-Aware Fast Motion Estimation Algorithm (CAFME). CAFME consists of the Simple Dynamic Search Range (SDSR), the Successive Elimination Algorithm with Integral Frame (SEAIF), and the Early Termination Algorithm (ETA). The SDSR adaptively adjusts the search range for every block. The SEAIF reduces the number of SAD computations without loss. The ETA terminates the search process early when a good candidate block is found.
The SDSR does not need any predefined threshold and performs well for all the test sequences. The SEAIF is designed for overlapped variable block sizes and applies reuse techniques. The performance of ETA is good and stable for all kinds of motion activity.
The experimental results show that CAFME reduces the number of search points by about 93.1% while the bitrate increases by only 0.26% at the same PSNR. We modified the H.264/AVC reference software JM 9.4 and implemented our proposed algorithms on it. The total encoding time is reduced by about 41.9%.
The motion search algorithm currently used in CAFME is full search (FS). However, it may be replaced by any fast motion estimation algorithm, such as TSS or DS. Future work may include developing a fast motion estimation algorithm suited to a dynamic search range and reducing the implementation overhead.
Bibliography
[1] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, “Motion Compensated Interframe Coding for Video Conferencing”, Proc. Nat. Telecommun. Conf., pp. G5.3.1–5.3.5, New Orleans, LA, Nov. 29–Dec. 3 1981.
[2] S. Zhu and K.-K. Ma, “A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation”, IEEE Trans. on Image Processing, Volume 9, Issue 2, pp. 287–290, Feb. 2000.
[3] B. Liu and A. Zaccarin, “New Fast Algorithms for the Estimation of Block Motion Vectors”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 3, pp. 148–157, Apr. 1993.
[4] V.-A. Nguyen and Y.-P. Tan, “Fast Block-Based Motion Estimation Using Integral Frames”, IEEE Signal Processing Letters, Volume 11, Issue 9, pp. 744–747, Sep. 2004.
[5] R. Li, B. Zeng, and M.-L. Liou, “A New Three-Step Search Algorithm for Block Motion Estimation”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 4, Issue 4, pp. 438–442, Aug. 1994.
[6] J. Jain and A. Jain, “Displacement Measurement and Its Application in Interframe Image Coding”, IEEE Trans. on Communications, Volume COMM-29, pp. 1799–1808, Dec. 1981.
[7] L.-M. Po and W.-C. Ma, “A Novel Four-Step Search Algorithm for Fast Block Motion Estimation”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 6, Issue 3, pp. 313–317, Jun. 1996.
[8] C.-H. Cheung and L.-M. Po, “A Novel Cross-Diamond Search Algorithm for Fast Block Motion Estimation”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 12, Issue 12, pp. 1168–1177, Dec. 2002.
[9] C.-W. Lam, L.-M. Po, and C.-H. Cheung, “A New Cross-Diamond Search Algorithm for Fast Block Matching Motion Estimation”, 2003 International Conf. on Neural Networks and Signal Processing, Volume 2, pp. 1262–1265, Dec. 14–17 2003.
[10] H. Jia and L. Zhang, “A New Cross Diamond Search Algorithm for Block Motion Estimation”, Proc. of IEEE International Conf. on Acoustics, Speech, and Signal Processing, Volume 3, pp. iii-357-60, May 17–21 2004.
[11] C. Zhu, X. Lin, L. Chau, and L.-M. Po, “Enhanced Hexagonal Search for Fast Block Motion Estimation”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 14, Issue 10, pp. 1210–1214, Oct. 2004.
[12] Y. Nie and K.-K. Ma, “Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation”, IEEE Trans. on Image Processing, Volume 11, Issue 12, pp. 1442–1449, Dec. 2002.
[13] K.-K. Ma and G. Qiu, “Unequal-Arm Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation in the JVT/H.26L”, 2003 International Conf. on Image Processing, Volume 1, pp. I-901-4, Sep. 14–17 2003.
[14] K.-K. Ma and G. Qiu, “An Improved Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation in JVT/H.26L”, Proc. of the 2003 International Symposium on Circuits and Systems, Volume 2, pp. II-708–II-711, May 25–28 2003.
[15] Y.-C. Lim, K.-Y. Min, and J.-W. Chong, “A Pentagonal Fast Block Matching Algorithm for Motion Estimation Using Adaptive Search Range”, IEEE International Conf. on Acoustics, Speech, and Signal Processing, Volume 3, pp. III-669-72, Apr. 6–10 2003.
[16] W. Li and E. Salari, “Successive Elimination Algorithm for Motion Estimation”, IEEE Trans. on Image Processing, Volume 4, Issue 1, pp. 105–107, Jan. 1995.
[17] Digital Video Coding Group, ITU-T Recommendation H.263 Software Implementation, Telenor R&D, 1995.
[18] M. Yang, H. Cui, and K. Tang, “Efficient Tree Structured Motion Estimation Using Successive Elimination”, IEE Proc. on Vision, Image and Signal Processing, Volume 151, Issue 5, pp. 369–377, Oct. 30 2004.
[19] Y.-W. Huang, S.-Y. Chien, B.-Y. Hsieh, and L.-G. Chen, “Global Elimination Algorithm and Architecture Design for Fast Block Matching Motion Estimation”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 14, Issue 6, pp. 898–907, Jun. 2004.
[20] X.Q. Gao, C.J. Duanmu, and C.R. Zou, “A Multilevel Successive Elimination Algorithm for Block Matching Motion Estimation”, IEEE Trans. on Image Processing, Volume 9, Issue 3, pp. 501–504, Mar. 2000.
[21] L.-W. Lee, J.-F. Wang, J.-Y. Lee, and J.-D. Shie, “Dynamic Search-Window Adjustment and Interlaced Search for Block-Matching Algorithm”, IEEE Trans. on Circuits and Systems for Video Technology, Volume 3, Issue 1, pp. 85–87, Feb. 1993.
[22] J. Feng, K.-T. Lo, H. Mehrpour, and A.E. Karbowiak, “Adaptive Block Matching Motion Estimation Algorithm for Video Coding”, IEE Electronics Letters, Volume 31, Issue 18, pp. 1542–1543, Aug. 31 1995.
[23] H.-S. Oh and H.-K. Lee, “Adaptive Adjustment of the Search Window for Block-Matching Algorithm with Variable Block Size”, IEEE Trans. on Consumer Electronics, Volume 44, No. 3, pp. 659–666, Aug. 1998.
[24] L.-K. Liu, “Dynamic Search Range Motion Estimation for Video Coding”, IEEE First Workshop on Multimedia Signal Processing, pp. 207–212, Jun. 23–25 1997.
[25] H.-M. Kim and T. Acharya, “CAS: Context Adaptive Search for Motion Estimation”, Proc. of International Conf. on Information Technology Coding and Computing, pp. 202–206, Apr. 2–4 2001.
[26] J. Minocha and N.-R. Shanbhag, “A Low Power Data-Adaptive Motion Estimation Algorithm”, IEEE 3rd Workshop on Multimedia Signal Processing, pp. 685–690, Sep. 13–15 1999.
[27] S. Saponara and L. Fanucci, “Data-Adaptive Motion Estimation Algorithm and VLSI Architecture Design for Low-Power Video Systems”, IEE Proc. on Computers and Digital Techniques, Volume 151, Issue 1, pp. 51–59, Jan. 15 2004.
[28] P.-I. Hosur, “Motion Adaptive Search for Fast Motion Estimation”, IEEE Trans. on Consumer Electronics, Volume 49, Issue 4, pp. 1330–1340, Nov. 2003.
[29] S.-S. Lin, P.-C. Tseng, C.-P. Lin, and L.-G. Chen, “Multi-Mode Content-Aware Motion Estimation Algorithm for Power-Aware Video Coding Systems”, IEEE Workshop on Signal Processing Systems, pp. 239–244, Oct. 13–15 2004.
[30] K.-P. Lim, G. Sullivan, and T. Wiegand, “Text Description of Joint Model Reference