CHAPTER 2 THE PROPOSED METHOD
2.4 The proposed algorithm
2.4.5 The overall algorithm
Fig. 7 shows the flowchart of the overall proposed algorithm. First, we apply still area detection to decide whether an MB is located in a still area. If the MB satisfies the rules, its best mode is set to 16x16, or to one of 16x16, 16x8, and 8x16, depending on which condition holds, and the computation for the other modes is skipped. Otherwise, we perform ME using the 16x16 block-size mode to create the residual block. If the number of white points in the residual block is less than the threshold Tnum, we set the best mode to 16x16 and skip the other modes. Otherwise, we split the MB into four 8x8 blocks, perform ME using the 8x8 block-size mode for each of them, and enter the third step. In the third step, we check which case the MB satisfies. If one of the first three cases is satisfied, the mode is decided.
Otherwise, we enter the final step. Each 8x8 block is split into four 4x4 blocks, and ME using the 4x4 mode gives the motion vectors of the 4x4 blocks. We then check four cases to see which one is satisfied, and the best mode for the 8x8 block is decided accordingly.
For convenience, the mode obtained through the above process is called P8x8.
After the mode decision for each 8x8 block, we sum the prediction errors of the four 8x8 blocks. Then we perform ME using the 16x8 and 8x16 modes, and select the mode with the minimum prediction error among the 16x16, 16x8, 8x16, and P8x8 modes as the best mode for the MB.
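The control flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the JM8.2 implementation: the function name decide_mb_mode and every callable passed to it are hypothetical placeholders for the rules and ME routines defined in the preceding sections, and each hook is assumed to return either a decided mode or None.

```python
from typing import Callable, Optional


def decide_mb_mode(
    still_area_mode: Callable[[], Optional[str]],
    me_cost: Callable[[str], float],
    white_points_16x16: Callable[[], int],
    mv_relation_mode: Callable[[], Optional[str]],
    p8x8_cost: Callable[[], float],
    t_num: int = 7,
) -> str:
    """Return the best mode for one MB (all hooks are hypothetical)."""
    # Step 1: still area detection. The hook is assumed to return the
    # decided mode, or None when the still-area rules are not satisfied.
    mode = still_area_mode()
    if mode is not None:
        return mode

    # Step 2: ME with the 16x16 mode, then residual image judgment.
    # The text says "less than Tnum"; the flowchart label uses "<=".
    costs = {"16x16": me_cost("16x16")}
    if white_points_16x16() < t_num:
        return "16x16"

    # Step 3: 8x8 ME and the case check. The hook is assumed to return a
    # mode when one of the first three cases holds, otherwise None.
    mode = mv_relation_mode()
    if mode is not None:
        return mode

    # Final step: the hook is assumed to run the 4x4 analysis on each 8x8
    # block and return the summed prediction error of the four blocks.
    costs["P8x8"] = p8x8_cost()
    costs["16x8"] = me_cost("16x8")
    costs["8x16"] = me_cost("8x16")
    return min(costs, key=costs.get)
```

Passing the rules as callables keeps the sketch independent of how each rule is actually defined; only the order of the early exits and the final minimum-cost comparison are taken from the description above.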
[Flowchart image not reproduced. Box labels recovered from the figure: “MAD(MB) ≤ T1?”; “ME using 16x16 mode”; “WP NUM ≤ Tnum?”; “Satisfy the MV relation rule?”; “MAD(8x8) ≤ T1?”; “ME using 4x4 block-size”; “ME using 8x4, 4x8 modes and select the best one to be mode of 8x8 block”; “All four 8x8 blocks are checked?”; “ME using 16x8, 8x16 modes”; “Comparing with the costs of 16x16, 16x8, 8x16 and/or P8x8 to find the best mode which has min cost”.]
Fig. 7 The flowchart of the overall proposed algorithm.
CHAPTER 3 EXPERIMENTAL RESULTS
In this chapter, we present the simulation results obtained by implementing the proposed algorithm for inter-frame mode decision in H.264, and compare them with the methods surveyed in this thesis.
We use three factors for the comparison: time-saving rate, PSNR, and bit-rate. The simulator is based on the Joint Model version 8.2 (JM8.2) [7] encoder provided by the JVT.
In the following tables and figures, “Method 1” means that only the first step is used to reduce the candidate modes; for those MBs whose modes are not decided in the first step, the exhaustive search is applied. “Method 1+2” means that the first two steps are used to reduce the candidate modes, and “Method 1+2+3” means that the first three steps are used.
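The three comparison factors can be computed as in the sketch below. The exact definitions are not stated in this chapter, so the formulas here are assumptions based on common practice: each fast method is measured relative to the exhaustive-search encoder as reference, and PSNR is computed for 8-bit samples. All function names are illustrative.

```python
import math


def psnr_db(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB for a given mean squared error (assumes 8-bit samples)."""
    return 10.0 * math.log10(peak * peak / mse)


def time_saving_pct(t_ref: float, t_fast: float) -> float:
    """Reduction in encoding time relative to the reference encoder, in %."""
    return 100.0 * (t_ref - t_fast) / t_ref


def bitrate_increase_pct(b_ref: float, b_fast: float) -> float:
    """Increase in bit-rate relative to the reference encoder, in %."""
    return 100.0 * (b_fast - b_ref) / b_ref


def psnr_drop_db(psnr_ref: float, psnr_fast: float) -> float:
    """Drop in PSNR (dB); a negative value means the fast method is better."""
    return psnr_ref - psnr_fast


# Made-up numbers for illustration only (not measurements from this thesis).
print(time_saving_pct(t_ref=200.0, t_fast=100.0))
print(bitrate_increase_pct(b_ref=1000.0, b_fast=1015.0))
```

Under these definitions, a negative PSNR entry in the result tables corresponds to a slight quality gain and a negative bit-rate entry to a slight bit-rate reduction.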
3.1 Comparison with Yu’s algorithm [4]
Yu proposed a fast approach for inter mode decision based on homogeneous area detection. The block-size modes are classified into three categories, {16x16}, {16x16, 16x8, 8x16}, and {all possible modes}, corresponding to different levels of homogeneity. The speed-up rate of encoding time lies between 17% and 32% for each video sequence, and the bit-rate increases by about 3.15% on average. As shown in Fig. 8, our proposed algorithm surpasses the result of Yu’s method and even doubles the encoding efficiency in time, with only a small increase in bit-rate.
Fig. 8 Comparison between our method and Yu’s. (a) Comparison in “Speed-Up Rate.” (b) Comparison in “Increasing Ratio of Bit-rate.”
3.2 Comparison with Wu’s algorithm [5]
Wu’s algorithm uses only the information of homogeneous and still areas to reduce the candidate modes. For some video sequences, such as “News”, it can indeed achieve a higher time-saving rate and a lower bit-rate increase. However, for video types without homogeneous or still areas, such as the “Mobile” and “Stefan” sequences, Wu’s algorithm cannot achieve good results. It also does not work well for video sequences with camera motion or steady-motion objects, such as the “Foreman” sequence. By contrast, for video with camera motion or steady-motion objects, we can use residual image judgment to reduce the candidate modes. The motion-vector relation also provides a good feature for candidate mode reduction for all video types. The comparison results are shown in Fig. 9. In time-saving rate, the result of our algorithm is at least 21% (Mobile sequence) and exceeds the result of Wu’s algorithm by 9.97% (Mobile sequence).
Fig. 9 Comparison between our method and Wu’s. (a) Comparison in “Speed-Up Rate.” (b) Comparison in “Increasing Ratio of Bit-rate.”
3.3 Comparison with Jing-Chau’s algorithm [6]
The method proposed by Jing and Chau uses the MAD to determine whether an MB is homogeneous. Like the previously mentioned methods [4, 5], the algorithm cannot achieve good results for some video types. Fig. 10 shows the comparison results. The time-saving rate of our proposed scheme exceeds 35% for all examined sequences.
Fig. 10 Comparison between our method and Jing’s. (a) Comparison in “Speed-Up Rate.” (b) Comparison in “Increasing Ratio of Bit-rate.”
3.4 Summary and analysis
The proposed algorithm consists of three major concepts, each used to reduce the candidate modes. The simulation results show that the proposed method gives a noticeable improvement in encoding time saving. The performance of each step is evidently influenced by the video type. For example, in several video sequences, such as “Silent” and “News”, the frames share the same background, and the person in the scene is almost still except for slight motion of the face and body. The first step, still area detection, is useful in this condition and responds well in coding efficiency. For sequences with simple camera motion or steady movement of objects, such as the “Foreman” and “Stefan” sequences, the second step, residual image judgment, responds better. Finally, the third and final step, using motion-vector relation analysis, is suitable for the majority of video types.
In summary, the proposed algorithm improves the time-saving rate by about 50% on average with a small increase in bit-rate. If the bit-rate increase is the greater concern, we can use only the first few steps (1, 1+2, or 1+2+3), which affect the bit-rate only slightly, instead of the whole algorithm, and still retain an improvement in encoding time. Conversely, if reducing time complexity is the main goal, the whole proposed algorithm is the best choice for improving the encoding efficiency of the H.264 encoder.
Table 2 lists the simulation parameters used in the experiments; parameters not mentioned here follow the settings of the main profile provided by the JM8.2 encoder. The “IPPP” structure denotes that only the first frame of the encoded sequence is an intra frame and all remaining frames are inter frames (P-frames); no B-frames are used. In our proposed algorithm, we set Tstill=4, Twp=9, and Tnum=7.
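To make the role of the three thresholds concrete, the sketch below shows one plausible reading of the first two tests. The precise rules are defined earlier in the thesis and are not repeated here, so the following are assumptions: a white point is a residual sample whose absolute value exceeds Twp, and an MB is treated as still when its MAD against the co-located block of the previous frame does not exceed Tstill. All function names are hypothetical.

```python
from typing import Sequence

Block = Sequence[Sequence[int]]


def mad(cur: Block, ref: Block) -> float:
    """Mean absolute difference between two equally sized blocks."""
    total = 0
    count = 0
    for row_c, row_r in zip(cur, ref):
        for c, r in zip(row_c, row_r):
            total += abs(c - r)
            count += 1
    return total / count


def is_still(cur: Block, prev: Block, t_still: int = 4) -> bool:
    """Assumed still-area test: MAD against the co-located block <= Tstill."""
    return mad(cur, prev) <= t_still


def white_point_count(residual: Block, t_wp: int = 9) -> int:
    """Assumed white-point definition: |residual sample| > Twp."""
    return sum(1 for row in residual for v in row if abs(v) > t_wp)


def residual_is_flat(residual: Block, t_wp: int = 9, t_num: int = 7) -> bool:
    """Residual image judgment: fewer than Tnum white points selects 16x16."""
    return white_point_count(residual, t_wp) < t_num


# A 16x16 residual with six large samples passes the test with Tnum=7.
residual = [[0] * 16 for _ in range(16)]
for i in range(6):
    residual[0][i] = 12
print(white_point_count(residual), residual_is_flat(residual))
```

With the settings above, a residual block passes the judgment only if at most six of its 256 samples exceed the white-point threshold.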
Table 2. Environmental parameters
Parameter Name             Value
Frame Rate                 30 frames/sec
Quantization Parameter     28
Search Range               ±16
Reference Frame Number     5
RD-Optimization            Disable
Entropy Coding             CABAC
MV resolution              1/4
GOP type                   IPPP
Tables 3 and 4 list the simulation results of the proposed method at each step and of the whole proposed algorithm for several video sequences. These tables confirm the observations described above.
Table 3. Simulation results: reduction in encoding time, drop in PSNR, and increase in bit-rate. Method 1 adopts only the first step; Method 1+2 adopts the first and second steps.
                      Method 1                            Method 1+2
Sequence              Time(%)  PSNR(dB)  Bit-rate(%)      Time(%)  PSNR(dB)  Bit-rate(%)
Foreman (Qcif)        20.95    0.0       0.47             39.29    0.02      0.50
Table 4. Simulation results: reduction in encoding time, drop in PSNR, and increase in bit-rate. Method 1+2+3 adopts the first, second, and third steps.
                      Method 1+2+3                        Proposed Algorithm
Sequence              Time(%)  PSNR(dB)  Bit-rate(%)      Time(%)  PSNR(dB)  Bit-rate(%)
Foreman (Qcif)        40.48    0.01      1.68             44.65    0.03      1.92
Carphone (Qcif)       46.49    0.06      1.08             49.35    0.04      1.15
News (Qcif)           54.34    0.02      1.33             56.72    0.02      2.30
Container (Qcif)      58.12    0.01      -0.05            61.69    0         0.58
M’s America (Qcif)    64.83    -0.01     0.54             65.44    0.01      0.80
Paris (Cif)           47.32    0.01      1.32             50.42    0.02      2.25
Akiyo (Cif)           69.18    0.01      1.07             69.69    0.03      1.46
Mobile (Cif)          15.70    -0.01     0.76             25.04    0         1.24
Stefan (Cif)          27.53    0.01      0.94             32.35    0.02      1.52
Table (Qcif)          45.82    0.02      1.21             49.21    0.03      1.90
Average               46.98    0.013     0.99             50.46    0.020     1.51
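As a consistency check, the Average row of Table 4 can be recomputed from the ten per-sequence rows. The short script below does this; the values are copied from the table, while the variable and function names are only illustrative.

```python
# Rows copied from Table 4. Each method entry is
# (time saving %, PSNR drop dB, bit-rate increase %).
TABLE4 = [
    # sequence,            Method 1+2+3,         Proposed Algorithm
    ("Foreman (Qcif)",     (40.48, 0.01, 1.68),  (44.65, 0.03, 1.92)),
    ("Carphone (Qcif)",    (46.49, 0.06, 1.08),  (49.35, 0.04, 1.15)),
    ("News (Qcif)",        (54.34, 0.02, 1.33),  (56.72, 0.02, 2.30)),
    ("Container (Qcif)",   (58.12, 0.01, -0.05), (61.69, 0.00, 0.58)),
    ("M's America (Qcif)", (64.83, -0.01, 0.54), (65.44, 0.01, 0.80)),
    ("Paris (Cif)",        (47.32, 0.01, 1.32),  (50.42, 0.02, 2.25)),
    ("Akiyo (Cif)",        (69.18, 0.01, 1.07),  (69.69, 0.03, 1.46)),
    ("Mobile (Cif)",       (15.70, -0.01, 0.76), (25.04, 0.00, 1.24)),
    ("Stefan (Cif)",       (27.53, 0.01, 0.94),  (32.35, 0.02, 1.52)),
    ("Table (Qcif)",       (45.82, 0.02, 1.21),  (49.21, 0.03, 1.90)),
]


def column_means(rows, method_index):
    """Mean of (time, PSNR drop, bit-rate increase) for one method column."""
    sums = [0.0, 0.0, 0.0]
    for row in rows:
        for i, value in enumerate(row[method_index]):
            sums[i] += value
    return tuple(s / len(rows) for s in sums)


m123 = column_means(TABLE4, 1)   # Method 1+2+3
full = column_means(TABLE4, 2)   # Proposed Algorithm
print("Method 1+2+3: %.2f %.3f %.2f" % m123)
print("Proposed:     %.2f %.3f %.2f" % full)
```

The recomputed means agree with the Average row: 46.98, 0.013, and 0.99 for Method 1+2+3, and 50.46, 0.020, and 1.51 for the whole proposed algorithm.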
CHAPTER 4 CONCLUSIONS
In this thesis, we have proposed a new algorithm to speed up VBS selection in the ME part of the H.264 encoder. It consists of three major techniques: still area detection, residual image judgment, and analysis of the motion-vector relation. With these three techniques, the proposed algorithm is suitable for the majority of video types. The experimental results show that an efficient use of these techniques, with simple criteria or rules at both the MB level and the sub-MB level, can reduce the number of candidate modes and yield a significant improvement in time saving. The proposed algorithm achieves an encoding time saving of about 50.46% on average while keeping similar quality and bit-rate.
REFERENCES
[1] “Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC),” Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, 2003.
[2] T. Wiegand, G. J. Sullivan, G. Bjontegaard, A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Trans. Circuits Syst. Video Technol., Vol. 13, pp. 560–570, July 2003.
[3] I. E. G. Richardson, “H.264/MPEG-4 Part 10 white paper: Prediction of inter macroblock in P-slices,” http://www.vcodex.com, Spring 2003.
[4] A. C. Yu, “Efficient block-size selection algorithm for inter-frame coding in H.264/MPEG-4 AVC,” Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04), Vol. 3, pp. iii-169-172, 17-21 May 2004.
[5] D. Wu, S. Wu, K. P. Lim, F. Pan, Z. G. Li, X. Lin, “Block inter mode decision for fast encoding of H.264,” Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04), Vol. 3, pp. iii-181-184, 17-21 May 2004.
[6] X. Jing, L.-P. Chau, “Fast approach for H.264 inter mode decision,” IEE Electronics Letters, Vol. 40, Issue 17, pp. 1050-1052, 19 Aug. 2004.
[7] Joint Video Team (JVT), Reference Software “Joint Model Version 8.2,” http://iphome.hhi.de/suehring/tml/download/old_jm/jm82.zip.