
Chapter 4 Experiment Results

4.1 Experiment Environment

In this thesis, we implemented the object segmentation algorithm and the bit allocation method by modifying the H.264 reference software JM 9.5 [39]; the unmodified version was used for comparison. All experiments were conducted on a PC with a 2.4 GHz Intel Pentium 4 CPU and 256 MB of RAM.

Our experiments use the following settings:

1) The first frame is intra-coded and all other frames are P-frames.

2) Only the 16 × 16 macroblock type is used.

3) The original version uses multi-resolution motion estimation.

4.2 Experimental Results

We experimented with five sequences: “Football” and “Stefan” in SIF format (352 × 240), and “Foreman”, “Mother and daughter”, and “Hall” in CIF format (352 × 288). The target bit rate was chosen according to the sequence type. Football and Stefan were encoded at 500 kbps because of their high motion. Foreman and Mother and daughter were encoded at 100 kbps because each has one obvious foreground region. Hall was encoded at 50 kbps because it is a static scene with small foreground objects.

| Sequence | Original: AVG PSNR FG (dB) | Original: AVG PSNR BG (dB) | Original: Bitrate (kbps) | Modified: AVG PSNR FG (dB) | Modified: AVG PSNR BG (dB) | Modified: Bitrate (kbps) |
|---|---|---|---|---|---|---|
| Football | 23.73 | 25.52 | 513.33 | 25.73 | 24.44 | 557.58 |
| Stefan | 28.77 | 29.94 | 510.48 | 30.17 | 28.47 | 541.58 |
| Foreman | 27.73 | 27.87 | 111.01 | 28.85 | 26.82 | 114.47 |
| Mother and daughter | 33.53 | 36.48 | 109.23 | 34.47 | 35.66 | 110.48 |
| Hall | 24.94 | 32.30 | 51.37 | 26.05 | 30.99 | 51.26 |

Table 4-1 Encoding results of the five sequences for the original JM version and the modified version
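The foreground gains, background losses, and bit-rate changes discussed in this section can be derived directly from Table 4-1. The following is a minimal sketch of that arithmetic; the dictionary name `TABLE_4_1` and the helper `compare` are illustrative names introduced here, and the numbers are copied from the table.

```python
# Values copied from Table 4-1: (FG PSNR in dB, BG PSNR in dB, bitrate in kbps)
# for the original version, then for the modified version.
TABLE_4_1 = {
    "Football": ((23.73, 25.52, 513.33), (25.73, 24.44, 557.58)),
    "Stefan": ((28.77, 29.94, 510.48), (30.17, 28.47, 541.58)),
    "Foreman": ((27.73, 27.87, 111.01), (28.85, 26.82, 114.47)),
    "Mother and daughter": ((33.53, 36.48, 109.23), (34.47, 35.66, 110.48)),
    "Hall": ((24.94, 32.30, 51.37), (26.05, 30.99, 51.26)),
}


def compare(original, modified):
    """Return (FG gain in dB, BG loss in dB, bitrate change in percent)."""
    fg_gain = modified[0] - original[0]
    bg_loss = original[1] - modified[1]
    rate_change = 100.0 * (modified[2] - original[2]) / original[2]
    return round(fg_gain, 2), round(bg_loss, 2), round(rate_change, 2)


for name, (original, modified) in TABLE_4_1.items():
    fg_gain, bg_loss, rate_change = compare(original, modified)
    print(f"{name:20s} FG +{fg_gain:.2f} dB  BG -{bg_loss:.2f} dB  rate {rate_change:+.2f}%")
```

Note that the bit-rate change is expressed as a percentage of the original bit rate, so a ratio of 0.03 corresponds to 3%.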

First, we consider a sequence with a single obvious object, using the CIF sequence “Foreman” as an example. We encoded the sequence at 100 kbps and compared the original version, which has no object-based bit allocation, with the modified version that uses our proposed method. Fig. 4-1(a) shows that our method improves the quality of the foreground object region.

In terms of average quality, Table 4-1 shows that the average PSNR of the foreground improved by 1.12 dB, whereas the background quality was degraded by 1.05 dB. Finally, comparing the two encoded images in Fig. 4-2, we can clearly see that the quality of the facial region is much improved, while the bit rate increases by only about 3.1% (from 111.01 to 114.47 kbps).
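The foreground and background PSNR values reported here are computed separately for each region. The thesis text in this chapter does not spell out the exact procedure, so the following is only a sketch under stated assumptions: PSNR is measured on 8-bit luma samples, and a binary mask (1 for foreground, 0 for background) marks the segmented region. The function name `region_psnr` and the tiny sample data are hypothetical.

```python
import math


def region_psnr(original, decoded, mask, region, peak=255):
    """PSNR in dB over the pixels whose mask value equals `region`.

    original, decoded: 2-D lists of luma samples in the range 0..peak.
    mask: 2-D list with 1 for foreground pixels and 0 for background
          (an assumed representation of the segmentation result).
    Returns None if the region is empty and math.inf if it is reproduced exactly.
    """
    squared_error = 0
    count = 0
    for row_o, row_d, row_m in zip(original, decoded, mask):
        for o, d, m in zip(row_o, row_d, row_m):
            if m == region:
                squared_error += (o - d) ** 2
                count += 1
    if count == 0:
        return None
    if squared_error == 0:
        return math.inf
    mse = squared_error / count
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical 2 x 2 frame: top row is foreground, bottom row is background.
original = [[100, 100], [50, 50]]
decoded = [[102, 98], [50, 55]]
mask = [[1, 1], [0, 0]]
fg_psnr = region_psnr(original, decoded, mask, region=1)
bg_psnr = region_psnr(original, decoded, mask, region=0)
print(f"FG {fg_psnr:.2f} dB, BG {bg_psnr:.2f} dB")
```

Returning `None` for an empty region keeps "no foreground in this frame" distinct from a genuinely low PSNR value.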

[Fig. 4-1 plots, x-axis: P-frame number (1 to 191). (a) PSNR (dB): M_PSNR_FG vs. Org_PSNR_FG. (b) PSNR (dB): M_PSNR_BG vs. Org_PSNR_BG. (c) QP value: QP_FG, QP_BG, and Org_Avg_QP.]

Fig. 4-1 Foreman sequence encoded by original JM software and modified version: (a) Average PSNR of foreground region in original and modified version, (b) Average PSNR of background region in original and modified version, (c) Average QP value of foreground/background in modified version and average QP value in original version.

Fig. 4-2 Results for the Foreman sequence: (a) segmented result, (b) original version, and (c) modified version of JM.

Second, we consider a sequence with a static scene and small foreground objects, using the CIF sequence “Hall”, encoded at 50 kbps. Fig. 4-3(a) shows that, with our method, the PSNR of the foreground region is still lower than that of the background. However, Fig. 4-3(b) shows that in the original JM the foreground PSNR is already lower than the background PSNR. Fig. 4-3(c), (d) and Table 4-1 show that we improved the quality of the foreground region by 1.11 dB, whereas the background quality was degraded by 1.31 dB.

In Fig. 4-3(a), (b), and (c), a foreground PSNR of zero means that no foreground object was segmented in that frame. In frames 1 to 16 there are indeed no objects in the scene. In other frames, objects are not segmented because the frame is too similar to the previous one, so no motion information is available in the object regions. The subjective quality is shown in Fig. 4-4.
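Because frames without a segmented foreground are plotted with a PSNR of zero, the way such frames are treated affects any average taken over the sequence. The text does not state which convention was used for the averages in Table 4-1, so the sketch below simply contrasts the two possibilities on a hypothetical per-frame series; `average_psnr` is an illustrative name.

```python
def average_psnr(per_frame, skip_empty=True):
    """Average a per-frame PSNR series in which 0.0 marks 'no foreground segmented'.

    skip_empty=True  averages only frames that contain a foreground region.
    skip_empty=False treats the 0.0 placeholders as real measurements.
    """
    if skip_empty:
        values = [p for p in per_frame if p > 0.0]
    else:
        values = list(per_frame)
    return sum(values) / len(values) if values else 0.0


# Hypothetical series: frames 1, 2, and 5 have no segmented foreground.
series = [0.0, 0.0, 26.0, 25.0, 0.0, 27.0]
print(average_psnr(series, skip_empty=True))
print(average_psnr(series, skip_empty=False))
```

Including the placeholder zeros pulls the average down sharply, so the chosen convention should be reported alongside any averaged foreground PSNR.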

[Fig. 4-3 plots, x-axis: P-frame number (1 to 191), y-axis: PSNR (dB). (a) M_PSNR_FG vs. M_PSNR_BG. (b) Org_PSNR_FG vs. Org_PSNR_BG. (c) M_PSNR_FG vs. Org_PSNR_FG. (d) M_PSNR_BG vs. Org_PSNR_BG.]

Fig. 4-3 Hall sequence encoded by original JM software and modified version: (a) PSNR of foreground/background region of modified version, (b) PSNR of foreground/background region of original version, (c) PSNR of foreground region in original and modified version, (d) PSNR of background region in original and modified version.

Fig. 4-4 Results for the Hall sequence: (a) segmented result, (b) original version, and (c) modified version of JM.

Third, we consider a sequence with high motion and multiple objects, using the SIF sequence “Football”. Our method improves the quality of the foreground objects, as shown in Fig. 4-5(a). Because the motion in “Football” is complex, the number of bits allocated to each frame is less stable, so the foreground quality of the modified version fluctuates considerably from frame to frame; it nevertheless remains better than that of the original version. Table 4-1 shows that the foreground quality improved by 2.00 dB, whereas the background quality was degraded by 1.08 dB. The bit rate of our method increases by about 8.6% (from 513.33 to 557.58 kbps) because of the complex foreground of the sequence. The subjective quality is shown in Fig. 4-6.

[Fig. 4-5 plots, x-axis: P-frame number (1 to 118), y-axis: PSNR (dB). (a) M_PSNR_FG vs. Org_PSNR_FG. (b) M_PSNR_BG vs. Org_PSNR_BG.]

Fig. 4-5 Football sequence encoded by original JM software and modified version: (a) PSNR of foreground region in original and modified version, (b) PSNR of background region in original and modified version.

These results show that our method improves the quality of moving foreground objects at the cost of some background quality. In scenes with high motion and complex objects, it also raises the bit rate somewhat in exchange for better quality of the moving objects. For sequences with one obvious object, or static scenes with small objects, our method also works well.

Fig. 4-6 Results for the Football sequence: (a) segmented result, (b) original version, and (c) modified version of JM.

Chapter 5 Conclusion and Future Work

In this thesis, we have presented a motion-based object segmentation algorithm and an object-based rate control scheme. To improve the quality of the regions that viewers are interested in, relative to the background and within a limited bit rate, the object segmentation part first segments foreground objects with similar motion activity. Once the objects have been segmented, their characteristics, such as size and motion activity, are used to measure their importance. The rate control scheme, which integrates the feature-based bit allocation, then distributes bits to the regions according to their importance.
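The idea of distributing bits according to region importance can be illustrated with a simple proportional split. This is only a sketch: the importance measure used in the thesis is defined in earlier chapters and is not reproduced here, so the score `size * motion_activity`, the function name `allocate_bits`, and the sample numbers are all placeholders chosen for illustration.

```python
def allocate_bits(frame_budget, regions):
    """Split frame_budget (in bits) among regions in proportion to an importance score.

    regions: dict mapping region name -> (size_in_macroblocks, motion_activity).
    The score size * motion_activity is a placeholder, not the thesis's measure.
    """
    if not regions:
        return {}
    scores = {name: size * activity for name, (size, activity) in regions.items()}
    total = sum(scores.values())
    if total == 0:
        # No region stands out: fall back to an equal split.
        share = frame_budget // len(regions)
        return {name: share for name in regions}
    bits = {name: int(frame_budget * score / total) for name, score in scores.items()}
    # Bits lost to integer truncation go to the most important region.
    top = max(scores, key=scores.get)
    bits[top] += frame_budget - sum(bits.values())
    return bits


# Hypothetical CIF frame (396 macroblocks) with a small, active foreground.
regions = {"foreground": (60, 4.0), "background": (336, 0.5)}
allocation = allocate_bits(4000, regions)
print(allocation)
```

With these sample numbers the foreground covers about 15% of the frame but receives more than half of the frame budget, which is the qualitative behavior the scheme aims for.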

To improve the performance and robustness of the system, several enhancements are possible:

• Improving the segmentation algorithm so that it is more robust to lighting variation and complex scenes by using other features, such as color information.

• Taking the human visual system into account, the bits allocated to the background region can be distributed perceptually according to the distance from the foreground region.

• The rate control scheme at the frame level can be improved by considering the different complexities of the foreground and background.
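As a rough illustration of the second enhancement, background macroblocks could be weighted by their distance to the nearest foreground macroblock. The thesis proposes this only as future work and gives no formula, so everything below is an assumption: the Chebyshev distance, the falloff `1 / (1 + falloff * (d - 1))`, and the name `background_weights` are hypothetical choices.

```python
def background_weights(fg_mask, falloff=0.5):
    """Weight per macroblock, decaying with distance to the nearest foreground block.

    fg_mask: 2-D list of 0/1 flags per macroblock (1 = foreground).
    Foreground blocks get weight 0.0 because they are budgeted separately.
    If there is no foreground at all, every background block gets weight 1.0.
    """
    fg = [(r, c) for r, row in enumerate(fg_mask) for c, v in enumerate(row) if v]
    weights = []
    for r, row in enumerate(fg_mask):
        out = []
        for c, v in enumerate(row):
            if v:
                out.append(0.0)
            elif not fg:
                out.append(1.0)
            else:
                # Chebyshev distance (in macroblocks) to the nearest foreground block.
                d = min(max(abs(r - fr), abs(c - fc)) for fr, fc in fg)
                out.append(1.0 / (1.0 + falloff * (d - 1)))
        weights.append(out)
    return weights


# Hypothetical row of four macroblocks with the foreground at the left edge.
row_weights = background_weights([[1, 0, 0, 0]])
print(row_weights)
```

The resulting weights could then scale the bits given to each background macroblock, so that blocks adjacent to an object keep more quality than distant ones.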

Achieving better foreground quality than background quality within a limited transmission rate is an important research topic in video coding. We constructed the object segmentation and bit allocation scheme for this purpose. With future enhancements, the scheme can deliver better visual quality for the human eye over a rate-limited channel.

References

[1] “Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC)”, in Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG Document, JVT-G050, March 2003.

[2] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 2003.

[3] T. Wiegand, G. J. Sullivan, G. Bjontegaard, A. Luthra, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, Issue 7, pp. 560-576, July 2003.

[4] T. Aach, A. Kaup and R. Mester, “Statistical Model-Based Change Detection in Moving Video”, Signal Processing, Vol.31, No. 2, pp.203-217, 1993.

[5] A. Neri, S. Colonnese, G. Russo, and P. Talone, “Automatic moving object and background separation”, Signal Processing, Vol.66, pp.219-232, 1998.

[6] D. D. Giusto, F. Massidda, and C. Perra, “A Fast Algorithm for Video Segmentation and Object Tracking”, International Conference on Digital Signal Processing, Vol. 2, pp. 697-700, 2002.

[7] Shao-Yi Chien, Shyh-Yih Ma, and Liang-Gee Chen, “Efficient Moving Object Segmentation Algorithm Using Background Registration Technique”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 7, pp. 577-586, 2002.

[8] Jinhui Pan, Chia-Wen Lin, Chuang Gu, and Ming-Ting Sun, “A Robust Video Object Segmentation Scheme with Pre-stored Background Information”, IEEE International Symposium on Circuits and Systems, Vol. 3, pp. 803-806, 2002.

[9] Yaakov Tsaig and Amir Averbuch, “Automatic Segmentation of Moving Objects in Video Sequence: A Region Labeling Approach”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 7, pp. 597-612, 2002.

[10] J. C. Choi, S.-W. Lee, and S.-D. Kim, “Spatio-Temporal Video Segmentation Using a Joint Similarity Measure”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 7, No. 2, pp. 279-286, 1997.

[11] D. Wang, “Unsupervised Video Segmentation Based on Watersheds and Temporal Tracking”, IEEE Transactions on Circuits and Systems for Video Technology, Vol.8, No. 5, pp. 539 – 546, 1998.

[12] Hieu T. Nguyen, Marcel Worring, and Anuj Dev, “Detection of Moving Objects in Video Using a Robust Similarity Measure”, IEEE Transactions on Image Processing, Vol.9, No. 1, pp.137 – 141, 2000.

[13] M. L. Jamrozik and M. H. Hayes, “A Compressed Domain Video Object Segmentation System”, Proceedings of 2002 International Conference on Image Processing, Vol. 1, pp. 113-116, Sept. 2002.

[14] G. Agarwal, A. Anbu and A. Sinha, “A Fast Algorithm To Find The Region-Of-Interest In The Compressed MPEG Domain”, Proceedings of 2003 International Conference on Multimedia and Expo, Vol. 2, pp. 133-136, July 2003.

[15] A. Anbu, G. Agarwal and G. Srivastava, “A Fast Object Detection Algorithm Using Motion-Based Region-Of-Interest Determination”, 14th International Conference on Digital Signal Processing, Vol. 2, pp. 1105-1108, July 2002.

[16] Hui-Ping Kuo, “Object-based Video Tracking and Abstraction on Surveillance videos”, NCTU CSIE, June 2004.

[17] Yi-Wen Chen, Duan-Yu Chen and Suh-Yin Lee, “Moving Object Tracking for video Surveillance in Compressed Videos”, in The 7th International Conference on Internet and Multimedia Applications and Systems, pp. 695-698, Aug. 2003.

MPEG-4 Video”, IEEE Transactions on Circuits and System for Video Technology, Vol. 10, Issue 6, pp. 878-894, Sept. 2000

[19] A. Vetro, Huifang Sun and Yao Wang, “MPEG-4 rate control for multiple video objects”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9, Issue 1, pp. 186-199, Feb. 1999.

[20] J. Ribas-Corbera and S. Lei, “Rate control in DCT video coding for low-delay communications”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9, Issue 1, pp. 172-185, Feb. 1999.

[21] MPEG-2 Test Model 5, Doc. ISO/IEC JTC1/SC29 WG11/93-400, Apr. 1993.

[22] Z. G. Li, X. Lin, C. Zhu and F. Pan, “A Novel Rate Control Scheme for Video Over the Internet”, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing 2002, Vol. 2, pp. 2065-2068, May 2002.

[23] S. W. Ma, W. Gao, Y. Lu and H. Q. Lu, “Proposed draft description of rate control on JVT standard”, JVT-F086, 6th meeting, Awaji, Japan, Dec. 2002.

[24] S. W. Ma, W. Gao, P. Gao and Y. Lu, “Rate Control for Joint Video Team (JVT) Standard”, JVT-D030, in 4th meeting: Klagenfurt, July 2002.

[25] Z. G. Li, F. Pan, K. P. Lim, G. N. Feng, X. Lin and R. Susanto, “Adaptive basic unit layer rate control for JVT”, JVT-G012, in 7th meeting: Pattaya, March 2003.

[26] Z. G. Li, W. Gao, F. Pan, S. W. Ma, K. P. Lim, G. N. Feng, X. Lin, R. Susanto, Y. Lu and H. Q. Lu, “Adaptive rate control with HRD consideration”, JVT-H014, in 8th meeting: Geneva, May 2003.

[27] Z. G. Li, F. Pan, K. P. Lim, X. Lin and S. Rahardja, “Adaptive Rate Control For H.264”, 2004 IEEE International Conference on Image Processing (ICIP), Vol. 2, pp. 745-748, Oct. 2004.

[28] F. Pan, Z. Li, K. Lim, and G. Feng, “A study of MPEG-4 rate control scheme

Technology, Vol. 13, Issue 5, pp. 440-446, May 2003.

[29] K. Ngan, T. Meier and Z. Chen, “Improved Single-Video-Object Rate Control for MPEG-4”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, Issue 5, pp. 385-393, May 2003.

[30] M. Jiang, X. Li and N. Ling, “Improved Frame-Layer Rate Control For H.264 Using MAD Ratio”, Proceedings of the 2004 International Symposium on Circuits and Systems, Vol. 3, pp. 813-816, May 2004.

[31] X. Yi and N. Ling, “Rate Control Using Enhanced Frame Complexity Measure For H.264 Video”, 2004 IEEE Workshop on Signal Processing Systems, pp. 263-268, Oct. 2004.

[32] H. Yu, F. Pen and Z. Lin, “A New Bit Estimation Scheme for H.264 Rate Control”, 2004 IEEE International Symposium on Consumer Electronics, pp. 396-399, Sept. 2004.

[33] Chun-Huang Lin and Ja-Ling Wu, “Content-Based Rate Control Scheme for Very Low Bit-Rate Video Coding”, IEEE Transactions on Consumer Electronics, Vol. 43, No. 2, May 1997.

[34] S. Aramvith, H. Kortrakulkij, et al., “Joint Source-Channel Coding using Simplified Block-Based Segmentation and Content-based Rate-Control for wireless Video Transport”, Proceedings of International Conference on Information Technology: Coding and Computing 2002, Las Vegas, pp. 71-76, April 2002.

[35] W. Lai, X. D. Gu, R. H. Wang, W. Y. Ma and H. J. Zhang, “A Content-based Bit Allocation Model for Video Streaming”, 2004 IEEE International Conference on Multimedia and Expo (ICME), Vol. 2, pp. 1315-1318. June 2004.

[36] Y. Sun, D. Li, I. Ahmad and J. Luo, “A Rate Control Algorithm for Wireless

Information Technology: Coding and Computing (ITCC), Vol. 1, pp.109-114, April 2005.

[37] D. Chai, K. N. Ngan and A. Bouzerdoum, “Foreground/Background Bit Allocation For Region-Of-Interest Coding”, Proceedings of 2000 International Conference on Image Processing, Vol. 2, pp. 923-926, Sept. 2000.

[38] S. Sengupta, S. K. Gupta and J. M. Hannah, “Perceptually Motivated Bit Allocation for H.264 Encoded Video Sequences”, 2003 International Conference on Image Processing, Vol. 2, pp. 797-800, Sept. 2003.

[39] H.264 reference software JM 9.5, http://iphome.hhi.de/suehring/tml/, 2005.

[40] Chi-Tsong Chen, Linear System Theory and Design, Holt, Rinehart and Winston, New York, 1984.
