
Chapter 5 Rate Control Algorithm Based on HVS

5.5 Discussion

The proposed rate control algorithm provides better visual quality, especially when the test frames contain a large flat region, such as the ocean in test frame I. However, the visual quality of edges sometimes becomes worse. The reason is that the visual weighting assigned to high spatial frequencies is smaller than it should be.

Because we use the “human visual weighting error” instead of the “quantization error” for rate control, the PSNR becomes smaller. This shows that a frame with a higher PSNR does not necessarily have higher visual quality. The weighting factor enlarges the relative difference between the R-D slopes of the truncation points in the MSB bitplanes and those of the truncation points in the LSB bitplanes. Thus, we need a higher bit rate to package the same data.
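The effect on the R-D slopes can be illustrated with a minimal sketch. The numbers, the function name rd_slope, and the visual weights below are hypothetical and are not taken from the experiments; the sketch only assumes that an LSB-bitplane truncation point removes a small error that receives a reduced visual weight.

```python
def rd_slope(delta_distortion, delta_rate, weight=1.0):
    """Distortion reduction per bit, optionally scaled by a visual weight."""
    return weight * delta_distortion / delta_rate

# Hypothetical truncation points: (distortion reduction, bits spent, visual weight).
msb = (512.0, 128.0, 1.0)   # MSB bitplane: large, clearly visible error reduction
lsb = (8.0, 128.0, 0.25)    # LSB bitplane: small error, assumed to get a lower weight

# Ratio of MSB slope to LSB slope without and with the visual weighting.
plain_ratio = rd_slope(msb[0], msb[1]) / rd_slope(lsb[0], lsb[1])
weighted_ratio = rd_slope(*msb) / rd_slope(*lsb)

print(plain_ratio, weighted_ratio)
```

Under these assumed values the weighted slopes are spread further apart than the unweighted ones, which is the behavior described above.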

Because human vision is more sensitive at low spatial frequencies (flat regions) than at high spatial frequencies (edges), the proposed rate control algorithm packages more low-spatial-frequency data and less high-spatial-frequency data. Thus, flat regions become smoother, but the error at edges becomes larger. The larger error at edges sometimes goes undetected by the eye. The PSNR values of the frames reconstructed by the proposed rate control algorithm are always smaller than those of the frames reconstructed by the original rate control algorithm. This shows that a frame with higher visual quality does not necessarily have a higher PSNR value.
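A small numerical sketch shows how a reconstruction can have a lower PSNR and still a lower visually weighted error. The error values and the weights are hypothetical, and the example works on a toy list of samples rather than on wavelet subbands; it is an illustration under these assumptions, not the measurement used in this work.

```python
import math

def mse(errors):
    """Plain mean squared error."""
    return sum(e * e for e in errors) / len(errors)

def psnr(errors, peak=255.0):
    """PSNR in dB for 8-bit samples."""
    return 10.0 * math.log10(peak * peak / mse(errors))

def weighted_mse(errors, weights):
    """Mean squared error with a visual weight per sample."""
    return sum(w * e * e for e, w in zip(errors, weights)) / len(errors)

# First four samples: flat region. Last four samples: edge.
# Assumed weights: errors in the flat region are more visible.
weights = [1.0] * 4 + [0.25] * 4
original_rc = [2.0] * 8                 # error spread evenly
proposed_rc = [1.0] * 4 + [3.0] * 4     # smoother flat region, larger edge error

print(psnr(original_rc), psnr(proposed_rc))
print(weighted_mse(original_rc, weights), weighted_mse(proposed_rc, weights))
```

Here the evenly spread error gives the higher PSNR, while the error pattern that favors the flat region gives the smaller weighted error.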

Chapter 6 Conclusion and Future Work

6.1 Conclusion

Interframe wavelet video coding is a compression technique that provides flexible, multi-purpose scalability. The single bitstream created by interframe wavelet video coding can provide rate/SNR, temporal, and spatial scalability.

The study of the HVS has become more important in recent years. HVS data are usually obtained from experiments. Because the HVS responds differently under different conditions, it is hard to find a globally useful formula for the CSF or JND that is widely accepted.

We propose a weighting factor that converts the distortion measure of a truncation point into a visually weighted one. It is the product of the intra-subband weighting factor and the inter-subband weighting factor, which are summarized below.

1) Intra-subband weighting factor: It decides the visual importance of errors within the same subband. An error smaller than the JND of the corresponding subband receives a lower weighting because it is less important to the HVS.

2) Inter-subband weighting factor: It decides the visual importance of errors in different subbands. Even if the errors in different spatial subbands have the same value, they have different visual importance to the HVS. An error in a lower spatial subband often has higher visual importance.
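The product structure of the two factors can be sketched as follows. The subband names, the weight values, and the step rule for errors below the JND are hypothetical placeholders chosen for illustration; the actual factors are derived from the HVS models discussed in the earlier chapters.

```python
# Hypothetical inter-subband weights: lower spatial subbands weigh more.
INTER_SUBBAND_WEIGHT = {"LL3": 1.0, "HL2": 0.5, "HH1": 0.25}

def intra_subband_weight(error, jnd, below_jnd_weight=0.5):
    """Assumed rule: an error smaller than the subband JND gets a lower weight."""
    return below_jnd_weight if abs(error) < jnd else 1.0

def visual_weight(error, jnd, subband):
    """Product of the intra-subband and inter-subband weighting factors."""
    return intra_subband_weight(error, jnd) * INTER_SUBBAND_WEIGHT[subband]

def weighted_distortion(error, jnd, subband):
    """Squared error scaled by the visual weight."""
    return visual_weight(error, jnd, subband) * error * error

print(visual_weight(4.0, 2.0, "LL3"))   # error above JND, low spatial subband
print(visual_weight(1.0, 2.0, "HH1"))   # error below JND, high spatial subband
```

Keeping the two factors as separate functions makes it easy to replace either one with a more accurate model without touching the rate control loop that consumes the weighted distortion.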

6.2 Future Work

We notice a few work items that can be further explored.

1) The minimum threshold function provided by Watson is based on the 9/7 linear-phase filter [30]. We may need to derive a function that corresponds to the Daubechies 9/7 filter.

2) We assume that the local luminance is constant across the whole image, which is not correct. We would like to find another model to estimate the local luminance. The estimation of the masking effect in the lower spatial subbands can also be improved. The masking effect in the lower spatial subbands is usually very large. If we can estimate it with higher precision, we can obtain a better weighting factor for rate control and decrease the probability of visible errors.

3) The proposed rate control algorithm applies to the luminance component of a picture. We would like to extend it to the chrominance components. Watson suggests a minimum threshold function for chrominance [30], but the experimental results show that visual responses to chrominance differ greatly from person to person.

4) The proposed rate control algorithm is currently applied only to a single spatially decomposed frame. We would like to extend it to the temporal domain. There is no clear model of the minimum temporal threshold because the human eye may track moving objects, and the resolution of static objects can be low. Finding an adequate model for temporal human vision remains a difficult, unsolved problem.

References

[1] W. P. Li, “Overview of Fine Granularity Scalability in MPEG-4 Video Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, pp. 301-317, March 2001.

[2] J. -R. Ohm, “Three-dimensional subband coding with motion compensation”, IEEE Transactions on Image Processing, vol. 3:5, pp. 559-571, 1994.

[3] S. -T. Hsiang and J. W. Woods, “Embedded video coding using invertible motion compensated 3-D subband/wavelet filter bank”, Signal Processing: Image Communications, vol. 16, pp. 705–724, May 2001.

[4] D. Taubman and R. Rosenbaum, “Rate-distortion optimized interactive browsing of JPEG2000 images”, Image Processing, 2003. September 2003.

[5] T. Kronander, “Motion compensated 3-dimensional wave-form image coding”, International Conference on Acoustic, Speech, and Signal Processing, vol. 3, pp. 1921-1924, 1989.

[6] S. T. Hsiang and J. W. Woods, “Invertible three-dimensional analysis/synthesis system for video coding with half-pixel-accurate motion compensation”, SPIE Conference on Visual Communication and Image Processing, vol. 3653, pp. 537-546, January 1999.

[7] B. Pesquet-Popescu, V. Bottreau, “Three-dimensional lifting schemes for motion compensated video compression”, International Conference on Acoustic, Speech, and Signal Processing, vol. 3, pp. 1793-1796, 2001.

[8] J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients”, IEEE Transactions on Signal Processing, vol. 41:12, pp. 657-660, December 1992.

[9] D. Taubman, “High performance scalable image compression with EBCOT”, IEEE Transactions on Image Processing, vol. 9:7, pp. 1158-1170, July 2000.

[10] S. T. Hsiang and J.W. Woods, “Embedded image coding using zeroblocks of subband-wavelet coefficients and context modeling”, in Proceedings of IEEE International Symposium on Circuits and Systems, vol. 3:5, pp. 662-665, May 2000.

[11] J. W. Woods, “AHG on Digital Cinema Video Coding Technology”, ISO/IEC/JTC1 SC29/WG11 doc. No. m7645, Pattaya, December 2001.

[12] P. S. Chen and J. W. Woods, “Comparison of MC-EZBC and H.26L TM8 on Digital Cinema Test Sequences”, ISO/IEC JTC1/SC29/WG11 doc. No. m8130, Cheju Island, March 2002.

[13] P. S. Chen and J. W. Woods, “Improved MC-EZBC with Quarter-pixel Motion Vectors”, ISO/IEC JTC1/SC29/WG11 doc. No. m8366, FairFax, VA, May 2002.

[14] S. S. Tsai, H. M. Hang, T. Chiang, “Exploration Experiments on the Temporal Scalability of Interframe Wavelet Coding”, ISO/IEC/JTC1 SC29/WG11 doc. No. m8959, Shanghai, October 2002.

[15] J. Xu et al, “3D subband video coding using Barbell lifting”, ISO/IEC JTC1/SC29/WG11, MPEG2004/M10569/S05, Munich, March 2004.

[16] J. Xu, Z. Xiong, S. Li, Y. Zhang, “Three-Dimensional Embedded Subband Coding with Optimized Truncation (3D ESCOT)”, Applied and Computational Harmonic Analysis, vol. 10, pp. 290-315, 2001, doi:10.1006/acha.2000.0345, available online at http://www.idealibrary.com.

[17] Y. Shoham, A. Gersho, “Efficient bit allocation for an arbitrary set of quantizers”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 9, pp. 1445-1453, September 1988.

[18] A. Aminlou, O. Fatemi, “Very Fast Bit Allocation Algorithm, Based On Simplified R-D Curve Modeling”, Proceedings of the 2003 10th IEEE International Conference on Electronics, Circuits and Systems (ICECS 2003), vol. 1, pp. 112-115, 14-17 December 2003.

[19] A. N. Netravali and B. G. Haskell, Digital Images: Representation and Compression, 2nd ed., Plenum Press, 1995.

[20] S. Winkler, “Issue in vision modeling for perceptual video quality assessment”, Signal Processing 78, pp. 231-252, 1999.

[21] N. R. Carlson, Physiology of Behavior, Allyn and Bacon, 1994.

[22] E. Peli, “Contrast in complex images”, J. Opt. Soc. Amer. A, vol. 7, pp. 2032-2039, October 1990.

[23] J. L. Mannos and D. J. Sakrison, “The effect of a visual fidelity criterion on the encoding of images”, IEEE Trans. Inform. Theory, vol. IT-20, pp. 525-536, July 1974.

[24] N. B. Nill, “A visual model weighted cosine transform for image compression and quality assessment”, IEEE Trans. Commun., vol. COM-33, pp. 551-557, June 1985.

[25] K. N. Ngan, K. S. Leong, and H. Singh, “Cosine transform coding incorporating human visual system model”, presented at SPIE Fiber ’86, Cambridge, MA, pp. 165-171, September 1986.

[26] D. H. Kelly, “Motion and vision II. stabilized spatial-temporal surface”, J. Opt. Soc. Amer., vol. 69, pp. 1340-1349, October 1979.

[27] I. Vujovic, I. Kuzmanic, and M. Krcum, “Experimental Results in Visibility Threshold in Human Visual Perception for Application in Image/Video Coding Quality Assessment”, IEEE Region 8 International Symposium on Video/Image Processing and Multimedia Communications, 16-19 June 2002.

[28] C.-H. Chou and C.-W. Chen, “A Perceptually Optimized 3-D Subband Codec for Video Communication over Wireless Channels”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 2, April 1996.

[29] C.-H. Chou and Y.-C. Li, “A Perceptually Tuned Subband Image Coder Based on the Measure of Just-Noticeable-Distortion Profile”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, no. 6, December 1995.

[30] A. B. Watson, G. Y. Yang, J. A. Solomon, and J. Villasenor, “Visibility of wavelet quantization noise”, IEEE Transactions on Image Processing, vol. 6, no. 8, pp. 1164-1175, August 1997.

[31] A. P. Bradley, “A wavelet visible difference predictor”, IEEE Transactions on Image Processing, vol. 8, no. 5, May 1999.

[32] S. Daly, “The visible difference predictor: An algorithm for the assessment of image fidelity”, in Digital Images and Human Vision, A. B. Watson, Ed. Cambridge, MA: MIT Press, 1993, pp. 176-206.

[33] J. M. Foley and Y. Yang, “Forward pattern masking: Effects of spatial frequency and contrast”, J. Opt. Soc. Amer. A, vol. 8, no. 12, pp. 2026-2037, December 1991.

[34] M. Kutter and S. Winkler, “A vision-based masking model of spread-spectrum image watermarking”, IEEE Trans. Image Processing, vol. 11, no. 1, January 2002.

[35] M. Antonini et al, “Image coding using wavelet transform”, IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 205-221, April 1992.

[36] I. Hontsch and L. J. Karam, “Adaptive image coding with perceptual distortion control”, IEEE Transactions on Image Processing, vol. 11, no. 3, pp. 213-222, March 2002.

[37] Z. Liu, L. J. Karam, and A. B. Watson, “JPEG2000 encoding with perceptual distortion control”, IEEE Transactions on Image Processing, vol. 1, 14-17, pp. I-637-40, September 2003.

[38] E. Peli, “In search of a contrast metric: Matching the perceived contrast of Gabor patches at different phases and bandwidths,” Vision Res. 37(23), pp. 3217-3224, 1997.

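Author Biography

洪朝雄, male, is a native of Changhua, Taiwan, and was born in Taoyuan on January 21, 1981. His family has four members: his parents, his brother, and himself. He graduated from the Department of Electronics Engineering, National Chiao Tung University, in June 2003 and entered the Institute of Electronics, National Chiao Tung University, in September 2003, where he has been engaged in research on image compression.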
