In the experiments, we replace the spatial wavelet transform in the MSSVC software with the PDFB. The PDFB is constructed from a three-level multiscale decomposition.
Because of the multiscale decomposition, the PDFB structure still provides resolution scalability. The numbers of directional decompositions from the first level to the third level are 8, 4 and 4 (see Figure 5.27). The corresponding frequency partition is shown in Figure 5.28. We use the Laplacian pyramid for the multiscale decomposition and the DFB for the directional decomposition.
[Figure 5.27 diagram labels: (2,2) at each of the 1st, 2nd and 3rd levels]
Figure 5.27: The block diagram of the PDFB. In the experiments, the low-pass channel generates the coarse-resolution image. The high-pass image is passed to different directional decompositions at different levels.
[Figure 5.28 diagram: frequency plane with axes ω1 and ω2, spanning (-π,-π) to (π,π)]
Figure 5.28: The frequency partition corresponding to the architecture in Figure 5.27. The number of directional frequency partitions decreases from the higher-frequency bands to the lower-frequency bands.
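The multiscale part of this structure can be illustrated with a minimal Laplacian pyramid sketch. The filters here are assumptions chosen for brevity (2x2 block averaging for the low-pass step and sample repetition for interpolation); they are not the filters used in the MSSVC_MDT program, and the directional filter bank stage is only indicated by the number of directions assigned to each band-pass image.

```python
import numpy as np

def downsample(img):
    # Assumed low-pass and decimation step: average each 2x2 block.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # Assumed interpolation step: repeat each sample over a 2x2 block.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def lp_decompose(img, levels=3):
    """Return (band-pass images ordered fine to coarse, coarse low-pass image)."""
    bandpass = []
    current = img.astype(np.float64)
    for _ in range(levels):
        coarse = downsample(current)
        # The band-pass image is the detail removed by the low-pass step.
        bandpass.append(current - upsample(coarse))
        current = coarse
    return bandpass, current

def lp_reconstruct(bandpass, coarse):
    """Invert lp_decompose by adding the details back, coarse to fine."""
    current = coarse
    for detail in reversed(bandpass):
        current = upsample(current) + detail
    return current

# Synthetic 512x512 test image (random data, not one of the test images).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(512, 512)).astype(np.float64)

bandpass, coarse = lp_decompose(image, levels=3)
directions = [8, 4, 4]  # directional decompositions, first to third level

for level, (band, n_dir) in enumerate(zip(bandpass, directions), start=1):
    print(f"level {level}: band-pass {band.shape}, split into {n_dir} directions")
print("coarse image:", coarse.shape)

restored = lp_reconstruct(bandpass, coarse)
print("perfect reconstruction:", np.allclose(restored, image))
```

Stopping the reconstruction loop early yields a lower-resolution image, which is the sense in which the pyramid preserves resolution scalability.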
The program is developed from the MSSVC software. Thus, we compare the performance of the original scheme (MSSVC) with that of the same scheme in which the spatial transform is replaced by the multiresolution directional transform (MSSVC_MDT). We also compare the results with JPEG2000.
Barbara, fingerprint, Lena and peppers are the test images. The PSNR values of the different schemes are compared at compression ratios of 0.625%, 0.9375%, 1.25%, 5% and 10%. Table 5.3 and Figure 5.29 show the results for Barbara.
Table 5.4 and Figure 5.30 show the results for fingerprint. Table 5.5 and Figure 5.31 show the results for Lena. Table 5.6 and Figure 5.32 show the results for peppers.
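To make the test conditions concrete, the sketch below converts each compression ratio into a byte budget and a bit rate, and gives the usual PSNR definition. It assumes that the ratio is the compressed size divided by the original size, and that the images are 8-bit grayscale, which is consistent with the 512 x 512, 256 KB sizes listed in the tables. The function names are illustrative and do not come from the MSSVC software.

```python
import numpy as np

def byte_budget(width, height, ratio_percent, bit_depth=8):
    """Compressed size in bytes for a compression ratio given in percent."""
    raw_bytes = width * height * bit_depth // 8
    return raw_bytes * ratio_percent / 100.0

def bits_per_pixel(ratio_percent, bit_depth=8):
    """Bit rate in bits per pixel for a compression ratio given in percent."""
    return bit_depth * ratio_percent / 100.0

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak = 255 for 8-bit images)."""
    diff = (np.asarray(original, dtype=np.float64)
            - np.asarray(reconstructed, dtype=np.float64))
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

for ratio in (0.625, 0.9375, 1.25, 5, 10):
    print(f"{ratio:>7}%: {byte_budget(512, 512, ratio):9.1f} bytes, "
          f"{bits_per_pixel(ratio):.3f} bpp")
```

Under these assumptions the five ratios correspond to 0.05, 0.075, 0.1, 0.4 and 0.8 bits per pixel, so the lowest setting leaves roughly 1.6 KB for a 512 x 512 image.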
Table 5.3: PSNR of Barbara
Image: barbara, 512 (width) x 512 (height), 256 KB

ratio (%)   PSNR (dB)
            MSSVC    JPEG2000   MSSVC_MDT
0.625       21.41    22.44      22.43
0.9375      23       23.33      22.8
Figure 5.29: PSNR performance comparison of MSSVC, JPEG2000 and MSSVC_MDT for Barbara.
Table 5.4: PSNR of fingerprint
Image: fingerprint, 512 (width) x 512 (height), 256 KB

ratio (%)   PSNR (dB)
            MSSVC    JPEG2000   MSSVC_MDT
0.625       17.26    18.26      18.08
0.9375      18.79    19.41      19.68
1.25        20.62    20.75      20.53
5           27.48    26.63      25.53
Figure 5.30: PSNR performance comparison of MSSVC, JPEG2000 and MSSVC_MDT for fingerprint.
Table 5.5: PSNR of Lena
Image: lena, 512 (width) x 512 (height), 256 KB

ratio (%)   PSNR (dB)
            MSSVC    JPEG2000   MSSVC_MDT
0.625       25.16    26.75      26.29
0.9375      27.31    28.03      27.5
1.25        28.57    29.31      28.25
5           35.76    35.31      33.64
10          38.87    38.39      36.32
Figure 5.31: PSNR performance comparison of MSSVC, JPEG2000 and MSSVC_MDT for Lena.
Table 5.6: PSNR of peppers
Image: peppers, 512 (width) x 512 (height), 256 KB

ratio (%)   PSNR (dB)
            MSSVC    JPEG2000   MSSVC_MDT
0.625       23.77    26.02      25.37
0.9375      26.9     28.01      26.34
Figure 5.33(a) and (b) show the reconstructed fingerprint image coded with MDT and with the wavelet transform at a compression ratio of 0.9375%. Figure 5.34(a) and (b) show the reconstructed grayscale image “barbara” coded with MDT and with the original scheme at a compression ratio of 1.25%. From the PSNR figures, it can be observed that MSSVC_MDT performs better than the original scheme (MSSVC) at very low compression ratios such as 0.625% and 0.9375%. The PSNR difference becomes larger when the test image contains many line-like singularities, as in Barbara and fingerprint.
The improvement is most apparent in visual quality: the line elements are much clearer with MSSVC_MDT at the low compression ratios. However, when the compression ratio is higher, such as 5% or 10%, the PSNR performance of the MDT scheme is not as good as that of MSSVC or JPEG2000.
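The trend described above can be read directly from the tabulated values. As a small worked example, the sketch below takes the Lena values from Table 5.5 and prints the PSNR difference between MSSVC_MDT and MSSVC at each compression ratio; the variable names are illustrative only.

```python
# Lena PSNR values (dB) from Table 5.5: ratio (%) -> (MSSVC, MSSVC_MDT).
lena = {
    0.625: (25.16, 26.29),
    0.9375: (27.31, 27.5),
    1.25: (28.57, 28.25),
    5: (35.76, 33.64),
    10: (38.87, 36.32),
}

# Positive values mean MSSVC_MDT is ahead of the original scheme.
gain = {ratio: round(mdt - mssvc, 2) for ratio, (mssvc, mdt) in lena.items()}

for ratio, delta in gain.items():
    print(f"{ratio:>7}%: MSSVC_MDT - MSSVC = {delta:+.2f} dB")

# Ratios at which the directional transform is ahead of MSSVC.
ahead = [ratio for ratio, delta in gain.items() if delta > 0]
print("MSSVC_MDT ahead at:", ahead)
```

For Lena the difference is positive at 0.625% and 0.9375%, and negative from 1.25% upward, with the gap widening as the compression ratio grows.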
Figure 5.33: The reconstructed fingerprint image at a compression ratio of 0.9375%. (a) Reconstructed using MDT, (b) reconstructed using the original wavelet scheme.
Figure 5.34: The reconstructed image “barbara” at a compression ratio of 1.25%. (a) MDT, (b) the original MSSVC scheme.
Chapter 6
Conclusion and Future Work
In Chapter 4, we proposed an enhanced entropy coding scheme to further increase the compression efficiency of the interframe wavelet coding algorithm. We modified the entropy coding unit by adding an extra SB-reach layer. Several test conditions specified by the core experiment were tested. So far, our proposed algorithm performs somewhat better at low to mid bit rates than the MPEG Core Experiment (CE) reference software. Further parameter tuning should provide better results, and the full potential of this technique remains to be explored.
In Chapter 5, we designed and implemented the directional multiresolution [...]
subjective quality are better at low bit rates, especially on images with many line singularities. However, the PSNR loss at higher compression ratios needs to be reduced. Possible ways to improve its performance are as follows.
Our program still uses the original rate-distortion (RD) control. There should be considerable room to adjust the RD control scheme to match the directional decompositions. The relationship between the human visual system (HVS) and the directional transform is also an important subject to investigate.
In this thesis, the numbers of directional decompositions chosen for the directional filter banks are 8, 4 and 4 from the first resolution level to the last, based on the experiments in [13]. However, by combining different numbers of resolution levels with different numbers of directional decompositions, many kinds of transform structures can be generated. The best parameter values for selecting a filter structure for compression still need further study.
We use the Laplacian pyramid for the resolution decomposition and, as is well known, it is an oversampled scheme. Should it be replaced by other filters? The impact of this resolution reduction filter can be another research topic.
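The amount of oversampling can be quantified with a short calculation. The sketch below assumes decimation by 2 in each dimension at every pyramid level and a critically sampled directional filter bank, so that the directional stage adds no further coefficients; the function name is illustrative.

```python
from fractions import Fraction

def lp_redundancy(levels):
    """Coefficients produced by the pyramid, relative to the input pixel count."""
    # Level k keeps a band-pass image with 1 / 4**k of the input samples.
    total = sum(Fraction(1, 4 ** k) for k in range(levels))
    # The final coarse image has 1 / 4**levels of the input samples.
    total += Fraction(1, 4 ** levels)
    return total

r3 = lp_redundancy(3)
print(f"3 levels: {r3} = {float(r3):.4f} coefficients per pixel")
print("below the 4/3 limit:", r3 < Fraction(4, 3))
```

With three levels the transform produces 85/64, about 1.33, coefficients per input pixel, compared with exactly 1 for a critically sampled wavelet transform. All of these coefficients must be coded within the same byte budget.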
References
[1]. T. Kronander, “Motion compensated 3-dimensional wave-form image coding”, International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1921-1924, 1989.
[2]. J.-R. Ohm, “Three-dimensional subband coding with motion compensation”, IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 559-571, Sep. 1994.
[3]. J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients”, IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 657-660, December 1992.
[4]. A. Said and W. Pearlman, “A new, fast and efficient image codec based on set partitioning in hierarchical trees”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243-250, March 1996.
[5]. S. T. Hsiang and J. W. Woods, “Embedded image coding using zeroblocks of subband-wavelet coefficients and context modeling”, in Proceedings of IEEE International Symposium on Circuits and Systems, vol. 3, no. 5, pp. 662-665, May 2000.
[6]. T. Rusert et al., “Recent improvements to MC-EZBC”, ISO/IEC JTC1/SC29/WG11 doc. M9232, Dec. 2002.
[7]. D. Taubman, “High performance scalable image compression with EBCOT”, IEEE Transactions on Image Processing, vol. 9, no. 7, pp. 1158-1170, July 2000.
[8]. D. Taubman, “EBCOT (Embedded Block Coding with Optimized Truncation): A complete reference”, ISO/IEC JTC1/SC29/WG1 N988, Sept. 1998.
[9]. J. Xu et al., “3-D subband video coding using Barbell lifting”, ISO/IEC JTC1/SC29/WG11, MPEG 2004 m10569/S05, March 2004.
[10]. W.-H. Peng, T. Chiang and H.-M. Hang, “Context-based binary arithmetic coding for fine granularity scalability”, Proceedings, Seventh International Symposium on Signal Processing and Its Applications, vol. 3, pp. 105-108, July 1-4, 2003.
[11]. J. Xu, Z. Xiong, S. Li, and Y. Q. Zhang, “Three-dimensional embedded subband coding with optimized truncation (3-D ESCOT)”, Applied and Computational Harmonic Analysis, vol. 10, pp. 290-315, May 2001.
[12]. E. J. Candès, “Ridgelets: Theory and Applications”, Ph.D. Thesis, Department of Statistics, Stanford University, 1998.
[13]. M. N. Do, “Directional Multiresolution Image Representations”, Ph.D. Thesis, Department of Communication Systems, Swiss Federal Institute of Technology Lausanne, November 2001.
[14]. P. J. Burt and E. H. Adelson, “The Laplacian pyramid as a compact image code”, IEEE Transactions on Communications, vol. 31, no. 4, pp. 532-540, April 1983.
[15]. A. Cohen, I. Daubechies, and J.-C. Feauveau, “Biorthogonal bases of compactly supported wavelets”, Commun. on Pure and Appl. Math., vol. 45, pp. 485-560, 1992.
[16]. R. H. Bamberger and M. J. T. Smith, “A filter bank for the directional decomposition of images: Theory and design”, IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 882-893, April 1992.
[17]. S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and R. Ansari, “A new class of two-channel biorthogonal filter banks and wavelet bases”, IEEE Transactions on Signal Processing, vol. 43, no. 3, pp. 649-665, Mar. 1995.
[18]. M. Vetterli, “Multidimensional subband coding: Some theory and algorithms”, Signal Processing, vol. 6, no. 2, pp. 97-112, Feb. 1984.
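Author Biography
徐漢光 (male) was born on May 1, 1981 (year 70 of the Republic of China calendar). He completed his undergraduate studies in the Department of Electrical and Control Engineering at National Chiao Tung University. After graduating, he entered the graduate program of the Systems Group in the Department of Electronics Engineering at National Chiao Tung University, where he conducted research in the field of image compression under the supervision of Professor Hsueh-Ming Hang (杭學鳴).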