Proof of Zeros - A Study of Compression Artifacts in Perceptual Audio Coding

C.2. Proof of Zeros

To find the MMSE solution of (80), we exploit the geometric symmetry of the solution and assume that $\tilde{r}_1 = \tilde{r}_2 = r$, $\tilde{\theta}_1 = \theta$, $\tilde{\theta}_2 = \pi - \theta$, and $\theta \in [0, \pi/2]$. Then we can evaluate the integral in (80) as

Repeatedly substituting (C.13) into (C.12) to reduce the power of the term $r$ from 2 to 1, and using the trigonometric identity $2\sin^2(\theta) + \cos(2\theta) = 1$, gives

Then substituting (C.15) into (C.14) yields (81). Similarly, by repeatedly substituting (C.12) and using the trigonometric identity $2\sin^2(\theta) + \cos(2\theta) = 1$, we can derive (C.11) as
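As a side note on the trigonometric property invoked in both reductions above, the short block below recalls how it follows from the double-angle formula. It is a reference sketch only: the source drops the argument of the sine and cosine, so the symbol $\theta$ is an assumption, and the block does not reproduce the missing equations (C.11) to (C.15).

```latex
\begin{align*}
\cos(2\theta) &= \cos^2\theta - \sin^2\theta = 1 - 2\sin^2\theta \\
\Rightarrow\quad 2\sin^2\theta + \cos(2\theta) &= 1,
\qquad \sin^2\theta = \tfrac{1}{2}\bigl(1 - \cos(2\theta)\bigr).
\end{align*}
```

The second form is the one typically used in such reductions: each $\sin^2\theta$ term can be traded for a term in $\cos(2\theta)$.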

VITA

Han-Wen Hsu was born in Tainan, Taiwan, in October 1977. He received his B.S. degree from the Division of Applied Mathematics, Department of Mathematics, National Tsing Hua University, Hsinchu, Taiwan, in June 2000, and his M.S. degree from the Department of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu, Taiwan, in June 2004, where he is currently working toward a Ph.D. degree in Computer Science and Engineering. His research interests are in audio coding and signal processing.

PUBLICATION LIST

Journal Papers

1. C.M. Liu, H.W. Hsu, and W.C. Lee, “Compression artifacts in perceptual audio coding,” IEEE Trans. Audio Speech Lang. Process., vol. 16, no. 4, pp. 681-695, May 2008.

2. H.W. Hsu and C.M. Liu, “Fast radix-q and mixed-radix algorithms for type-IV DCT,” IEEE Signal Process. Lett., vol. 15, pp. 910-913, Dec. 2008.

3. H.W. Hsu and C.M. Liu, “Autoregressive modeling of temporal/spectral envelopes with finite-length discrete trigonometric transforms,” accepted with minor revisions, IEEE Trans. Signal Process., 2010.

4. H.W. Hsu and C.M. Liu, “Decimation-whitening filter in spectral band replication,” submitted to IEEE Trans. Audio Speech Lang. Process., 2010.

Conference Papers

1. D.P. Chen, H.F. Hsiao, H.W. Hsu, and C.M. Liu, “Gram-Schmidt-based downmixer and decorrelator in the MPEG Surround Coding,” to appear in Proc. AES 128th Conv., London, UK, May 22–25, 2010.

2. C. M. Liu, H. W. Hsu, Y. H. Kao, and W. C. Lee, “Spatial parameter decision by least squared error in Parametric Stereo Coding and MPEG Surround,” in Proc. AES 126th Conv., Munich, Germany, May 7–10, 2009, preprint 7731.

3. C.M. Liu, H.W. Hsu, C.H. Yang, and W.C. Lee, “Low-power MPEG-4 HE-AAC version-2 encoder,” in Proc. AES 124th Conv., Amsterdam, Netherlands, May 17–20, 2008, preprint 7338.

4. C.M. Liu, C.H. Yang, H.W. Hsu, and W.C. Lee, “Design of framing in MPEG Surround based on dynamic programming algorithm,” in Proc. AES 124th Conv., Amsterdam, Netherlands, May 17–20, 2008, preprint 7487.

5. H.W. Hsu, C.L. Hu, C.M. Liu, and W.C. Lee, “On the design of low power MPEG-4 HE-AAC encoders,” in Proc. AES 122nd Conv., Vienna, Austria, May 5–8, 2007, preprint 6999.

6. H.W. Hsu, H.Y. Tseng, C.M. Liu, W.C. Lee, and C.H. Yang, “High quality, low power QMF bank design for SBR, Parametric Coding, and MPEG Surround decoders,” in Proc. AES 122nd Conv., Vienna, Austria, May 5–8, 2007, preprint 7000.

7. C.H. Yang, C.M. Liu, H.W. Hsu, K.C. Lee, S.H. Tang, Y.C. Yang, C.M. Chang, and W.C. Lee, “Design of HE-AAC version 2 encoder,” in Proc. AES 121st Conv., San Francisco, USA, October 5–8, 2006, preprint 6873.

8. H.W. Hsu, C.M. Liu, and W.C. Lee, “Fast complex quadrature mirror filterbanks for MPEG-4 HE-AAC,” in Proc. AES 121st Conv., San Francisco, USA, October 5–8, 2006, preprint 6871.

9. C. M. Liu, H. W. Hsu, C. H. Yang, K. C. Lee, S. H. Tang, Y. C. Yang, and W. C. Lee, “Compression artifacts in perceptual audio coding,” in Proc. AES 121st Conv., San Francisco, USA, October 5–8, 2006, preprint 6872.

10. H.W. Hsu, Y.C. Yang, C.M. Liu, and W.C. Lee, “Design for high frequency adjustment module in MPEG-4 HEAAC encoder based on linear prediction method,” in Proc. AES 120th Conv., Paris, France, May 20–23, 2006, preprint 6755.

11. K.C. Lee, C.H. Yang, H.W. Hsu, W.C. Lee, C.M. Liu, and T.W. Chang, “Efficient design of time-frequency stereo parameter sets for parametric HE-AAC,” in Proc. AES 119th Conv., New York, USA, October 7–10, 2005, preprint 6600.

12. C.M. Liu, L.W. Chen, H.W. Hsu, and W.C. Lee, “Bit reservoir design for HE-AAC,” in Proc. AES 118th Conv., Barcelona, Spain, May 28-31, 2005, preprint 6382.

13. C.M. Liu, W.C. Lee, C.H. Yang, K.Y. Peng, T. Chiou, T.W. Chang, Y.H. Hsiao, H.W. Hsu, and C.T. Chien, “Design of MPEG-4 AAC encoders,” in Proc. AES 117th Conv., San Francisco, USA, Oct. 28-31, 2004, preprint 6201.

14. H.W. Hsu, C.M. Liu, W.C. Lee, and Z.W. Li, “Audio patch method in MPEG-4 HE AAC decoder,” in Proc. AES 117th Conv., San Francisco, USA, Oct. 28-31, 2004, preprint 6221.

15. H. W. Hsu, C. M. Liu, and W. C. Lee, “Audio patch method in audio decoders—MP3 and AAC,” in Proc. AES 116th Conv., Berlin, Germany, May 8-11, 2004, preprint 6014.

16. C. M. Liu, W. C. Lee, and H. W. Hsu, “High frequency reconstruction for band-limited audio signal,” in Proc. 6th Int. Conf. Digital Audio Effects (DAFX-03), University of London, September 8-11, 2003.

17. C. M. Liu, W. C. Lee, and H. W. Hsu, “High frequency reconstruction by linear extrapolation,” in Proc. AES 115th Conv., New York, USA, Oct. 10-13, 2003, preprint 5968.
