Chapter 6 Experimental Results
6.4 Subjective Quality Measurement
In Section 6.2, the ODG of some test tracks unexpectedly degrades when method M2 is compared with M0. The listening test in this section, whose results are shown in Figure 27, is used to verify the quality improvement. Tone/noise compensation improves most of the test tracks; the exception is si02, which is a transient signal. The degradation arises because the MLD algorithm assumes that the input audio signal is stationary. If the input signal is transient, the tonality measurement loses accuracy, which results in improper compensation.
Figure 27: Results of listening test on MPEG test tracks at bit rate 80kbps.
Chapter 7
Concluding Remark
In this thesis, an efficient method based on the Levinson-Durbin algorithm is proposed to measure tonality by linear prediction; it automatically searches for the prediction order that fits the content of each subband. Furthermore, the MLD algorithm provides an efficient way to measure the tonality of the inverse filtered signal at the encoder: because the autocorrelation of the inverse filtered signal is calculated directly, the inverse filtering process itself is avoided. The sharing of control parameters is also addressed, and an optimal decision criterion for the control parameters is proposed. The causes of two typical artifacts in the HE-AAC codec are identified, and a remedial method to reduce the noise overflow phenomenon is proposed.
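As a rough illustration of the linear-prediction approach summarized above, the following Python sketch runs the Levinson-Durbin recursion on the autocorrelation of a signal block and stops raising the order once an extra coefficient no longer reduces the residual energy appreciably. The stopping threshold `min_step_gain`, the use of the predicted-to-residual energy ratio as the tonality value, and all function names are assumptions made for this example; the sketch is not the MLD algorithm of this thesis.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, max_order, min_step_gain=1.01):
    """Levinson-Durbin recursion with a simple early-stop rule.

    Returns (a, err): the prediction-error filter [1, a1, ..., ap] and
    the residual energy. min_step_gain is a hypothetical threshold: the
    order stops growing when one more coefficient lowers the residual
    energy by less than this factor.
    """
    a = [1.0]
    err = float(r[0])
    if err <= 0.0:
        return a, 0.0
    floor = 1e-12 * err  # guards against a zero or negative residual
    for m in range(1, max_order + 1):
        # Correlation between the current residual and lag m.
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err  # reflection coefficient
        new_err = max(err * (1.0 - k * k), floor)
        if err / new_err < min_step_gain:
            break  # the extra order is not worth keeping
        ext = a + [0.0]
        a = [ext[i] + k * ext[m - i] for i in range(m + 1)]
        err = new_err
    return a, err


def tonality(x, max_order=8):
    """Ratio of predicted (tonal) energy to residual (noise) energy."""
    r = autocorrelation(x, max_order)
    if r[0] <= 0.0:
        return 0.0
    _, err = levinson_durbin(r, max_order)
    return (r[0] - err) / err


# Demo: a sinusoid should score far higher than white noise.
n = 1024
tone = [math.sin(2.0 * math.pi * 0.1 * i) for i in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
tone_score = tonality(tone)
noise_score = tonality(noise)
```

Only the autocorrelation values enter the recursion, so the cost of the order search is independent of the block length once the autocorrelation is available.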
Both objective and subjective tests are conducted to check the quality improvement.
A related publication of this thesis is [15].
Tables 11 to 13 show the ODG comparison between method M3 and existing codecs (Coding Technologies aacPlus 7.0.5 and NERO HE-AAC 3.0.0.0).
Table 11: ODG comparison with existing codec at bit rate 80kbps.
Bit Rate 80k
Track / Codec   NCTU-HEAAC   CT 7.0.5   NERO 3.0.0.0
es01            -0.73        -0.78      -1.46
es02            -0.65        -0.97      -2.14
es03            -0.75        -0.8       -2.4
sc01            -1.01        -1.05      -1.17
sc02            -1.25        -1.54      -1.22
sc03            -1.19        -1.17      -1.23
si01            -1.65        -2.3       -2.03
si02            -1.03        -1.03      -1.81
si03            -1.64        -1.9       -2.07
sm01            -1.6         -1.97      -2.14
sm02            -1.6         -2.27      -1.82
sm03            -1.35        -1.28      -1.39
Max             -0.65        -0.78      -1.17
Min             -1.65        -2.3       -2.4
Average         -1.20417     -1.4217    -1.74
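The summary rows of Table 11 can be re-derived from the per-track scores. The short Python sketch below does so; the dictionary layout and the function name `summarize` are illustrative choices only.

```python
# ODG scores from Table 11 (80 kbps): (NCTU-HEAAC, CT 7.0.5, NERO 3.0.0.0)
table_11 = {
    "es01": (-0.73, -0.78, -1.46),
    "es02": (-0.65, -0.97, -2.14),
    "es03": (-0.75, -0.80, -2.40),
    "sc01": (-1.01, -1.05, -1.17),
    "sc02": (-1.25, -1.54, -1.22),
    "sc03": (-1.19, -1.17, -1.23),
    "si01": (-1.65, -2.30, -2.03),
    "si02": (-1.03, -1.03, -1.81),
    "si03": (-1.64, -1.90, -2.07),
    "sm01": (-1.60, -1.97, -2.14),
    "sm02": (-1.60, -2.27, -1.82),
    "sm03": (-1.35, -1.28, -1.39),
}


def summarize(rows):
    """Return the per-codec maximum, minimum and mean ODG."""
    columns = list(zip(*rows.values()))  # one tuple of scores per codec
    return {
        "Max": tuple(max(c) for c in columns),
        "Min": tuple(min(c) for c in columns),
        "Average": tuple(sum(c) / len(c) for c in columns),
    }


summary_11 = summarize(table_11)
```

ODG values are negative, so the maximum is the best score of a codec and the minimum is its worst.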
Table 12: ODG comparison with existing codec at bit rate 64kbps.
Bit Rate 64k
Track / Codec   NCTU-HEAAC   CT 7.0.5   NERO 3.0.0.0
es01            -1.01        -1.01      -1.62
es02            -0.87        -1.48      -2.28
es03            -1.03        -1.04      -2.19
sc01            -1.66        -1.66      -1.96
sc02            -1.74        -2.35      -1.6
sc03            -1.63        -1.65      -1.73
si01            -1.99        -2.84      -2.77
si02            -1.57        -1.6       -2.24
si03            -2.09        -2.19      -2.86
sm01            -2.17        -2.73      -3.14
sm02            -2.51        -2.72      -2.25
sm03            -1.72        -1.71      -2.14
Max             -0.87        -1.01      -1.6
Min             -2.51        -2.84      -3.14
Average         -1.6658      -1.915     -2.2317
Table 13: ODG comparison with existing codec at bit rate 48kbps.
Bit Rate 48k
Track / Codec   NCTU-HEAAC   CT 7.0.5   NERO 3.0.0.0
es01            -1.61        -2.23      -1.97
es02            -1.42        -2.65      -2.49
es03            -1.7         -2.41      -2.7
sc01            -2.4         -2.65      -2.79
sc02            -2.62        -3.1       -2.64
sc03            -2.33        -3.05      -2.88
si01            -2.74        -3.42      -3.64
si02            -2.46        -2.73      -3.47
si03            -3.17        -2.86      -3.83
sm01            -3.2         -3.61      -3.85
sm02            -3.38        -3.32      -3.28
sm03            -2.26        -3.15      -3.08
Max             -1.42        -2.23      -1.97
Min             -3.38        -3.61      -3.85
In the objective test, NCTU-HEAAC is better than CT, and CT is better than NERO. In the subjective test, NCTU-HEAAC is better than CT and CT is better than NERO on speech signals; on attack signals, CT is better than NCTU-HEAAC and NCTU-HEAAC is better than NERO.
Future work
1. The MLD algorithm assumes that the input audio signal is stationary. Therefore, if the input signal is transient, the tonality measurement loses accuracy.
2. The inverse filtering mode decision in the current design is heuristic. The inverse filtered tonal energy $T_{inv}$ is approximated by $T_{inv} = T_r \cdot (1 - \alpha)^2$, where $T_r$ is the tonal energy before inverse filtering. A more accurate decision criterion should be considered.
3. The interpolation mode decision in the current design sets the interpolation mode to 0 if a tone exists in the original HF signal and to 1 otherwise. The potential risk of interpolation mode switching needs to be analyzed.
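To make the heuristic in item 2 concrete, the Python sketch below evaluates the approximation of the inverse filtered tonal energy for a given tonal energy and a given value of α. The function name and the restriction of α to the range [0, 1] are assumptions of this example and are not stated in the thesis.

```python
def approx_inverse_filtered_tonal_energy(t_r, alpha):
    """Approximate T_inv = T_r * (1 - alpha)^2.

    t_r   : tonal energy before inverse filtering
    alpha : inverse filtering parameter (assumed here to lie in [0, 1])
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return t_r * (1.0 - alpha) ** 2


# With alpha = 0.5 the tonal energy is scaled by (1 - 0.5)^2 = 0.25.
example = approx_inverse_filtered_tonal_energy(2.0, 0.5)
```

Under this approximation the tonal energy is left unchanged at α = 0 and vanishes at α = 1, whatever the actual spectral shape of the subband, which is one reason a more accurate criterion is listed as future work.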
References
[1] US Patent, No.:US 6708145 B1, Mar. 16, 2004, “Enhancing Perceptual Performance of SBR and Related HFR Coding Methods by Adaptive Noise-Floor Addition and Noise Substitution Limiting”.
[2] San-Uk Ryu and Kenneth Roth, “Enhanced Accuracy of the Tonality Measure and Control Parameter Extraction Modules in MPEG-4 HE-AAC,” Proc. 119th AES Convention, Preprint 6586, Oct. 2005.
[3] N. Levinson, “The Wiener RMS (Root Mean Square) Error Criterion in Filter Design and Prediction,” J. Math. Phys. 25, 261-278 (1947).
[4] 3GPP TS 26.404: “Enhanced aacPlus encoder SBR part,” June 2004.
[5] ISO/IEC JTC1/SC29/WG11, “Text of ISO/IEC 14496-3:2001/FDAM1, Bandwidth Extension,” ISO/IEC JTC1/SC29/WG11 N5570, Mar. 2003.
[6] ISO/IEC JTC1/SC29 WG11 MPEG, “Text of ISO/IEC 14496-3:2001/AMD 1:2003, bandwidth extension,” Nov. 2003.
[7] M. Dietz, L. Liljeryd, K. Kjorling, and O. Kunz, "Spectral band replication, a novel approach in audio coding," Proc. 112th AES Convention, Preprint 5871, May 2002.
[8] P. Ekstrand, "Bandwidth extension of audio signals by spectral band replication," Proc. 1st IEEE Benelux Workshop on Model based Processing and Coding of Audio, Nov. 2002.
[9] A. Ehret, M. Dietz, and K. Kjorling, "State-of-the-art audio coding for broadcasting and mobile applications," Proc. 114th AES Convention, Preprint 5560, Mar. 2003.
[10] T. Ziegler, A. Ehret, P. Ekstrand, and M. Lutzky, "Enhancing mp3 with SBR: features and capabilities of the new mp3pro algorithm," Proc. 112th AES Convention, Preprint 5560, May 2002.
[11] M. Wolters, K. Kjorling, D. Homm and H. Purnhagen, "A closer look into MPEG-4 high efficiency AAC," Proc. 115th AES Convention, Preprint 5871, Oct. 2003.
[12] C.M. Liu, W. C. Lee, C. H. Yang, K. Y. Pang, T. Chiou, T. W. Chang, Y. H. Hsiao, H. W. Hsu, and C. T. Chien, “Design of MPEG-4 AAC Encoder,” Proc. 117th AES Convention, Preprint 6201, Oct. 2004.
[13] H.W. Hsu, C.M. Liu, and W.C. Lee, “Audio Patch Method in MPEG-4 HE AAC Decoder,” Proc. 117th AES Convention, Preprint 6221, Oct. 2004.
[14] C.M. Liu, L.W. Chen, H.W. Hsu, and W.C. Lee, “Bit Reservoir Design for HE-AAC,” Proc. 118th AES Convention, Preprint 6382, May 2005.
[15] Han-Wen Hsu, Yung-Cheng Yang, Chi-Min Liu, and Wen-Chieh Lee, “Design for High Frequency Adjustment Module in MPEG-4 HEAAC Encoder based on Linear Prediction Method,” Proc. 120th AES Convention, Preprint 6755, May 2006.
[16] ITU Radiocommunication Study Group 6, “Draft Revision to Recommendation ITU-R BS.1387 - Method for objective measurements of perceived audio quality.”
[17] MUSHRA, website: http://ff123.net/abrhr/abchr.html.
[18] ITU Radiocommunication Sector BS.1116 (rev.1). “Methods for the Subjective Assessment of Small Impairments in Audio Systems Including Multichannel Sound Systems,” Geneva, 1997.
[19] NCTU-HEAAC, website: http://psplab.csie.nctu.edu.tw/projects/index.pl/nctu-mp3.html.
[20] The Audio Database Collected in Perceptual Signal Processing Lab, website: http://psplab.csie.nctu.edu.tw/projects/index.pl/testbitstreams.html.
[21] Samples for Testing Audio Codecs from ff123, website: http://ff123.net/samples.html.
[22] Quality and Listening Test Information for LAME, website: http://lame.sourceforge.net/gpsycho/quality.html.
[23] Hydrogen Audio, website: http://www.hydrogenaudio.org.
[24] OGG Vorbis Pre 1.0 Listening Test, website: http://hem.passagen.se/ingets1/vorbis.htm.
[25] Phong’s Audio Samples, website: http://www.phong.org/audio/samples.xhtml.
[26] Sound Quality Assessment Material, website: http://sound.media.mit.edu/mpeg4/audio/sqam/.