Chapter 4 Experimental Results
4.2 Results and Remarks
4.2.2 Execution Time
Table 4-8. Execution time of FRDOT and CT
(a) Target bitrate at 64 kbps
Sequence FRDOT (ms) CT (ms) Speedup
TS01 1506.67 8976.86 5.96
TS02 1876.54 13950.36 7.43
TS03 2056.71 13080.10 6.36
TS04 1927.80 12123.35 6.29
TS05 1728.21 10169.28 5.88
TS06 2336.52 12650.27 5.41
TS07 2366.13 13304.20 5.62
TS08 4485.70 27616.96 6.16
(b) Target bitrate at 80 kbps
Sequence FRDOT (ms) CT (ms) Speedup
TS01 1472.85 9527.58 6.47
TS02 1976.04 14820.40 7.50
TS03 2142.74 14064.85 6.56
TS04 2013.75 13358.86 6.63
TS05 1802.01 11096.01 6.16
TS06 2315.53 14162.97 6.12
TS07 2281.76 14189.57 6.22
TS08 4738.01 30991.32 6.54
(c) Target bitrate at 96 kbps
Sequence FRDOT (ms) CT (ms) Speedup
TS01 1483.69 10097.50 6.81
TS02 1953.99 15255.90 7.81
TS03 2064.48 15102.33 7.32
TS04 1927.72 14529.39 7.54
TS05 1724.09 11918.49 6.91
TS06 2309.95 15476.44 6.70
TS07 2362.43 15044.64 6.37
TS08 4559.58 32342.60 7.09
(d) Target bitrate at 112 kbps
Sequence FRDOT (ms) CT (ms) Speedup
TS01 1455.37 10409.99 7.15
TS02 1866.96 15717.70 8.42
TS03 1997.52 15344.27 7.68
TS04 1848.12 14868.23 8.05
TS05 1633.94 12337.01 7.55
TS06 2260.74 16755.97 7.41
TS07 2422.03 15988.21 6.60
TS08 4548.57 34319.95 7.55
The speedup is defined as the execution time of CT divided by the execution time of FRDOT. The execution times of eight test sequences at four target bitrates are compared. As shown in Table 4-8, FRDOT is faster than CT by about 5 to 8 times (the measured speedups range from 5.41 to 8.42). In addition, the speedup increases with the target bitrate: at a higher target bitrate, the average number of bits to reduce is smaller, so the processing time of FRDOT is shortened.
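As a minimal sketch of this definition, the following Python snippet recomputes the speedup from the execution times listed in Table 4-8 and averages it over the eight test sequences at each target bitrate. The names TIMES_MS, speedup and mean_speedup are chosen here for illustration and are not part of the FRDOT implementation.

```python
# Execution times in ms copied from Table 4-8, as (FRDOT, CT) pairs
# keyed by target bitrate in kbps and by test sequence.
TIMES_MS = {
    64: {
        "TS01": (1506.67, 8976.86),
        "TS02": (1876.54, 13950.36),
        "TS03": (2056.71, 13080.10),
        "TS04": (1927.80, 12123.35),
        "TS05": (1728.21, 10169.28),
        "TS06": (2336.52, 12650.27),
        "TS07": (2366.13, 13304.20),
        "TS08": (4485.70, 27616.96),
    },
    80: {
        "TS01": (1472.85, 9527.58),
        "TS02": (1976.04, 14820.40),
        "TS03": (2142.74, 14064.85),
        "TS04": (2013.75, 13358.86),
        "TS05": (1802.01, 11096.01),
        "TS06": (2315.53, 14162.97),
        "TS07": (2281.76, 14189.57),
        "TS08": (4738.01, 30991.32),
    },
    96: {
        "TS01": (1483.69, 10097.50),
        "TS02": (1953.99, 15255.90),
        "TS03": (2064.48, 15102.33),
        "TS04": (1927.72, 14529.39),
        "TS05": (1724.09, 11918.49),
        "TS06": (2309.95, 15476.44),
        "TS07": (2362.43, 15044.64),
        "TS08": (4559.58, 32342.60),
    },
    112: {
        "TS01": (1455.37, 10409.99),
        "TS02": (1866.96, 15717.70),
        "TS03": (1997.52, 15344.27),
        "TS04": (1848.12, 14868.23),
        "TS05": (1633.94, 12337.01),
        "TS06": (2260.74, 16755.97),
        "TS07": (2422.03, 15988.21),
        "TS08": (4548.57, 34319.95),
    },
}


def speedup(frdot_ms, ct_ms):
    """Speedup as defined in the text: CT time divided by FRDOT time."""
    return ct_ms / frdot_ms


def mean_speedup(rows):
    """Average speedup over all test sequences of one sub-table."""
    values = [speedup(frdot_ms, ct_ms) for frdot_ms, ct_ms in rows.values()]
    return sum(values) / len(values)


if __name__ == "__main__":
    for bitrate in sorted(TIMES_MS):
        print(f"{bitrate} kbps: mean speedup {mean_speedup(TIMES_MS[bitrate]):.2f}")
```

Averaging per bitrate makes the trend in the table easier to see than reading the 32 individual entries.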
Chapter 5 Conclusion
In this chapter, we highlight the innovations of the proposed transcoding algorithm and give some conclusions on FRDOT. In addition, we outline future work on the possible application of FRDOT to advanced audio coding standards.
5.1 Contributions
In this thesis, we presented a fast bitrate transcoding algorithm, called FRDOT, for real-time audio delivery applications. The main idea of the transcoder is to retain the best possible quality at a given bit budget under the NMR criterion. The NMR optimized transcoding is based on a search for the scalefactor increment that gives the best rate-distortion performance.
To measure the distortion without the source audio signals, the NMR measure is represented by the NSR measure. Based on the NSR measure, we presented a fast algorithm that searches for the best scalefactor increment at a given bitrate. To speed up the search, we reduce the number of candidate scalefactor increments, which decreases the number of NSR values to compute. The reduced search range is found empirically. In addition, the NSR computation is avoided by a table lookup technique: the lookup table gives the relationship between the scalefactor increments and the NSR values. The experimental results show that FRDOT is better than CT by 0.5-3 dB in NMR at different target bitrates, and that FRDOT is faster than CT by 5 to 8 times on average.
In the FRDOT architecture, the BCM provides a model to estimate the bit difference for spectral coefficients between the original and target bitrates. In addition, the BCM keeps the average bitrate of the transcoded bitstream close to the target bitrate. The results show that the average output bitrate of FRDOT is closer to the target bitrate than that of CT.
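The idea of a table-driven search over a reduced range can be illustrated with a short Python sketch. This is only an illustration under assumed values, not the FRDOT implementation: the table contents, the bit-saving estimates and the names NSR_TABLE_DB, BITS_SAVED and best_increment are hypothetical, and the selection rule (lowest NSR among the increments that save enough bits) is a simplification chosen for the example.

```python
from typing import Dict, Iterable, Optional

# Hypothetical lookup table: scalefactor increment -> NSR in dB.
# The values are made up for illustration only.
NSR_TABLE_DB: Dict[int, float] = {0: -20.0, 1: -18.5, 2: -17.0, 3: -15.5, 4: -14.0}

# Hypothetical estimate of the bits saved by each increment (also made up).
BITS_SAVED: Dict[int, int] = {0: 0, 1: 40, 2: 85, 3: 120, 4: 160}


def best_increment(bits_to_reduce: int, search_range: Iterable[int]) -> Optional[int]:
    """Return the increment in search_range with the lowest tabulated NSR
    among those that save at least bits_to_reduce bits, or None if no
    increment in the range saves enough bits."""
    best: Optional[int] = None
    for inc in search_range:
        if BITS_SAVED[inc] < bits_to_reduce:
            continue  # this increment does not meet the bit budget
        # NSR is read from the table instead of being recomputed.
        if best is None or NSR_TABLE_DB[inc] < NSR_TABLE_DB[best]:
            best = inc
    return best


if __name__ == "__main__":
    # Full range versus a reduced range of candidate increments.
    print(best_increment(80, range(0, 5)))
    print(best_increment(80, range(2, 4)))
```

With these assumed tables, both calls select increment 2, but the reduced range visits two candidates instead of five, which is the kind of saving that narrowing the search range is meant to provide.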
5.2 Future Work
MPEG-4 HE-AAC is an advanced compression approach for audio coding. It is used in applications such as satellite-delivered digital audio broadcast and mobile telephony audio streaming. These audio content delivery applications require a bitrate adaptation mechanism. Since MPEG-4 HE-AAC is built on the MPEG-2/4 AAC system with the LC Profile, the NMR optimized algorithm in FRDOT can be applied to MPEG-4 AAC and HE-AAC. The additional effort is to fit the AAC transcoding techniques with the Spectral Band Replication (SBR) and Parametric Stereo (PS) modules [5].
References
[1] J. Herre and B. Grill, “Overview of MPEG-4 audio and its applications in mobile communications,” in Proc. WCCC-ICSP, vol. 1, pp. 11–20, Aug. 2000.
[2] K. Brandenburg, “MP3 and AAC explained,” AES 17th Int. Conf. High-Quality Audio Coding, Italy, Aug. 1999.
[3] J. Zhou and J. Li, “Scalable audio streaming over the Internet with network-aware rate-distortion optimization,” in Proc. IEEE ICME, vol. 2, Aug. 2001, pp. 567-570.
[4] S. Quackenbush, “MPEG technologies: advanced audio coding,” ISO/IEC JTC1/SC29/WG11, Nice, FR, Oct. 2005.
[5] 3GPP TS 26.401, “General audio codec audio processing functions; Enhanced aacPlus general audio codec; General description,” Mar. 2005.
[6] ISO, Information technology-Generic Coding of Moving Pictures and Associated Audio, 1997. ISO/IEC JTC1/SC29, ISO/IEC IS 13818-7 (Part 7, Advanced audio coding).
[7] ISO, Information technology–Coding of Audio-Visual Objects, 1999. ISO/IEC JTC1/SC29, ISO/IEC IS 14496-3 (Part 3, Audio).
[8] H. Park, et al., “Multi-layer bit-sliced bit rate scalable audio coding,” AES 103rd Convention, New York, Aug. 1997 (preprint 4520).
[9] H. Hartenstein, et al., “High quality mobile communication,” in Proc. of KIVS 2001, Hamburg, Feb. 2001.
[10] A. Kassler and A. Schorr, “Generic QoS aware media stream transcoding and adaptation,” in Proc. of Packet Video Workshop, Nantes, France, Apr. 2003.
[11] http://www.shoutcast.com
[12] http://www.allofmp3.com
[13] Y. Takamizawa, et al., “High-quality and processor-efficient implementation of an MPEG-2 AAC encoder,” in Proc. ICASSP, vol. 2, May 2001, pp. 985-988.
[14] T. Painter and A. Spanias, “Perceptual coding of digital audio,” Proc. IEEE, vol. 88, no. 4, pp. 451-515, Apr. 2000.
[15] A. Aggarwal and K. Rose, “A conditional enhancement-layer quantizer for the scalable MPEG advanced audio coder,” in Proc. ICASSP, vol. 2, May 2002, pp. 1833-1836.
[16] I. Dimković, et al., “Fast software implementation of MPEG advanced audio encoder,” in IEEE Int. Conf. Digital Signal Processing, vol. 2, July 2002, pp. 839-843.
[17] http://www.audiocoding.com
[18] C. Y. Lee, et al., “Efficient AAC single layer transcoder,” AES 117th Convention, San Francisco, Oct. 2004.
[19] Y. Nakajima, et al., “MPEG audio bit rate scaling on coded data domain,” in Proc. ICASSP, vol. 6, May 1998, pp. 3669-3672.
[20] M. Hans, et al., “An MPEG audio layered transcoder,” AES 105th Convention, San Francisco, Aug. 1998 (preprint 4812).
[21] European Broadcasting Union, “Sound Quality Assessment Material: Recordings for Subjective Tests,” Brussels, Belgium, Apr. 1988.
[22] Draft ITU-R Recommendation BS.1387, “Method for Objective Measurements of Perceived Audio Quality,” July 2001.
[23] A. Lerchs, “EAQUAL software, Version 0.1.3 alpha,” http://www.mp3-tech.org
Vita
Te-Hsueh Lai (賴德亘) was born in Taipei in 1981. He received the BS degree from the Department of Electronics Engineering, National Chiao Tung University (NCTU), Hsinchu, Taiwan, in 2004, and then entered the Institute of Electronics at the same university to pursue the MS degree. His thesis research topics are audio compression and bitrate transcoding, and his current research interests are audio transcoders and audio streaming.