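References - A Study of Tone Feature Extraction Techniques and Their Application to Mandarin Tone Recognition (聲調特徵擷取技術與其在中文聲調辨識應用之研究)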

[1] 林燾 and 王理嘉, 語音學教程 [A Course in Phonetics]. Taipei: 五南圖書出版有限公司, 1995.

[2] Dinoj Surendran and Gina-Anne Levow, "Can voice quality improve Mandarin tone recognition?," in Proc. ICASSP, Las Vegas, pp. 4177-4180, 2008.

[3] Ruo-Xiao Yang, "The phonation factor in the categorical perception of Mandarin tones," in Proc. ICPhS XVII, Hong Kong, 2011.

[4] David Talkin, "A robust algorithm for pitch tracking (RAPT)," in Speech Coding and Synthesis. Elsevier Science, 1995, pp. 495-518.

[5] Paul Boersma, "Praat, a system for doing phonetics by computer," Glot International, vol. 5, no. 9/10, pp. 341-345, Jun 2001.

[6] 趙元任 (Yuen Ren Chao), 中國話的文法 [A Grammar of Spoken Chinese]. Taipei: 敦煌書局, 1981.

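[7] 古鴻炎, 張小芬, and 吳俊欣, "仿趙氏音高尺度之基週軌跡正規化方法及其應用 [A pitch-contour normalization method modeled on Chao's pitch scale and its applications]," in Proc. 16th Conference on Computational Linguistics and Speech Processing (第十六屆自然語言與語音處理研討會), Taipei, 2004.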

[8] Si Wei, Hai-Kun Wang, Qing-Sheng Liu, and Ren-Hua Wang, "CDF-matching for automatic tone error detection in Mandarin CALL system," in Proc. ICASSP, vol. 4, Honolulu, pp. IV-205-IV-208, 2007.

[9] Yow-Bang Wang and Lin-Shan Lee, "Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram," in Proc. INTERSPEECH, Makuhari, Chiba, Japan, pp. 2850-2853, 2010.

[10] Lawrence R. Rabiner, Michael J. Cheng, Aaron E. Rosenberg, and Carol A. McGonegal, "A Comparative Performance Study of Several Pitch Detection Algorithms," IEEE Trans. Acoust., Speech, Signal Process., vol. 24, no. 5, pp. 399-418, Oct 1976.

[11] Lawrence R. Rabiner and Ronald W. Schafer, Digital Processing of Speech Signals. Pearson Education, 1978.

[12] B. Gold and Lawrence R. Rabiner, "Parallel Processing Techniques for Estimating Pitch Periods of Speech in the Time Domain," Journal of the Acoustical Society of America, vol. 46, no. 2B, pp. 442-448, August 1969.

[13] Man Mohan Sondhi, "New methods of pitch extraction," IEEE Trans. Audio Electroacoust., vol. 16, no. 2, pp. 262-266, Jun 1968.

[14] John D. Markel, "The SIFT algorithm for fundamental frequency estimation," IEEE Trans. Audio Electroacoust., vol. 20, no. 5, pp. 367-377, Dec 1972.

[15] John J. Dubnowski, Ronald W. Schafer, and Lawrence R. Rabiner, "Real-time digital hardware pitch detector," IEEE Trans. Acoust., Speech, Signal Process., vol. 24, no. 1, pp. 2-8, Feb 1976.

[16] Byung Suk Lee and Daniel P. W. Ellis, "Noise robust pitch tracking by subband autocorrelation classification," in Proc. INTERSPEECH, Portland, Oregon, USA, 2012.

[17] Jinsong Zhang and Keikichi Hirose, "Tone nucleus modeling for Chinese lexical tone recognition," Speech Communication, vol. 42, no. 3-4, pp. 447-466, Apr. 2004.

[18] Jin-Song Zhang, Satoshi Nakamura, and Keikichi Hirose, "Tone nucleus-based multi-level robust acoustic tonal modeling of sentential F0 variations for Chinese continuous speech tone recognition," Speech Communication, vol. 46, no. 3-4, pp. 440-454, July 2005.

[19] Hiroya Fujisaki, Keikichi Hirose, Pierre Halle, and Haitao Lei, "Analysis and modeling of tonal features in polysyllabic words and sentences of the Standard Chinese," in Proc. ICSLP, Kobe, pp. 841-844, 1990.

[20] Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, July 2006.

[21] Geoffrey Everest Hinton and Ruslan Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, pp. 504-507, July 2006.

[22] Neville Ryant, Jiahong Yuan, and Mark Liberman, "Mandarin tone classification without pitch tracking," in Proc. ICASSP, Florence, Italy, pp. 4868-4872, 2014.

[23] Ye Tian, Jian-Lai Zhou, Min Chu, and Eric Chang, "Tone recognition with fractionized models and outlined features," in Proc. ICASSP, Montreal, 2004.

[24] Lawrence R. Rabiner, "On the use of autocorrelation analysis for pitch detection," IEEE Trans. Acoust., Speech, Signal Process., vol. 25, no. 1, pp. 24-33, Feb. 1977.

[25] M. Ross, H. Shaffer, A. Cohen, R. Freudberg, and H. Manley, "Average magnitude difference function pitch extractor," IEEE Trans. Acoust., Speech, Signal Process., vol. 22, no. 5, pp. 353-362, 1974.

[26] Pegah Ghahremani et al., "A Pitch Extraction Algorithm Tuned for Automatic Speech Recognition," in Proc. ICASSP, Florence, 2014.

[27] Fumitada Itakura, "Minimum prediction residual principle applied to speech recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. 23, no. 1, pp. 67-72, Feb. 1975.

[28] Steven B. Davis and Paul Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 4, pp. 357-366, Aug 1980.

[29] Chang-Han Hank Huang and Frank Seide, "Pitch tracking and tone features for Mandarin speech recognition," in Proc. ICASSP, Istanbul, 2000.

[30] Lei He and Jie Hao, "A tone recognition framework for continuous Mandarin speech," in Proc. ICSLP, Pittsburgh, Pennsylvania, pp. 1575-1578, 2006.

[31] F. Plante, G. F. Meyer, and W. A. Ainsworth, "A pitch extraction reference database," in Proc. EUROSPEECH, Madrid, 1995.

[32] Kornel Laskowski, Matthias Wölfel, Mattias Heldner, and Jens Edlund, "Computing the fundamental frequency variation spectrum in conversational spoken dialogue systems," in Proc. Acoustics, Paris, pp. 3305-3310, 2008.

[33] Kornel Laskowski, Mattias Heldner, and Jens Edlund, "The fundamental frequency variation spectrum," in Proc. Fonetik, Gothenburg, pp. 29-32, 2008.

[34] Hao Chao, Zhanlei Yang, and Wenju Liu, "Improved tone modeling by exploiting articulatory features for Mandarin speech recognition," in Proc. ICASSP, Kyoto, pp. 4741-4744, 2012.

[35] Pui-Fung Wong and Man-Hung Siu, "Decision tree based tone modeling for Chinese speech recognition," in Proc. ICASSP, vol. 1, 2004.

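[36] 熊玉雯 and 宋曜廷, "華語學習者語音語料庫之建置與錯誤分析 [Construction and error analysis of a speech corpus of Mandarin learners]," in Workshop on Linguistic Feature Analysis (語言特徵分析工作坊), Taipei, 2014.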

[37] Hsin-Min Wang, Berlin Chen, Jen-Wei Kuo, and Shih-Sian Cheng, "MATBN: A Mandarin Chinese broadcast news corpus," International Journal of Computational Linguistics & Chinese Language Processing, vol. 10, no. 2, pp. 219-236, June 2005.

[38] Malcolm Slaney, Elizabeth Shriberg, and Jui-Ting Huang, "Pitch-gesture modeling using subband autocorrelation change detection," in Proc. INTERSPEECH, Lyon, pp. 1911-1915, 2013.

[39] Paul Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proc. IFA, Amsterdam, pp. 97-110, 1993.

[40] Arturo Camacho, SWIPE: A sawtooth waveform inspired pitch estimator for speech and music. Gainesville: University of Florida, 2007.

[41] Speech Signal Processing Toolkit (SPTK). [Online]. http://sp-tk.sourceforge.net/

[42] Daniel Povey et al., "The Kaldi speech recognition toolkit," in Proc. ASRU, Hawaii, 2011.

[43] Chih-Chung Chang and Chih-Jen Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, pp. 1-27, April 2011.

[44] Jinsong Zhang and Keikichi Hirose, "Tone nucleus modeling for Chinese lexical tone recognition," Speech Communication, vol. 42, no. 3-4, pp. 447-466, Apr. 2004.

[45] Shilei Zhang, Shi Qin, Stephen M. Chu, and Yong Qin, "Main vowel domain tone modeling with lexical and prosodic analysis for Mandarin ASR," in Proc. ICASSP, Taipei, pp. 4561-4564, 2009.

[46] Lei He and Jie Hao, "A tone recognition framework for continuous Mandarin speech," in Proc. INTERSPEECH, Pittsburgh, pp. 1575-1578, 2006.

[47] Dinoj Surendran and Gina-Anne Levow, "Can voice quality improve Mandarin tone recognition?," in Proc. ICASSP, pp. 4177-4180, 2008.
