6.2 Future Work

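Four topics that extend this study merit further investigation. First, analysis of the recognition results shows that the word error rate is strongly affected by OOVs, and most of these OOVs are personal names, so a language model dedicated to personal names could be added to address the problem. Second, this study has not yet used high-level linguistic features such as affixes, phrases, or syntax. The prosodic model could be redesigned around these features, and once it is integrated into the recognition system it is expected not only to improve performance further but also to decode the syntactic structure of each test utterance. Third, the approach could be applied to other languages (for example, English). Fourth, this study recognizes only read speech; extending it to spontaneous speech, which is closer to everyday use, would be another important contribution to speech recognition research.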

Appendix 1: Question Set for the Decision Trees

The question set used to construct the decision trees for building the break-syntax model $P(B_n \mid l_n)$ and $P(pd_n, ed_n, pj_n, dl_n, df_n \mid B_n, l_n)$ is listed below:

Is the inter-syllable location an utterance boundary?
Is the inter-syllable location an inter-word location?
Does a PM exist at the inter-syllable location?
Does a Major PM exist at the inter-syllable location?
Does a "。" exist at the inter-syllable location?
Does a "," exist at the inter-syllable location?
Does a "、" exist at the inter-syllable location?
Does a "." exist at the inter-syllable location?
Does a ";" exist at the inter-syllable location?
Does a ":" exist at the inter-syllable location?
Does a "?" exist at the inter-syllable location?
Does a "!" exist at the inter-syllable location?
Does a "(" exist at the inter-syllable location?
Does a ")" exist at the inter-syllable location?

Is the preceding word one of the special prefix words + special 1-syllable words (Ng, Ncd, Di, DE, I, T)?
Is the POS of the preceding word A?
Is the POS of the preceding word C?
Is the POS of the preceding word D?
Is the POS of the preceding word N?
Is the POS of the preceding word I or T?
Is the POS of the preceding word P?
Is the POS of the preceding word V?
Is the POS of the preceding word DE?
Is the POS of the preceding word SHI?
Is the POS of the preceding word FW?
Is the POS of the preceding word DM?
Is the POS of the preceding word one of Da, Di, Dk, D?
Is the POS of the preceding word Dfa?
Is the POS of the preceding word Dfb?
Is the POS of the preceding word one of Na, Nb, Nc, Nv?
Is the POS of the preceding word Nd?
Is the POS of the preceding word one of Neu, Nes, Nep, Neqa, Neqb, Nf?
Is the POS of the preceding word one of Ng, Ncd?
Is the POS of the preceding word Nh?
Is the POS of the preceding word one of VA, VAC, VG?
Is the POS of the preceding word one of VB, VC, VCL, VD, VE, VF, VJ, VK, VL?
Is the POS of the preceding word one of VH, VHC, VI?
Is the POS of the preceding word V_2?
Is the POS of the preceding word Caa?
Is the POS of the preceding word Cab?
Is the POS of the preceding word Cba?
Is the POS of the preceding word Cbb?
Is the POS of the preceding word Da?
Is the POS of the preceding word Di?
Is the POS of the preceding word Dk?
Is the POS of the preceding word D?
Is the POS of the preceding word Na?
Is the POS of the preceding word Nb?
Is the POS of the preceding word Nc?
Is the POS of the preceding word Ncd?
Is the POS of the preceding word Neu?
Is the POS of the preceding word Nes?
Is the POS of the preceding word Nep?
Is the POS of the preceding word Neqa?
Is the POS of the preceding word Neqb?
Is the POS of the preceding word Nf?
Is the POS of the preceding word Ng?
Is the POS of the preceding word Nv?
Is the POS of the preceding word I?
Is the POS of the preceding word T?
Is the POS of the preceding word VA?
Is the POS of the preceding word VAC?
Is the POS of the preceding word VB?
Is the POS of the preceding word VC?
Is the POS of the preceding word VCL?
Is the POS of the preceding word VD?
Is the POS of the preceding word VE?
Is the POS of the preceding word VF?
Is the POS of the preceding word VG?
Is the POS of the preceding word VH?
Is the POS of the preceding word VHC?
Is the POS of the preceding word VI?
Is the POS of the preceding word VJ?
Is the POS of the preceding word VK?
Is the POS of the preceding word VL?
Is the length of the preceding word 1?
Is the length of the preceding word 2?
Is the length of the preceding word 3?
Is the length of the preceding word 4?
Is the length of the preceding word 5?
Is the length of the preceding word 6?
Is the length of the preceding word less than 2?
Is the length of the preceding word less than 3?
Is the length of the preceding word less than 4?
Is the length of the preceding word less than 5?
Is the length of the preceding word less than 6?

Is the following word one of the special 1-syllable words (Ng, Ncd, Di, DE, I, T) + special suffix words?
Is the POS of the following word A?
Is the POS of the following word C?
Is the POS of the following word D?
Is the POS of the following word N?
Is the POS of the following word I or T?
Is the POS of the following word P?
Is the POS of the following word V?
Is the POS of the following word DE?
Is the POS of the following word SHI?
Is the POS of the following word FW?
Is the POS of the following word DM?
Is the POS of the following word one of Da, Di, Dk, D?
Is the POS of the following word Dfa?
Is the POS of the following word Dfb?
Is the POS of the following word one of Na, Nb, Nc, Nv?
Is the POS of the following word Nd?
Is the POS of the following word one of Neu, Nes, Nep, Neqa, Neqb, Nf?
Is the POS of the following word one of Ng, Ncd?
Is the POS of the following word Nh?
Is the POS of the following word one of VA, VAC, VG?
Is the POS of the following word one of VB, VC, VCL, VD, VE, VF, VJ, VK, VL?
Is the POS of the following word one of VH, VHC, VI?
Is the POS of the following word V_2?
Is the POS of the following word Caa?
Is the POS of the following word Cab?
Is the POS of the following word Cba?
Is the POS of the following word Cbb?
Is the POS of the following word Da?
Is the POS of the following word Di?
Is the POS of the following word Dk?
Is the POS of the following word D?
Is the POS of the following word Na?
Is the POS of the following word Nb?
Is the POS of the following word Nc?
Is the POS of the following word Ncd?
Is the POS of the following word Neu?
Is the POS of the following word Nes?
Is the POS of the following word Nep?
Is the POS of the following word Neqa?
Is the POS of the following word Neqb?
Is the POS of the following word Nf?
Is the POS of the following word Ng?
Is the POS of the following word Nv?
Is the POS of the following word I?
Is the POS of the following word T?
Is the POS of the following word VA?
Is the POS of the following word VAC?
Is the POS of the following word VB?
Is the POS of the following word VC?
Is the POS of the following word VCL?
Is the POS of the following word VD?
Is the POS of the following word VE?
Is the POS of the following word VF?
Is the POS of the following word VG?
Is the POS of the following word VH?
Is the POS of the following word VHC?
Is the POS of the following word VI?
Is the POS of the following word VJ?
Is the POS of the following word VK?
Is the POS of the following word VL?
Is the length of the following word 1?
Is the length of the following word 2?
Is the length of the following word 3?
Is the length of the following word 4?
Is the length of the following word 5?
Is the length of the following word 6?
Is the length of the following word less than 2?
Is the length of the following word less than 3?
Is the length of the following word less than 4?
Is the length of the following word less than 5?
Is the length of the following word less than 6?

Is the initial of the following syllable a null one or in {m, n, l, r}?
Is the initial of the following syllable a null one or in {b, d, g}?
Is the initial of the following syllable a null one or in {f, s, sh, h}?
Is the initial of the following syllable a null one or in {c, ch, q}?
Is the initial of the following syllable a null one or in {p, t, k}?
Is the initial of the following syllable a null one or in {z, zh, j}?
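
As an illustration only, the sketch below shows one way a question set of this kind could be encoded as yes/no predicates over the context of an inter-syllable location, giving a binary answer vector that a decision-tree learner could split on. It is not the implementation used in this work. The `Juncture` record, its field names, the `QUESTIONS` table, and `answer_all` are all hypothetical, only a handful of the questions above are included, and word length is assumed to be counted in syllables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Juncture:
    """Hypothetical context of one inter-syllable location."""
    is_utterance_boundary: bool
    is_inter_word: bool
    punctuation: str    # "" when no PM is present at this location
    prev_pos: str       # POS tag of the preceding word, e.g. "Na"
    prev_len: int       # length of the preceding word (assumed: syllables)
    next_pos: str       # POS tag of the following word
    next_len: int       # length of the following word (assumed: syllables)
    next_initial: str   # initial of the following syllable, "" if null


# A question maps a juncture to a yes/no answer.
Question = Callable[[Juncture], bool]

# A small, illustrative subset of the question set listed above.
QUESTIONS: Dict[str, Question] = {
    "utterance boundary": lambda j: j.is_utterance_boundary,
    "inter-word": lambda j: j.is_inter_word,
    "PM exists": lambda j: j.punctuation != "",
    "prev POS in {Na, Nb, Nc, Nv}":
        lambda j: j.prev_pos in {"Na", "Nb", "Nc", "Nv"},
    "prev length < 3": lambda j: j.prev_len < 3,
    "next POS is DE": lambda j: j.next_pos == "DE",
    "next initial null or in {m, n, l, r}":
        lambda j: j.next_initial in {"", "m", "n", "l", "r"},
}


def answer_all(j: Juncture) -> List[int]:
    """Return the 0/1 answers to every question, in table order."""
    return [int(q(j)) for q in QUESTIONS.values()]


if __name__ == "__main__":
    # Made-up juncture: inside an utterance, between a 2-syllable Na word
    # and a 1-syllable DE word whose syllable starts with "d".
    example = Juncture(False, True, "", "Na", 2, "DE", 1, "d")
    for name, answer in zip(QUESTIONS, answer_all(example)):
        print(f"{name}: {answer}")
```

Keeping each question as a named predicate makes it straightforward to add the remaining questions to the table without touching the code that evaluates them.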
