
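This thesis proposes and implements an automatic discourse tagging system for Chinese. Analysis of the experimental data shows that the system correctly tags nine types of discourse coherence relations, including coordination, progression, and transition.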

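From research and design through implementation, this thesis yields the following main results and contributions:

1. We studied and observed the discourse cue words in the corpus, so that computer science researchers can better understand how cue words are distributed in real text. This supports subsequent research.

2. We used the distributional properties of discourse for preliminary mining and evaluated the performance of our proposed extraction algorithm, confirming that it does extract usable discourse cue words.

3. Using an extensible rule module, we tagged nine types of discourse coherence relations in a real corpus, which saves much of the time that manual annotation would require.

4. We completed a prototype of the automatic Chinese discourse tagging system. It can be applied in an automatic Chinese essay grading system to assess how well students use discourse structure.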
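As an illustration of contribution 3, the following Python sketch shows one way an extensible rule module could map cue words to coherence relations. The `Rule` class, the `RULES` table, the `tag` function, the English relation labels, and the specific cue words are all assumptions made for this example. They are not the rule set or lexicon of the system described in this thesis, and only three of the nine relations are represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    relation: str  # label of the coherence relation (hypothetical naming)
    cues: tuple    # cue words that must all appear, in this order


# Hypothetical rule table; the cue words are illustrative only.
RULES = [
    Rule("progression", ("不但", "而且")),
    Rule("transition", ("雖然", "但是")),
    Rule("transition", ("然而",)),
    Rule("coordination", ("同時",)),
]


def matches(rule, sentence):
    """Return True if every cue of the rule occurs in the sentence, in order."""
    pos = 0
    for cue in rule.cues:
        idx = sentence.find(cue, pos)
        if idx < 0:
            return False
        pos = idx + len(cue)
    return True


def tag(sentence, rules=RULES):
    """Return the relation labels whose rules match, without duplicates."""
    found = []
    for rule in rules:
        if matches(rule, sentence) and rule.relation not in found:
            found.append(rule.relation)
    return found
```

In this sketch a new relation is supported by appending another `Rule` to the table, so the matching code itself does not change. Plain substring matching ignores word segmentation and ambiguous cue words, which a real tagger would need to handle.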

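Follow-up research on this thesis can proceed in the following directions:

1. Automatic expansion of unknown cue words

Synonyms or near-synonyms, combined with link strength, can be used to extract more cue words automatically and thereby improve the system's data coverage.

2. Research on auxiliary features

Because discourse structure is sometimes very complex, more auxiliary features are needed to help the system tag discourse.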

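3. Definition and study of unknown discourse relations

Further discourse relations that are not yet covered can be defined and studied, which would improve the system's data coverage.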

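4. Building an automatic discourse recognition model

Machine learning and a semantic concept network can be used to help the system recognize semantic transitions, and statistical models can be used to recognize discourse relations automatically.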
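The first direction above could be prototyped roughly as follows. This is a minimal sketch, assuming a hypothetical synonym table and a small corpus of tokenized sentences already labeled with a relation. It uses pointwise mutual information (PMI) between a candidate word and a relation label as a stand-in for link strength, which may differ from the measure intended in this thesis. All names (`pmi_table`, `expand`, `threshold`) are illustrative.

```python
import math
from collections import Counter


def pmi_table(corpus):
    """Compute PMI(word, relation) from a list of (tokens, relation) pairs.

    Each sentence counts a word at most once, so probabilities are
    estimated over sentences rather than over tokens.
    """
    n = len(corpus)
    word_count, rel_count, joint = Counter(), Counter(), Counter()
    for tokens, relation in corpus:
        rel_count[relation] += 1
        for word in set(tokens):
            word_count[word] += 1
            joint[(word, relation)] += 1
    # PMI = log2( P(w, r) / (P(w) * P(r)) ) = log2( c * n / (count_w * count_r) )
    return {
        (word, rel): math.log2(c * n / (word_count[word] * rel_count[rel]))
        for (word, rel), c in joint.items()
    }


def expand(seeds, synonyms, corpus, threshold=0.5):
    """Propose new cue words for the relations of known seed cue words.

    seeds:    {cue_word: relation} for cue words already in the lexicon
    synonyms: {cue_word: [candidate, ...]} (hypothetical thesaurus lookup)
    A candidate is accepted when its PMI with the seed's relation
    reaches the threshold; unseen candidates are rejected.
    """
    pmi = pmi_table(corpus)
    added = {}
    for cue, relation in seeds.items():
        for cand in synonyms.get(cue, []):
            score = pmi.get((cand, relation), float("-inf"))
            if cand not in seeds and score >= threshold:
                added[cand] = relation
    return added
```

The threshold trades coverage against precision: a lower value admits more candidate cue words but also more words that merely co-occur with a relation by chance.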

