
Conclusion and Future Work

In this thesis, we describe a web-based approach to proper noun translation that mines search-result pages. First, we add expansion terms to the source query and retrieve snippets with the expanded query. Second, we extract the translation candidate strings that match the surface patterns and generate translation candidates from them. Finally, we rank the translation candidates with the proposed formula. The experiments show that our approach performs well at finding translations of proper nouns from Web resources. We summarize the contributions as follows:

1. We integrate several improved techniques to enhance the efficiency of proper noun translation. The experiments show that the proposed method performs well.

2. We propose a statistical web-based query expansion method. Most proper nouns are out-of-vocabulary terms, so the proposed web-based method generates expansion terms without depending on scarce lexical resources.

3. We propose a new formula for ranking translation candidates on the basis of word length, word frequency, and distance distribution.
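To make the third contribution concrete, the following is a minimal sketch of how a ranking score could combine the three signals named above. It is not the formula proposed in this thesis: the function names `score_candidate` and `rank_candidates`, the logarithmic weighting, and the use of character offsets as the distance measure are all assumptions chosen only for illustration.

```python
import math


def score_candidate(candidate, source, snippets):
    """Toy score (assumed, not the thesis formula): grows with candidate
    frequency and word length, shrinks with distance to the source term."""
    freq = 0
    distances = []
    for snippet in snippets:
        c_pos = snippet.find(candidate)
        if c_pos < 0:
            continue  # candidate does not occur in this snippet
        freq += 1
        s_pos = snippet.find(source)
        if s_pos >= 0:
            # Character offset between source term and candidate.
            distances.append(abs(c_pos - s_pos))
    if freq == 0 or not distances:
        return 0.0
    mean_distance = sum(distances) / len(distances)
    length = len(candidate.split())  # word length, counted in tokens
    return (math.log(1 + freq) * math.log(1 + length)
            / (1 + math.log(1 + mean_distance)))


def rank_candidates(candidates, source, snippets):
    """Return candidates sorted from highest to lowest toy score."""
    return sorted(candidates,
                  key=lambda c: score_candidate(c, source, snippets),
                  reverse=True)
```

A candidate that appears often and close to the source term, such as a parenthesized translation right after it, is ranked above one that appears once and far away.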

5.2 Future Work

Future work will focus on sub-query translation. This approach may handle the errors caused by too few returned snippets or insufficient bilingual information, especially for long queries. For example, when we submit the source query “李登輝學校” to a search engine, we cannot obtain its translation, “Lee Teng-Hui Academy”, from the returned snippets. Therefore, we will segment the source query into “李登輝” and “學校” and translate them separately. Finally, we can apply word sense disambiguation techniques and composition rules to merge the partial translations into the final answer.
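The sub-query idea can be sketched as follows. This is only an illustration under stated assumptions: the `lexicon` dictionary stands in for translations that would actually be mined from the Web, greedy longest-match segmentation is one simple choice among many, and the composition rule (concatenation in source order) is hypothetical. Word sense disambiguation is left out, so the sketch returns every combination rather than a single answer.

```python
from itertools import product


def segment(query, known_terms):
    """Greedy longest-match segmentation against a collection of known terms."""
    parts = []
    i = 0
    while i < len(query):
        for j in range(len(query), i, -1):
            if query[i:j] in known_terms:
                parts.append(query[i:j])
                i = j
                break
        else:
            # No known term starts here: keep the single character as a segment.
            parts.append(query[i])
            i += 1
    return parts


def compose_candidates(query, lexicon):
    """Translate the whole query if possible, else combine sub-query translations."""
    if query in lexicon:
        return list(lexicon[query])
    parts = segment(query, lexicon)
    # Untranslatable segments are passed through unchanged.
    options = [lexicon.get(part, [part]) for part in parts]
    return [" ".join(combo) for combo in product(*options)]


# Hypothetical toy lexicon; real entries would come from sub-query mining.
lexicon = {"李登輝": ["Lee Teng-Hui"], "學校": ["School", "Academy"]}
candidates = compose_candidates("李登輝學校", lexicon)
```

Here `candidates` contains both "Lee Teng-Hui School" and "Lee Teng-Hui Academy"; choosing between them is the job of the disambiguation step described above.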

