Chapter 4 Open-Domain Question Answering on Heterogeneous Data

6. QA on Video Films

6.3 QA Experiment on Video OCR

The test data come from 19 Discovery programs, about 298KB in total, which are spoken in English and carry Chinese captions. Each program is about 1 hour long.

In total, 15,353 lines of captions are extracted, an average of 808 lines per program.

A basic information unit of a film is treated as a passage for extracting answers.

In our experiments, passages are segmented wherever a pause lasts longer than 5 seconds.

This yields 3,876 passages in total, an average of 204 passages per program.
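The pause-based segmentation described above can be sketched as follows. This is only an illustration under stated assumptions: the `Caption` record, its `start`/`end` timestamps, and the choice to measure a pause as the gap between the end of one caption and the start of the next are hypothetical, since the text gives only the 5-second threshold.

```python
from dataclasses import dataclass
from typing import List

PAUSE_THRESHOLD = 5.0  # seconds; the threshold stated in the experiments


@dataclass
class Caption:
    """Hypothetical caption record: display interval (seconds) and OCR text."""
    start: float
    end: float
    text: str


def segment_passages(captions: List[Caption],
                     pause_threshold: float = PAUSE_THRESHOLD) -> List[List[Caption]]:
    """Group time-ordered captions into passages.

    A new passage starts whenever the gap between the previous caption's end
    and the next caption's start is strictly longer than the threshold.
    (Measuring the pause this way is an assumption, not the paper's definition.)
    """
    passages: List[List[Caption]] = []
    current: List[Caption] = []
    for cap in captions:
        if current and cap.start - current[-1].end > pause_threshold:
            passages.append(current)
            current = []
        current.append(cap)
    if current:
        passages.append(current)
    return passages


# Toy input: gaps of 0.5 s, 6.0 s, exactly 5.0 s, and 12.0 s.
captions = [
    Caption(0.0, 2.0, "a"),
    Caption(2.5, 4.0, "b"),
    Caption(10.0, 12.0, "c"),
    Caption(17.0, 18.0, "d"),
    Caption(30.0, 31.0, "e"),
]
passages = segment_passages(captions)
passage_texts = [[c.text for c in p] for p in passages]
```

A gap of exactly 5 seconds does not split a passage here, because the text specifies pauses longer than 5 seconds.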

A total of 39 questions are collected from the web site of Discovery Channel (http://chinese.discovery.com/sch/), where they appear as links in the program list page. The site offers questions for each program for educational purposes.

The QA results are not good. The original model answered 3 of the 39 questions, while the OCR-similarity-integrated model answered one more. The main reason is that these questions are quite hard in terms of question level (Moldovan, et al., 2000).

7. Conclusion

This paper sketches a new view of question answering on heterogeneous data.

Table 3 compares the heterogeneous data in the QA task. After defining information passages and similarity measurement, our QA system is capable of handling data consisting of plain texts, summaries, HTML documents with tables, and videos.

Table 3. Comparison of Heterogeneous Data

|              | Plain text        | Summary           | Table                       | Video                               |
|              | Document          | Document          | Document and Table          | Film, Captions                      |
|              | Sentence, Passage | Sentence, Passage | Interpretation, Value-Cells | Film Fragment Divided by Pause      |
|              | Lexical Matching  | Lexical Matching  | Lexical Matching            | Lexical Matching and OCR Similarity |
| Presented as | Text              | Text              | Text or Tables              | Film Fragment                       |

There are several interesting future directions, for example, how query-based summarization can help a QA task, how to integrate the context of tables, and so on.

In addition, background linguistic technologies for OCR texts, such as word segmentation, IR, and named entity extraction, have to be redefined.

Appendix A.

(1) Question Foci

PERSON, LOCATION, TIME, QUANTITY, SELECTION, METHOD, DESCRIPTION, REASON, and OBJECT.

(2) Hand-Tagged Questions

These are some examples of hand-tagged questions for training Question-Focus decision rules. Boxed texts are question words. A question focus is given in front of each question, and is printed in bold.

LOCATION
(Where is Grass Valley?)

TIME
(When did Taiwan history start?)

METHOD
(How to improve the absorption of Calcium?)

Appendix B.

Question Focus Decision Rules

These are some examples of Question-Focus decision rules. “Term” is the question word found in the sentence, and TermNext (TermPrev) is the term following (preceding) the question word.

Rule 3: Term = (where) -> class LOCATION

Rule 17: Term = (who) -> class PERSON

Rule 21: Term = (how), TermNext = (to come, to do) -> class METHOD
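As a rough illustration of how rules of this form might be applied, the sketch below encodes the three example rules as data and scans a segmented question for a matching question word. Several things here are assumptions: the English glosses stand in for the original question words, the input is taken to be already word-segmented, and the `RULES` table, `decide_focus` function, and first-match policy are hypothetical rather than the system's actual procedure.

```python
from typing import List, Optional, Set, Tuple

# (rule number, Term, allowed TermNext values or None for "any", question focus)
# Only the three example rules are encoded; English glosses replace the
# original question words (an assumption made for readability).
RULES: List[Tuple[int, str, Optional[Set[str]], str]] = [
    (3, "where", None, "LOCATION"),
    (17, "who", None, "PERSON"),
    (21, "how", {"to come", "to do"}, "METHOD"),
]


def decide_focus(terms: List[str]) -> Optional[str]:
    """Return the focus of the first rule that fires, or None if none does.

    `terms` is the segmented question; Term is the current word and
    TermNext is the word that follows it, if any.
    """
    for i, term in enumerate(terms):
        term_next = terms[i + 1] if i + 1 < len(terms) else None
        for _rule_no, word, next_terms, focus in RULES:
            if term == word and (next_terms is None or term_next in next_terms):
                return focus
    return None


focus_where = decide_focus(["where", "is", "Grass Valley"])
focus_who = decide_focus(["who", "discovered", "it"])
focus_how = decide_focus(["how", "to do", "this"])
focus_none = decide_focus(["how", "many", "people"])
```

Keeping the rules as data rather than as nested conditionals makes it easy to add further rules, including ones that test TermPrev, without changing the matching loop.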

Appendix C.

Chinese Questions for Experiments on Plain Text and Summarization.

Q1. (Where does the first sunlight shine on China?)

Q77. (What kind of game is “The Hero”?)

Q280. (What will be the star industry in the 21st century?)

References

Chang, C.Y. (1997) A Discourse Analysis of Questions in Mandarin Conversation, Master Thesis, National Taiwan University, June 1997.

Chen, H.H. and Huang, S.J. (1999) “A Summarization System for Chinese News from Multiple Sources,” Proceedings of 4th IRAL, Taiwan, pp. 1-7, 1999.

Chen, H.H., Tsai, S.C., and Tsai, J.H. (2000) “Mining Tables from Large Scale HTML Texts,” Proceedings of 18th COLING, pp. 166-172, 2000.

Chen, H.H. and Lin, C.J. (2000) “A Multilingual News Summarizer,” Proceedings of 18th COLING, pp. 159-165, 2000.

Chen, K.J., Huang, C.R., Chang, L.P., and Hsu, H.L. (1996) “Sinica Corpus: Design Methodology for Balanced Corpora,” Proceedings of PACLIC 11, pp. 167-176, 1996.

Pu-Jen Cheng, Jei-Wen Teng, Ruei-Cheng Chen, Jenq-Haur Wang, Wen-Hsiang Lu, and Lee-Feng Chien (2004) “Translating Unknown Queries with Web Corpora for Cross-Language Information Retrieval,” Proceedings of the 27th ACM-SIGIR, pp. 146-153.

Christiane Fellbaum (Ed.) (1998) WordNet: An Electronic Lexical Database, The MIT Press, 1998.

Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus, and Paul Morarescu (2001) “The Role of Lexico-Semantic Feedback in Open-Domain Textual Question Answering,” Proceedings of the 39th ACL and 10th EACL, pp. 274-281, 2001.

Sanda Harabagiu, Marius Pasca, and Steve Maiorano (2000) “Experiments with Open-Domain Textual Question Answering,” Proceedings of the 18th COLING, pp. 292-298, 2000.

Lynette Hirschman and R. Gaizauskas (2001) “Natural Language Question Answering: the View from Here,” Natural Language Engineering, Cambridge University Press, Vol. 7, No. 4, pp. 275-300, 2001.

Eduard Hovy, Ulf Hermjakob, and Chin-Yew Lin (2001) “The Use of External Knowledge in Factoid QA,” Proceedings of TREC 2001, pp. 644-652, 2001.

Hovy, E. and Marcu, D. (1998a) Automated Text Summarization, Tutorial in 17th COLING-ACL, Montreal, Quebec, Canada, 1998.

Hovy, E. and Marcu, D. (1998b) Multilingual Text Summarization, Tutorial in AMTA-98, 1998.

Hurst, M. (1999) “Layout and Language: A Corpus of Documents Containing Tables,” Proceedings of AAAI Fall Symposium, 1999.

Hurst, M. and Douglas, S. (1997) “Layout and Language: Preliminary Experiments in Assigning Logical Structure to Table Cells,” Proceedings of ANLP '97, pp. 217-220, 1997.

Lin, C.J. and Chen, H.H. (2000) “Description of NTU System at TREC-9 QA Track,” Proceedings of The Ninth Text REtrieval Conference (TREC-9), pp. 389-406, 2000.

Liu, C.C. (2001) Video OCR and Video Search, Master Thesis, National Taiwan University, 2001.

Mani, I. and Bloedorn, E. (1997) “Multi-document Summarization by Graph Search and Matching,” Proceedings of 4th National Conference on Artificial Intelligence, Providence, pp. 622-628.

Moldovan, D., Harabagiu, S., Pasca, M., Mihalcea, R., Girju, R., Goodrum, R., and Rus, V. (2000) “The Structure and Performance of an Open-Domain Question Answering System,” Proceedings of 38th ACL, pp. 563-570, October 2000.

Ng, H.T., Lim, C.Y., and Koo, J.L.T. (1999) “Learning to Recognize Tables in Free Text,” Proceedings of 37th ACL, pp. 443-450, 1999.

Quinlan, J.R. (1993) C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.

Deepak Ravichandran and Eduard Hovy (2002) “Learning Surface Text Patterns for a Question Answering System,” Proceedings of ACL, 2002.

Radev, D.R. and McKeown, K.R. (1998) “Generating Natural Language Summaries from Multiple On-Line Sources,” Computational Linguistics, Vol. 24, No. 3, pp. 469-500, 1998.

Sato, T., Kanade, T., Hughes, E.K., Smith, M.A., and Satoh, S. (1999) “Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions,” Multimedia Systems, Vol. 7, pp. 385-394, 1999.

Singhal, A., Abney, S., Bacchiani, M., Collins, M., Hindle, D., and Pereira, F. (1999) “AT&T at TREC-8,” Proceedings of TREC 8, Gaithersburg, pp. 317-330, November 1999.

M. M. Soubbotin (2001) “Patterns of Potential Answer Expressions as Clues to the Right Answers,” Proceedings of TREC 2001, pp. 293-302, 2001.

Ellen Voorhees (2000) “QA Track Overview (TREC) 9,” [on-line] Available: http://trec.nist.gov/presentations/TREC9/qa/index.htm

Ellen Voorhees (2001) “Overview of the TREC 2001 Question Answering Track,” Proceedings of TREC-10, pp. 42-51, 2001.

Ellen Voorhees (2002) “Overview of the TREC 2002 Question Answering Track,” Proceedings of the Eleventh Text Retrieval Conference, Gaithersburg, Maryland, November 19-22, 2002.

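Research and Development of Question Answering Technology in Documents (3/3): A Study of Question Answering Systems for Heterogeneous Information Sources (pp. 52-57)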