The Question-Detection Problem - A Quantitative Study of Chinese Question Detection (漢語問句偵測之量化研究)

The question-detection problem is, in short, to enable computers to detect the question parts, if any, within a stream of text or utterances. Its importance is twofold, spanning linguistics and computer science. This section first discusses the issue from the linguistic point of view and then, from the computer science point of view, enumerates applications that can benefit from the study of the question-detection problem:

Human-computer communication.

Computer-computer communication.

Punctuation processing.

1.2.1 Question: A Linguistic View

From the perspective of linguistics, the study of speech acts has been a hot topic in discourse analysis, and the “question” is one of the major illocutionary acts that occur in everyday life. The deeper our understanding of the nature of the various question expressions in particular, the better the computational linguistics model we may form for speech acts in general, which in turn improves the applications of linguistics.

What do we mean by the term question, anyway? In Glossary of Linguistic Terms [35], question has two senses:

1. An illocutionary act that has a directive illocutionary point of attempting to get the addressee to supply information.

2. A sentence type that has a form (labeled interrogative) typically used to express an illocutionary act. It may be actually so used (as a direct illocution), or used rhetorically.

These two senses reflect two main competing schools of thought in linguistics: the first addresses the functional facet, while the second addresses the formal facet. From the functional perspective, the following two cases are both questions in spite of their totally different surface forms:

(1) a. Tell me your age.

b. How old are you?

As for the formal perspective, there are roughly three types of questions: interrogative, dubitative, and rhetorical questions. For example,

(2) a. What is this? interrogative

b. Can such a diligent student fail the school entrance exams? dubitative

c. Don’t you understand me? rhetorical

1.2.2 Human-computer Communication

As for human-computer communication, a non-toy human-computer dialogue or question answering system needs to distinguish between background information and foreground queries in order to behave more like humans. In such systems, therefore, the earlier stages should include at least a question detection module; subsequent processing is fragile without it. Now let’s examine the two applications in detail.

Question answering (QA) is a fast-growing sub-task of text retrieval. Given a query, it tries to pinpoint the specific answers (noun phrases, sentences, or short passages) rather than just give a pile of relevant documents for you to browse.

The QA track of the Text REtrieval Conference (TREC) is one of the most famous examples. Since the first QA track was initiated in 1999 (TREC-8), there has been a lot of progress in this field (see [50, 51, 52, 53, 54]). Participants in this track are required to give an exact answer in response to a factoid question, a list of exact answers to a list question, and a short passage to a definition question. Look at the following excerpts from TREC QA tracks:

(3) a. What is the longest river in the United States? factoid

b. Name the highest mountain. factoid

c. What are 5 books written by Mary Higgens Clark? list

d. List the names of chewing gums. list

e. Name 22 cities that have a subway system. list

f. Who is Colin Powell? definition

g. What are polymers? definition

As reported, most QA systems first classify an incoming question into one of various types of query focus (e.g., quantity, name, time, and place) as suggested by its question word (e.g., what and who) or imperative verb (e.g., list and name); the expected answer types can then be predicted accordingly. Next, some systems attempt a full understanding of the text and then use logic proofs or similar techniques to verify candidate answers (e.g., [44]); others attempt only a shallow, data-driven pattern matching against candidate answers (e.g., [33, 48]).
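The first step described above, classifying a question by its leading question word or imperative verb, can be sketched as follows. This is only a minimal illustration and not the method of any particular TREC system: the lookup tables, the focus labels, and the function name `classify_query_focus` are assumptions made up for this example.

```python
import re

# Hypothetical lookup tables, for illustration only; real QA systems use
# far richer taxonomies of query focus than this.
QUESTION_WORD_FOCUS = {
    "who": "name",
    "when": "time",
    "where": "place",
    "how many": "quantity",
    "how much": "quantity",
    "what": "unspecified",
}
IMPERATIVE_VERBS = {"name", "list"}


def classify_query_focus(question: str) -> str:
    """Guess the query focus from the first word(s) of a canonical question."""
    words = re.findall(r"[a-z']+", question.lower())
    if not words:
        return "unknown"
    first_two = " ".join(words[:2])
    # Try two-word cues such as "how many" before single-word cues.
    if first_two in QUESTION_WORD_FOCUS:
        return QUESTION_WORD_FOCUS[first_two]
    if words[0] in QUESTION_WORD_FOCUS:
        return QUESTION_WORD_FOCUS[words[0]]
    if words[0] in IMPERATIVE_VERBS:
        return "list-or-name request"
    return "unknown"
```

Under these assumptions, a canonical query such as "Who is Colin Powell?" is mapped to a focus, whereas any input that does not begin with one of the listed cues falls through to "unknown".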

There is at least one limitation of these QA systems, however. They assume that a QA system receives and recognizes only canonical query forms beginning with a question word or imperative verb. But in reality, not all questions fall into this category. Take the following real-world query for example.¹ Imagine that you are asking a QA system for troubleshooting:

(4) I have installed and configured Wine, but Wine cannot find MS Windows on my drive. Where did I go wrong?

¹ This paragraph is excerpted from The Wine FAQ. URL: http://www.winehq.com/site/docs/wine-faq/index.

It is hard to imagine that you are allowed to tell the program only the latter half, “Where did I go wrong?”, without the former, “I have . . . on my drive.” Even if this unrealistic assumption were made, no program is smart enough to answer the sole question “Where did I go wrong?”: the query focus is correctly identified as “where,” but it is of little use without the preceding sentences. What is worse, the query focus “where” may mislead the program in the irrelevant direction of physical places! As a result, if the QA program fails to distinguish between the foreground query and its surrounding context, how can it work out a search plan to answer your “where” question?²
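To make the distinction between foreground query and surrounding context concrete, here is a deliberately naive sketch. It assumes that the input is already punctuated and that every question ends with a question mark, an assumption that does not hold for unpunctuated input such as speech transcripts; the function name `split_query` is hypothetical.

```python
import re


def split_query(text: str) -> tuple[list[str], list[str]]:
    """Split text into (background statements, foreground questions)."""
    # Naive sentence splitter: a run of non-terminator characters followed
    # by one terminator. Abbreviations such as "e.g." would break it.
    sentences = [s.strip() for s in re.findall(r"[^.?!]+[.?!]", text)]
    background = [s for s in sentences if not s.endswith("?")]
    foreground = [s for s in sentences if s.endswith("?")]
    return background, foreground


query = ("I have installed and configured Wine, but Wine cannot find "
         "MS Windows on my drive. Where did I go wrong?")
background, foreground = split_query(query)
```

Here `foreground` holds the question and `background` holds the statement that gives it meaning; a QA system would need both to form a search plan.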

Things become even more complicated in dialogue systems, in which the conversation continues over many rounds rather than just one, turn-taking is frequent, and a mixture of speech acts, such as illocutionary and perlocutionary acts, may be used freely [14]. Since natural conversation switches frequently between foreground and background expressions, it is unrealistic to assume that a dialogue system recognizes and accepts only query forms. Take the following excerpt from the novel Harry Potter and the Sorcerer’s Stone for example. One day Harry Potter said to Hagrid:

(5) Everyone thinks I’m special, . . . but I don’t know anything about magic at all. How can they expect great things? I’m famous and I can’t even remember what I’m famous for. . . .

Assume for now that Hagrid is a computer. If Hagrid fails to distinguish between the foreground question and the background statements, it can never understand what Harry means by “great things,” nor work out a search plan accordingly to try to comfort Harry by saying “Don’ you worry, Harry. You’ll learn fast enough.”

² One may think that the QA system has a chance to function well if we force users to rephrase their query as “Where did I go wrong when I’ve installed and configured the Wine but it cannot find MS Windows on my drive?” It may work, but it is neither practical nor user-friendly.

1.2.3 Computer-computer Communication

As for computer-computer communication, intelligent agents or software robots may need to travel around the Internet and gather information along the way on behalf of their users. Since XML and the Semantic Web are still young and there is no universally accepted semantic markup language for unrestricted domains, unstructured documents still dominate the Web. Therefore, a better understanding of speech acts in general and questions in particular may help software analyze unstructured documents and transform them into structured ones.

Furthermore, in multi-agent systems, agent communication languages are based mostly on speech act theory (e.g., KQML defines a set of performatives for agents to communicate with [2, 29]) and temporal or first-order predicate logic (e.g., KIF [24]). Many information systems for intra- or inter-business processes have also been modeled from the language/action perspective (LAP; see [56] for an overview of LAP and [16, 32] for typical applications). The study of questions in natural language settings may help to enhance the expressiveness of the communication facilities, the finer-grained mental states, and the belief-desire models of these systems.

1.2.4 Punctuation Processing

As for punctuation processing, no NLP system is complete without it, yet punctuation has been neglected in the NLP field. For example, speech-to-text recognition software maps acoustic signals to text, but it seldom places appropriate punctuation marks in the output text. Word processors have built-in or plug-in spelling and grammar checkers, but they seldom try to check punctuation.

Some of the literature does recognize the importance of punctuation to some degree, as we have seen in Section 1.1. However, it treats punctuation as a given cue and does not discuss what to do if the cue is absent.

The reason why punctuation has been neglected is that it is a coding device complex enough to challenge computers. It is, as defined in The American Heritage Dictionary [47], “the use of standard marks and signs in writing and printing to separate words into sentences, clauses, and phrases in order to clarify meaning.” Therefore, assigning punctuation correctly involves not only syntactic but also semantic and pragmatic levels of processing. Take the following English sentences for example,

(6) a. Is this yours?

b. What is it?

c. I beg your pardon?

d. This is yours? I don’t think so.

To punctuate them correctly with question marks, one has to judge whether they are questions. Sentence (6a) is obviously a question because of its syntactic pattern, which begins with a verb BE; the same goes for Sentence (6b), whose syntactic pattern is an initial WH word followed by a verb BE. Sentence (6c), which begins with neither a verb BE, an auxiliary verb, nor a WH word, is regarded as a question only if the lexical meaning of the word “pardon” is taken into account. Furthermore, Sentence (6d) is regarded as a question only if the pragmatic context is taken into account. Therefore, punctuating text perfectly is, in general, very complicated.
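The purely syntactic judgment applied to Sentences (6a) and (6b) can be sketched as a surface-pattern check on the first word. This is a minimal illustration under stated assumptions: the word lists are hypothetical and far from complete, and the function name `looks_like_question` is invented for this example.

```python
import re

# Illustrative word lists (assumptions, not exhaustive).
BE_AND_AUXILIARIES = {
    "is", "are", "was", "were", "am", "do", "does", "did",
    "can", "could", "will", "would", "shall", "should",
    "may", "might", "must", "have", "has", "had",
}
WH_WORDS = {"what", "who", "whom", "whose", "which",
            "when", "where", "why", "how"}


def looks_like_question(sentence: str) -> bool:
    """Return True if the first word is a verb BE, an auxiliary, or a WH word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return bool(words) and (words[0] in BE_AND_AUXILIARIES
                            or words[0] in WH_WORDS)
```

Applied to the unpunctuated forms of example (6), such a check accepts "Is this yours" and "What is it" but rejects "I beg your pardon" and "This is yours", which is exactly where lexical and pragmatic knowledge would be needed.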
