Chapter 3 Learning Question to Query Transformation
3.2 Question Analysis
3.2.3 Linguistic Analysis for Question Pattern Extraction
After analyzing recurring patterns and regularity in quizzes on the Web, we designed a simple procedure to recognize question patterns. We present this procedure as a small set of prioritized rules (see Figure 3).
Figure 3: Rules used to identify the question pattern in a given question
(Rule 1) Question word in a chunk of length more than one (e.g., “which female singer”)
(Rule 2) Question word followed by a light verb and NP chunk (e.g., “who made flight”)
(Rule 3) Question word followed immediately by a verb (e.g., “who painted”)
(Rule 4) Question word followed immediately by a passive VP or an NP (e.g., “what is called”)
(Rule 5) Question word followed by the copula “to be” and an NP (e.g., “what is the river”)
First, we identify the question word, which is one of the wh-words (“who,” “what,” “when,” “where,” “how,” or “why”) tagged as a determiner or adverbial question word (i.e., “wdt,” “wql,” and “wrb”). Based on the results of POS tagging and phrase chunking, we then determine the main verb and the voice of the question. Finally, we apply the following expanded rules to extract the words that form the question pattern:
Rule 1.a If the question word is tagged with “wdt” and it is in an NP chunk of length greater than one, its question pattern will contain the question word and the headword of the chunk.
Rule 1.b If the question word is tagged with “wql” and it is in an NP chunk of length greater than one, its question pattern will contain the question word and the following adjective (tagged “jj” or “ap”).
For instance, consider the following Examples (2) to (4):
(2) Which female singer performed the first song on Top of the Pops?
POS: which/wdt female/jj singer/nn performed/vbd the/at first/cd song/nn on/in top/nn of/in the/at pops/nns ?/?
Chunk: which female singer/NP performed/VP the first song/NP on/PP top/NP of/PP the pops/PP ?/O
(3) How many American states begin with the letter “M”?
POS: how/wql many/jj American/jj states/nns begin/vb with/in the/at letter/nn “/“ M/nn ”/” ?/?
Chunk: how many American states/NP begin/VP with/PP the letter/NP “/O M/NP ”/O ?/O
(4) In what year was Hong Kong returned to China?
POS: in/in what/wdt year/nn was/bed Hong/np Kong/np returned/vbd to/to China/np ?/?
Chunk: in/PP what year/NP was/VP Hong Kong/NP returned/VP to/PP China/NP ?/O
After we apply Rule 1.a to Example (2), the question word “which” and the headword “singer” in the same NP chunk are chosen to form the question pattern. For the question in Example (3), Rule 1.b applies and the question pattern is the question word plus the following adjective, “how many.” The question in Example (4) is handled similarly: Rule 1.a applies to the NP chunk “what year,” so the pattern is “what year.”
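Rules 1.a and 1.b can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as implemented: a chunked question is represented as a list of (chunk label, tokens) pairs, the headword of a chunk is taken to be its last token, “the following adjective” is read as the token immediately after the question word, and the tag “wps” is accepted alongside the three tags named in the text because Examples (5) and (6) tag “who” that way. The function name rule_1 is hypothetical, and only the leading chunks of each example are transcribed.

```python
# Tags the text lists for question words, plus "wps" as used in Examples (5) and (6).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}
ADJ_TAGS = {"jj", "ap"}  # adjective tags named in Rule 1.b


def rule_1(chunks):
    """Return the Rule 1.a/1.b pattern as a list of words, or None if Rule 1 does not apply."""
    for label, tokens in chunks:
        for pos, (word, tag) in enumerate(tokens):
            if tag not in WH_TAGS:
                continue
            # Rule 1 needs the question word inside an NP chunk longer than one token.
            if label != "NP" or len(tokens) < 2:
                return None
            if tag == "wdt":
                # Assumption: the headword is the last token of the chunk.
                return [word, tokens[-1][0]]
            if tag == "wql" and pos + 1 < len(tokens) and tokens[pos + 1][1] in ADJ_TAGS:
                # Assumption: "following adjective" means the very next token.
                return [word, tokens[pos + 1][0]]
            return None
    return None


example_2 = [
    ("NP", [("which", "wdt"), ("female", "jj"), ("singer", "nn")]),
    ("VP", [("performed", "vbd")]),
    ("NP", [("the", "at"), ("first", "cd"), ("song", "nn")]),
]
example_3 = [
    ("NP", [("how", "wql"), ("many", "jj"), ("american", "jj"), ("states", "nns")]),
    ("VP", [("begin", "vb")]),
]
example_4 = [
    ("PP", [("in", "in")]),
    ("NP", [("what", "wdt"), ("year", "nn")]),
    ("VP", [("was", "bed")]),
]

print(rule_1(example_2))  # ['which', 'singer']
print(rule_1(example_3))  # ['how', 'many']
print(rule_1(example_4))  # ['what', 'year']
```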
Rule 2 If the question word is a chunk by itself and the main verb is a light verb (i.e., have, do, know, think, get, go, say, see, come, make, take, look, give, find, use), then the question pattern is composed of the question word, the light verb, and the head of the first NP or PP chunk following the light verb.
Applying Rule 2 to Example (5) yields the question pattern “who made flight.”
(5) Who in 1961 made the first space flight?
POS: who/wps in/in 1961/cd made/vbd the/at first/od space/nn flight/nn ?/?
Chunk: who/NP in/PP 1961/NP made/VP the first space flight/NP ?/O
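A minimal sketch of Rule 2 follows, using the same (chunk label, tokens) representation. Several details are assumptions rather than part of the rule as stated: the main verb is taken to be the last token of the first VP chunk after the question word, the headword of a chunk is its last token, and inflected forms are mapped to the listed light verbs through a small, deliberately incomplete lookup table (LEMMAS) that stands in for a real lemmatizer. The names rule_2 and LEMMAS are hypothetical.

```python
# Tags the text lists for question words, plus "wps" as used in Example (5).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}
# The light verbs listed in Rule 2.
LIGHT_VERBS = {"have", "do", "know", "think", "get", "go", "say", "see",
               "come", "make", "take", "look", "give", "find", "use"}
# Hypothetical, incomplete lookup standing in for a lemmatizer.
LEMMAS = {"made": "make", "had": "have", "did": "do", "took": "take", "gave": "give"}


def rule_2(chunks):
    """Return [question word, light verb, head of next NP/PP chunk], or None."""
    # The question word must form a chunk by itself.
    q = next((i for i, (_, toks) in enumerate(chunks)
              if len(toks) == 1 and toks[0][1] in WH_TAGS), None)
    if q is None:
        return None
    # Assumption: the main verb is the last token of the first VP after the question word.
    v = next((i for i in range(q + 1, len(chunks)) if chunks[i][0] == "VP"), None)
    if v is None:
        return None
    verb = chunks[v][1][-1][0]
    if LEMMAS.get(verb, verb) not in LIGHT_VERBS:
        return None
    for label, toks in chunks[v + 1:]:
        if label in ("NP", "PP"):
            # Assumption: the head of a chunk is its last token.
            return [chunks[q][1][0][0], verb, toks[-1][0]]
    return None


example_5 = [
    ("NP", [("who", "wps")]),
    ("PP", [("in", "in")]),
    ("NP", [("1961", "cd")]),
    ("VP", [("made", "vbd")]),
    ("NP", [("the", "at"), ("first", "od"), ("space", "nn"), ("flight", "nn")]),
    ("O", [("?", "?")]),
]

print(rule_2(example_5))  # ['who', 'made', 'flight']
```

Note that the pattern keeps the surface form “made” rather than the lemma “make,” matching the pattern given for Example (5).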
Rule 3 If the question word is a chunk by itself followed by a VP or NP chunk without a light verb, the question pattern is the question word and the headword of the VP.
Applying Rule 3 to Example (6) yields the question pattern “who painted.”
(6) Who painted “The Laughing Cavalier”?
POS: who/wps painted/vbd “/“ the/at Laughing/vbg Cavalier/nn ”/” ?/?
Chunk: who/NP painted/VP “/O the laughing cavalier/NP ”/O ?/O
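Rule 3 can be sketched in the same style. The sketch follows the short form of the rule in Figure 3 (question word followed immediately by a verb), so it only looks at the chunk directly after the question word. Two further assumptions are made: forms of “to be” are excluded, on the reading that copular and passive questions are meant for Rules 4 and 5, and the light-verb test is passed in as a predicate because the text does not say how inflected forms are recognized. The names rule_3 and LIGHT_FORMS are hypothetical, and LIGHT_FORMS is an incomplete toy list.

```python
# Tags the text lists for question words, plus "wps" as used in Example (6).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
# Hypothetical, incomplete set of light-verb surface forms for the demo.
LIGHT_FORMS = {"make", "made", "take", "took", "have", "had"}


def rule_3(chunks, is_light_verb):
    """Return [question word, head of the following VP], or None."""
    for i, (_, toks) in enumerate(chunks[:-1]):
        if len(toks) == 1 and toks[0][1] in WH_TAGS:
            next_label, next_toks = chunks[i + 1]
            head = next_toks[-1][0]  # assumption: the head is the last token
            if (next_label == "VP" and head not in BE_FORMS
                    and not is_light_verb(head)):
                return [toks[0][0], head]
            return None
    return None


example_6 = [
    ("NP", [("who", "wps")]),
    ("VP", [("painted", "vbd")]),
    ("O", [("“", "“")]),
    ("NP", [("the", "at"), ("laughing", "vbg"), ("cavalier", "nn")]),
    ("O", [("”", "”")]),
    ("O", [("?", "?")]),
]

print(rule_3(example_6, lambda verb: verb in LIGHT_FORMS))  # ['who', 'painted']
```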
Rule 4 If the question word is in a chunk by itself and the question is in passive voice, the question pattern will contain the question word, “to be,” and the headword of the passive VP.
Applying Rule 4 to Examples (7) and (8), we get the question patterns “what is called” and “what is known,” respectively.
(7) What is a group of geese called?
POS: What/wdt is/vbz a group/np of/in geese/nns called/vbn ?/?
Chunk: what/NP is/VP a group/NP of/PP geese/NP called/VP ?/O
(8) In Bible, what is known as the Decalogue?
POS: in/in Bible/np ,/, what/wdt is/vbz known/vbn the/at Decalogue/np ?/?
Chunk: in/PP Bible/NP ,/O what/NP is known/VP the Decalogue/NP ?/O
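The text does not specify how the voice of a question is decided, so the following sketch of Rule 4 uses a simple stand-in test: a question counts as passive if, after the question word, a form of “to be” inside a VP chunk is later followed by a past participle (tag “vbn”) inside a VP chunk. That test, the word list BE_FORMS, and the name rule_4 are assumptions made for illustration. The tokens follow the tagged line of Example (8).

```python
# Tags the text lists for question words, plus "wps" as used in Examples (5) and (6).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}


def rule_4(chunks):
    """Return [question word, form of "to be", passive participle], or None."""
    # The question word must form a chunk by itself.
    q = next((i for i, (_, toks) in enumerate(chunks)
              if len(toks) == 1 and toks[0][1] in WH_TAGS), None)
    if q is None:
        return None
    be_word = None
    for label, toks in chunks[q + 1:]:
        if label != "VP":
            continue
        for word, tag in toks:
            if word in BE_FORMS and be_word is None:
                be_word = word  # first form of "to be" after the question word
            elif tag == "vbn" and be_word is not None:
                # Assumed passive test: "to be" followed later by a past participle.
                return [chunks[q][1][0][0], be_word, word]
    return None


example_8 = [
    ("PP", [("in", "in")]),
    ("NP", [("bible", "np")]),
    ("O", [(",", ",")]),
    ("NP", [("what", "wdt")]),
    ("VP", [("is", "vbz"), ("known", "vbn")]),
    ("NP", [("the", "at"), ("decalogue", "np")]),
    ("O", [("?", "?")]),
]

print(rule_4(example_8))  # ['what', 'is', 'known']
```

The same check also covers Example (7), where “is” and “called” sit in two separate VP chunks.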
Rule 5 If the question word is in a chunk by itself followed by a “to be” chunk and an NP chunk, the question pattern is the question word and the headword of the first NP.
Applying Rule 5 to Example (9), we get the question pattern “what river.”
(9) What is the second longest river in the world?
POS: What/wdt is/vbz the/at second/od longest/jjt river/nn in/in the/at world/nn ?/?
Chunk: what/NP is/VP the second longest river/NP in/PP the world/NP ?/O
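Rule 5 reduces to a check on three consecutive chunks, as the sketch below shows. It assumes that a “to be” chunk is a VP chunk whose tokens are all forms of “to be” and that the headword of an NP is its last token. The names rule_5 and BE_FORMS are hypothetical.

```python
# Tags the text lists for question words, plus "wps" as used in Examples (5) and (6).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}


def rule_5(chunks):
    """Return [question word, head of the NP after "to be"], or None."""
    for i, (_, toks) in enumerate(chunks[:-2]):
        if len(toks) == 1 and toks[0][1] in WH_TAGS:
            be_label, be_toks = chunks[i + 1]
            np_label, np_toks = chunks[i + 2]
            is_be_chunk = be_label == "VP" and all(w in BE_FORMS for w, _ in be_toks)
            if is_be_chunk and np_label == "NP":
                # Assumption: the headword is the last token of the NP chunk.
                return [toks[0][0], np_toks[-1][0]]
            return None
    return None


example_9 = [
    ("NP", [("what", "wdt")]),
    ("VP", [("is", "vbz")]),
    ("NP", [("the", "at"), ("second", "od"), ("longest", "jjt"), ("river", "nn")]),
    ("PP", [("in", "in")]),
    ("NP", [("the", "at"), ("world", "nn")]),
    ("O", [("?", "?")]),
]

print(rule_5(example_9))  # ['what', 'river']
```

Under these assumptions the check also fires on Example (7), giving “what group,” which is consistent with trying the passive rule (Rule 4) before Rule 5.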
Finally, the last rule handles all other cases:
Rule 6 If none of the above rules are applicable, the question pattern will contain the question word only.
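Because the rules are prioritized, the whole procedure can be viewed as a cascade that returns the result of the first applicable rule and falls back to Rule 6. The sketch below shows only that control flow. The function extract_pattern and the two stand-in rules are hypothetical, and functions implementing Rules 1 to 5 would take the place of the stand-ins.

```python
# Tags the text lists for question words, plus "wps" as used in Examples (5) and (6).
WH_TAGS = {"wdt", "wql", "wrb", "wps"}


def extract_pattern(chunks, rules):
    """Try each rule in priority order; fall back to Rule 6 if none applies."""
    for rule in rules:
        pattern = rule(chunks)
        if pattern:
            return pattern
    # Rule 6: the pattern is the question word alone.
    for _, tokens in chunks:
        for word, tag in tokens:
            if tag in WH_TAGS:
                return [word]
    return []  # no question word was found at all


# Stand-in rules for illustration only.
def never_applies(chunks):
    return None


def always_applies(chunks):
    return ["what", "river"]


example_9 = [
    ("NP", [("what", "wdt")]),
    ("VP", [("is", "vbz")]),
    ("NP", [("the", "at"), ("second", "od"), ("longest", "jjt"), ("river", "nn")]),
]

print(extract_pattern(example_9, [never_applies, always_applies]))  # ['what', 'river']
print(extract_pattern(example_9, [never_applies]))                  # ['what']
```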
The heuristic rules (Rules 1 to 6) are intuitive. Moreover, the patterns they generate recur across questions, which suggests that the patterns are general and that it is feasible to gather training data to learn the terms that co-occur with the answers. These question patterns also indicate a preference for the answer to belong to a fine-grained type of proper noun, as observed by Mann (2002a) (see Table 3). In the next section, we describe how we exploit these patterns to learn how to carry out effective query expansion.
Table 3: Question patterns suggest a preference for fine-grained types of proper nouns.
Questions                        Question Pattern    Type of answers
Which rock ‘n’ roll musician     which-musician      musician
Which singer …                   which-singer        singer (musician)
Who sang …                       who-sang            singer (musician)
Who’s the lead singer            which-singer        singer (musician)
What female Disco singer         what-singer         singer (musician)
What helicopter pilot            what-pilot          pilot
Who made flight                  who-made-flight     pilot
Which astronaut                  what-astronaut      astronaut (pilot)
What Russian astronaut           what-astronaut      astronaut (pilot)
Who is the author                who-author          author
Who wrote                        who-wrote           author
What car company                 what-company        company
What Hollywood studio            what-studio         studio (company)