
PREFERRED ARGUMENT STRUCTURE FOR DISCOURSE UNDERSTANDING

KA-WAI CHUI

Matsushita Electric Institute of Technology (Taipei) Co., Ltd.

Abstract

The main purpose of communication is to exchange information. Any discourse understanding model should be able to process the flow of information throughout the entire text. According to Du Bois's (1987) studies of information flow in discourse across a number of languages, information distribution among argument positions in clauses is by no means random; rather, certain grammatical patterns tend to recur consistently. He thus formulated a Preferred Argument Structure (PAS) for the preferential structural configurations of arguments. In our examination of Chinese narrative discourse, the language also displays PAS, yet the Chinese PAS challenges the universality of the one Du Bois proposed. Based on the quantity and distribution of lexical arguments and new referents across grammatical roles in discourse, we find that Chinese PAS also maintains at most one new argument within a basic information processing unit. Since new referents in Chinese have to be encoded in full NP form, it is thus less likely to have more than one lexical argument within a clause. Moreover, this single new argument appears preferentially in the O role rather than in the A and S roles, which departs from the PAS Du Bois formulated.

Since the structure of information flow has a corresponding grammatical patterning, grammatical and pragmatic processing can be carried out simultaneously, in that the information status of an argument can be identified by virtue of grammatical analysis. Although PAS is neither universal nor categorical, it can function in a discourse understanding model as a heuristic device to process the information structure of a connected spoken discourse.

According to Du Bois's (1987) studies of information flow in discourse across a number of languages, information distribution among argument positions is neither arbitrary nor random: certain grammatical patterns are preferred over others, and they tend to recur consistently in connected spoken discourse. In other words, the structure of information flow has a corresponding grammatical patterning. Those recurrent patterns, which reflect speakers' actual language use, are formulated as Preferred Argument Structure (PAS). The PAS he formulated comprises the following constraints: the One Lexical Argument Constraint, to avoid more than one referent in full NP form per clause; the Non-Lexical A Constraint, to keep the single lexical referent out of the A role; the One New Argument Constraint, to avoid more than one referent carrying new information per clause; and the Given A Constraint, to keep the new referent out of the A role. However, an examination of Chinese narrative discourse reveals that the PAS displayed by this discourse genre challenges the universality of Du Bois's. The idiosyncrasy of the Chinese PAS is discussed in this paper.

In fact, from the computational point of view, whether PAS is universal or language-specific, its existence has significant implications for discourse understanding. On the one hand, it enables grammatical and pragmatic processing to be carried out simultaneously, because the information status of a referent can be identified by virtue of grammatical analysis; on the other hand, PAS can function as a heuristic device to process the information structure of a connected discourse.

1. Introduction

The main purpose of communication is to exchange information. A speaker may employ various strategies to organize the information he intends to convey, in that some elements bear old information while others carry new information. Therefore, a discourse understanding model should be able to process the flow of information throughout the entire text. In this paper, we focus on the referring arguments in Chinese narrative discourse, and our main concern is how they are structured in relation to information flow.

2. Preferred Argument Structure in Chinese Narrative Discourse

Unlike the languages Du Bois (1987) studied, Mandarin Chinese is a typologically different language with no inflection and relatively free word order. Nevertheless, it still exhibits its own idiosyncratic PAS in spoken narrative discourse. The corpus for the present study comprises eight oral narratives told by eight native Mandarin speakers aged 20-25. They were asked to recount the story of the popular movie Ghost to the interviewer in a speech laboratory. The movie portrays a young man who is killed accidentally in a robbery, and who, in the form of a spirit, tries to protect his girlfriend from the murderers and to take revenge on them. The narratives were taped for later transcription.

To study the Chinese PAS, our examination focuses on the issues of quantity and role in distributing lexical arguments and new referents across grammatical positions, at both the grammatical and pragmatic dimensions.

2.1 Preliminaries for Analysis

The 120 minutes of narratives were segmented into intonation units, each identified by ear as a stretch of speech uttered under a single coherent intonation contour and typically bounded by a pause. Chafe (1987) has hypothesized that intonation units, representing linguistic expressions of focuses of consciousness, are independent processing units typical of spoken discourse. In the present corpus, there were a total of 1433 intonation units, with a mean length of 6.69 words. The observation that the clause, defined as a verb and its arguments, often coincides with the intonation unit (Du Bois, 1987; Chafe, 1987, 1988) was further confirmed in this study, since 85.28% (1222) of the intonation units contained clausal elements. Units consisting of false starts, repetitions, filled pauses, or clause fragments such as conjunctions, adverbials, and particles were excluded from further analysis. Therefore, the study of Chinese PAS is indeed based on clauses. Following is a sample of five clausal intonation units (a-e) produced by a female speaker:

(1) a. jiouyitian ta genzhe ta nupengyou
       one day he follow his girlfriend
       'One day, he followed his girlfriend.'
    b. zai ta nupengyou jiali deshihou
       in his girlfriend home when
       'When (he was) in his girlfriend's home,'
    c. ta nupengyou zai huan yifu
       his girlfriend PROG change clothes
       'his girlfriend was changing clothes.'
    d. ranhou you yi ge huairen
       then there-be one CL bad guy
       'Then, there was a bad guy.'
    e. huairen chuang jinlai
       bad guy break in
       'The bad guy broke in.'

Within a single clause, the morphological type of each referent, its grammatical role, and its information status were all recorded. The morphological type of an overt referent in Chinese was either a lexical NP or a pronoun, whose surface grammatical role was classified as A (transitive subject), S (intransitive subject), O (transitive object), or Oblique (object of a preposition). Furthermore, Chafe's (1987) three-way distinction of information for referents was adopted, mainly because his categories are founded on the actual cognitive processing of information transfer by language users: given information, accessible information, and new information. A given referent referred to an entity mentioned previously, while a new referent was one that had not yet been brought up in the prior context. Intermediate between these two was accessible information, either coming from the expectations associated with a schema or resulting from deactivation from an earlier state. Following Du Bois (1987), a referent was operationally counted as deactivated if it was at least twenty propositions away from its most recent appearance.

2.2 The Grammatical Dimension of PAS

The purpose of studying PAS at the grammatical dimension is to examine whether there is a preferred surface configuration of arguments in the observed data. Therefore, we investigate both the number of lexical (NP) arguments and their distribution across the grammatical roles in clauses.

According to our tabulation in Table 1, of the 1127 clauses (excluding the equational type), those with zero or one lexical argument are the most common structures and constitute a distinct majority (94.15%).

Table 1. Frequency of clauses with 0, 1, and 2 lexical arguments

             frequency   percentage
0 lex arg       587         52.09
1 lex arg       474         42.06
2 lex arg        66          5.85
Total          1127        100
(χ²_{.99}(2) = 399.89)
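The statistic reported under Table 1 can be checked with a short calculation. The Python sketch below assumes that the reported value is a chi-square goodness-of-fit statistic computed against equal expected counts for the three clause types; the paper does not state the null hypothesis, so this is an assumption, and the function name `chi_square_uniform` is purely illustrative.

```python
def chi_square_uniform(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts.

    Assumption: the null hypothesis is a uniform distribution over the
    categories; the paper does not spell this out.
    """
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Clause counts from Table 1: 0, 1, and 2 lexical arguments.
table1 = [587, 474, 66]
statistic = chi_square_uniform(table1)
degrees_of_freedom = len(table1) - 1
print(f"chi-square({degrees_of_freedom}) = {statistic:.2f}")
```

Under that assumption the computed value agrees with the 399.89 reported in the table, which suggests the test compares the observed clause counts with an even split across the three categories.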

Since only transitive verbs can take more than one argument, it is necessary to separate them from the intransitive ones for tabulation, in case the rarity of two-lexical-argument structures is simply due to the rarity of transitive clauses. The result in Table 2 shows clearly that even in transitive constructions, two-lexical-argument structures are still a minority (9.17%). The result indeed supports Du Bois's One Lexical Argument Constraint in that "there is a tendency for speakers to avoid more than one lexical argument per clause" (p.819).


Table 2. The frequency of lexical arguments in transitive and intransitive clauses

             Transitive       Intransitive     Total
             freq    %        freq    %        freq    %
0 lex arg    321    44.65     266    65.2      587    52.09
1 lex arg    332    46.18     142    34.8      474    42.06
2 lex arg     66     9.17       -     -         66     5.85
Total        719   100        408   100       1127   100
(χ²_{.99}(2) = 66.56)
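The statistic under Table 2 is consistent with a Pearson chi-square test of independence between clause type and number of lexical arguments. The following Python sketch shows that calculation on the counts from the table; the choice of test (without continuity correction) is an inference from the reported value rather than something the paper states, and `chi_square_independence` is a hypothetical helper name.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for an r x c table of counts.

    No continuity correction is applied. Returns (statistic, degrees of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: 0, 1, 2 lexical arguments; columns: transitive, intransitive (Table 2).
table2 = [[321, 266], [332, 142], [66, 0]]
statistic, dof = chi_square_independence(table2)
print(f"chi-square({dof}) = {statistic:.2f}")
```

The same function applies to any of the cross-tabulations in this paper, since it only needs a list of rows of counts.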

Since speakers tend to use at most one lexical argument in a single clause, it is necessary to study whether this single lexical referent is randomly distributed across the grammatical roles. According to our tabulation in Table 3, O (84.3%) and Oblique (92.31%) each contain an overwhelming proportion of lexical arguments, whereas A and S contain a smaller proportion of them.

Table 3. Grammatical roles and morphological types of arguments

         lexical          pronominal       Total
         n      %         n      %         n
A        155   38.08      252   61.92      407
S        132   55.46      106   44.54      238
O        306   84.3        57   15.7       363
OBL      192   92.31       16    7.69      208
Total    785   64.56      431   35.44     1216
(χ²_{.99}(3) = 265.09)

Of all referents, 64.56% are lexical. If these lexical referents were randomly distributed across the grammatical positions, no role would stand out; yet, as Table 4 indicates, 38.98% of them appear in the O role, while the A and S roles are restricted in the lexical referents they include.

Table 4. Distribution of lexical arguments across grammatical roles

         frequency   percentage
A           155         19.75
S           132         16.82
O           306         38.98
Obl         192         24.45
Total       785        100
(χ²_{.99}(3) = 91.17)

Unlike Du Bois's Non-Lexical A Constraint, which avoids lexical referents only in the A position, Chinese speakers disprefer both the A and S roles for mentioning a referent lexically. It is the O (or Oblique) position that favors lexical arguments. The Lexical O Constraint is thus proposed to characterize this particular phenomenon in Chinese narrative discourse. In short, the One Lexical Argument Constraint and the Lexical O Constraint, which are constraints on quantity and role respectively, constitute the Chinese PAS at the grammatical dimension. The number of lexical arguments within a clause is usually one at most, and this single argument preferentially appears in the O role. Although these are not categorical rules, they do represent a statistically significant tendency in actual language use.

2.3 The Pragmatic Dimension of PAS

In the preceding section, it has been shown that in narrative discourse different argument positions of a clause have distinct morphological preferences. This section studies the pragmatic dimension of PAS by examining the quantity of new arguments, as well as their distribution across the grammatical roles.

Firstly, it is found that transitive and intransitive clauses contain either zero or one new referent, with the former predominating (81.06%), as indicated by Table 5. Significantly, not a single clause contains two new referents. The result supports Du Bois's One New Argument Constraint to "avoid more than one new argument per clause" (p.826).

Table 5. The frequency of new arguments in transitive and intransitive clauses

             Transitive       Intransitive     Total
             freq    %        freq    %        freq    %
0 new arg    442    77.27     217    90.04     659    81.06
1 new arg    130    22.73      24     9.96     154    18.94
Total        572   100        241   100        813   100
(χ²_{.99}(1) = 18.01)

To understand whether the single new referent is randomly distributed across A, S, O, and Oblique, it is necessary to examine the distribution of information across these positions. As indicated in Table 6, a substantial proportion of A and S carry old information, and new referents preferentially occur in O and Oblique.


Table 6. Grammatical roles and information status of arguments

         new            accessible     given          total
         n      %       n      %       n      %       n
A         12    2.95    10    2.46    385   94.59    407
S         24   10.08    11    4.62    203   85.3     238
O        122   33.61    34    9.37    207   57.02    363
OBL       81   38.94    25   12.02    102   49.04    208
Total    239   19.65    80    6.58    897   73.77   1216
(χ²_{.99}(6) = 229.02)
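The row percentages in Table 6 can be read as estimates of how likely each information status is once the grammatical role of an argument is known. A minimal Python sketch of that reading follows, using the counts from Table 6; the dictionary layout and the function name `status_distribution` are illustrative assumptions, not part of the study.

```python
# Counts from Table 6: role -> (new, accessible, given).
TABLE6 = {
    "A": (12, 10, 385),
    "S": (24, 11, 203),
    "O": (122, 34, 207),
    "OBL": (81, 25, 102),
}
STATUSES = ("new", "accessible", "given")


def status_distribution(role):
    """Relative frequency of each information status for a grammatical role."""
    counts = TABLE6[role]
    total = sum(counts)
    return {status: count / total for status, count in zip(STATUSES, counts)}


for role in TABLE6:
    shares = status_distribution(role)
    formatted = ", ".join(f"{s}: {100 * p:.2f}%" for s, p in shares.items())
    print(f"{role:>3}  {formatted}")
```

Read this way, an argument identified as A by grammatical analysis is very likely to be given, whereas an argument in O or Oblique leaves the status much more open.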

Of the 239 new referents found in the corpus, a large portion occur in the O role (51.05%), as shown in Table 7, while only a small portion appear in the A and S roles, which overwhelmingly convey old information. Since Chinese speakers disfavor both the A and S roles for mentioning a new referent, Du Bois's Given A Constraint, which "avoids introducing a new referent in the A-role argument position" (p.827), does not adequately describe Chinese narrative discourse. The New O Constraint is therefore proposed for Chinese to characterize the free occurrence of new referents in the O role, as well as the strong restriction on them in the A and S roles.

Table 7. Distribution of new arguments across grammatical roles

         frequency   percentage
A            12          5.02
S            24         10.04
O           122         51.05
Obl          81         33.89
Total       239        100
(χ²_{.99}(3) = 131.96)

Comparing the frequency distributions of A and S, it is even rarer for A to code new referents. This can be explained by the fact that Chinese has a type of presentative construction which "performs the function of introducing into a discourse a noun phrase naming an entity" (Li & Thompson, 1981). Verbs of this sentence type are usually intransitive, and the arguments that follow them usually carry new information. Since speakers do not necessarily use presentative constructions to introduce a new entity, such constructions constitute only a minority (20 clauses) in our corpus, as exemplified in (2) and (3).

(2) turan you yi ge huairen paochulai qiangjie
    suddenly exist one CL bad guy run out rob
    'Suddenly, there is a bad guy running out to rob.'

(3) jie shang you yi ge zhaopai ya
    street on exist one CL signboard PART
    'On the street, there is a signboard.'

In short, the One New Argument Constraint and the New O Constraint constitute the Chinese PAS at the pragmatic dimension. There is a strong tendency in discourse to limit the number of new arguments in a clause to a maximum of one. This single new referent tends to be introduced in the O (or Oblique) role, and its subsequent occurrences preferentially appear in the A and S roles. It is of course this preponderance of old information found in the {A, S} alignment that gives Chinese the distinction of being a topic-prominent language.

2.4 Correlation of PAS between Grammatical and Pragmatic Dimensions

We have already studied the quantity and role constraints that constitute PAS for Chinese narrative discourse at both the grammatical and pragmatic dimensions. The correlation of PAS between these two dimensions is so strong that the grammatical One Lexical Argument Constraint and Lexical O Constraint are parallel to the pragmatic One New Argument Constraint and New O Constraint respectively, as shown in Table 8. In other words, the most preferred structure is to have at most one new argument within a single clause. Since new referents in Chinese have to be coded in full NP form, it is thus less likely for one discourse unit to include more than one lexical argument. Moreover, there is a strong tendency for the single new argument to appear in the O role, so that the lexical referent typically appears in this particular position. The flow of information does have a corresponding grammatical patterning.

Table 8. Dimensions and constraints of Chinese PAS

            Grammar                            Pragmatics
Quantity    One Lexical Argument Constraint    One New Argument Constraint
Role        Lexical O Constraint               New O Constraint

Comparing the Chinese PAS with the one Du Bois proposed for the languages he studied, such as Sacapultec Maya, as shown in Table 9, it is obvious that Du Bois's PAS cannot be completely generalized to Chinese, at least as far as the narrative discourse genre is concerned. Their difference lies in the distribution of lexical arguments and new referents across grammatical roles. Whereas the PAS in Sacapultec avoids mentioning new lexical arguments in the A role, Chinese speakers disfavor both the A and S roles and strongly prefer O.

Actes de COLING-92, Nantes, 23-28 août 1992   1145   Proc. of COLING-92, Nantes, Aug. 23-28, 1992

Table 9. Dimensions and constraints of PAS in Chinese and Sacapultec

Grammar
            Chinese                   Sacapultec
Quantity    One Lexical Argument Constraint (both languages)
Role        Lexical O Constraint      Non-Lexical A Constraint

Pragmatics
            Chinese                   Sacapultec
Quantity    One New Argument Constraint (both languages)
Role        New O Constraint          Given A Constraint

3. Implication of PAS to Discourse Understanding

Chinese, like a number of other languages whose patterns of information flow in spoken narrative discourse have been investigated to date, also exhibits PAS. This suggests that there is a strong discourse pressure driving the various grammatical patternings in different languages, so that the universality of the PAS Du Bois proposed is challenged. However, from the computational point of view, whether PAS is universal or language-specific, its existence has significant implications for discourse understanding, in that the flow of information throughout a connected discourse is highly structured, with a corresponding grammatical patterning as far as quantity and role are concerned. Therefore, it is possible to identify the information status of an argument by virtue of grammatical analysis, so that grammatical and pragmatic processing can be carried out simultaneously. Even though PAS is not categorical in nature, a discourse understanding model can still use it as a heuristic device to process the information structure of a connected spoken discourse.

In short, a discourse understanding model employing PAS for information processing should take the following points into consideration:

(a) Clauses are the basic information processing units.

(b) Transitive and intransitive clauses should be separated for analysis.

(c) The morphological type, grammatical role, and information status should be recorded for each argument position.

(d) The quantity and role constraints are the heuristic principles for information processing.
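Points (a) to (d) can be sketched as a small data model plus a soft check. The Python example below is only one possible reading, not an implementation described in the paper: the class names `Argument` and `Clause`, the function `pas_departures`, and the sample annotations are all hypothetical. Because the constraints are statistical tendencies rather than categorical rules, the function merely flags the preferences a clause departs from.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Argument:
    role: str    # "A", "S", "O", or "OBL"
    morph: str   # "lexical" or "pronominal"
    status: str  # "new", "accessible", or "given"


@dataclass
class Clause:
    transitive: bool          # kept so the two clause types can be tallied apart
    arguments: List[Argument]


def pas_departures(clause: Clause) -> List[str]:
    """Name the Chinese PAS preferences that a clause departs from."""
    lexical = [a for a in clause.arguments if a.morph == "lexical"]
    new = [a for a in clause.arguments if a.status == "new"]
    departures = []
    if len(lexical) > 1:
        departures.append("One Lexical Argument Constraint")
    if any(a.role in ("A", "S") for a in lexical):
        departures.append("Lexical O Constraint")
    if len(new) > 1:
        departures.append("One New Argument Constraint")
    if any(a.role in ("A", "S") for a in new):
        departures.append("New O Constraint")
    return departures


# Hypothetical annotations, invented for illustration only.
typical = Clause(True, [Argument("A", "pronominal", "given"),
                        Argument("O", "lexical", "new")])
atypical = Clause(True, [Argument("A", "lexical", "new"),
                         Argument("O", "lexical", "new")])
print(pas_departures(typical))
print(pas_departures(atypical))
```

A model built along these lines would treat a flagged clause as less expected rather than ill-formed, in keeping with the non-categorical nature of PAS.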

4. Conclusion

In this paper, we have demonstrated that Chinese narrative discourse also displays Preferred Argument Structure, based on the quantity and distribution of lexical arguments and new referents across grammatical roles. However, the Chinese PAS challenges the universality of the one Du Bois proposed, because they differ in the distribution of lexical arguments and new referents across grammatical roles. In other words, the discourse pressure driving the various grammatical patternings in different languages reflects the underlying pragmatic preferences of the different groups of language users.

From the computational viewpoint, whether PAS is universal or language-specific, its existence has significant implications for discourse understanding. On the one hand, PAS can function in a discourse understanding model as a heuristic device to process the information structure of a connected spoken discourse; on the other hand, the information status of an argument can be identified by virtue of grammatical analysis, since the flow of information has a corresponding grammatical patterning.

References

Chafe, Wallace L. 1987. Cognitive Constraints on Information Flow. In Russell S. Tomlin (ed.), Coherence and Grounding in Discourse. Amsterdam: Benjamins.

Chafe, Wallace L. 1988. Linking Intonation Units in Spoken English. In Haiman & Thompson (eds.), Clause Combining in Discourse and Grammar. Amsterdam: Benjamins.

Du Bois, John W. 1987. The Discourse Basis of Ergativity. Language 63, 805-855.

Givon, Talmy (ed.). 1983. Topic Continuity in Discourse: A Quantitative Cross-Language Study. Amsterdam: Benjamins.

Li, Charles, & Thompson, Sandra. 1981. Mandarin Chinese: A Functional Reference Grammar. California: University of California Press.
