Studies of lexical ambiguity resolution in Chinese

Chapter 2 Literature Review

2.3 Chinese lexical ambiguity resolution


National Chengchi University (國立政治大學)

the compositional meaning was a kind of bull, while its idiomatic meaning was a person who scalps. The present study focused on Chinese homophonic homographs.

2.3.2 Studies of lexical ambiguity resolution in Chinese

Both homophones and homographs have been investigated in studies using cross-modal priming. Li et al. (2002) investigated biased Chinese homophones following two different contexts, which were biased toward either the dominant or the subordinate meaning. They found that only the dominant meaning of a homophone elicited priming effects when the dominant-related visual probe occurred 150 ms before the acoustic offset, but that both meanings elicited priming effects when the visual probe occurred at the acoustic offset. These findings are compatible with the reordered access model, indicating that the dominant meaning is activated early and that context takes precedence over frequency at a later stage. Moreover, Ahrens (2001) embedded balanced ambiguous verbs in subordinate-biased contexts, and the data showed significant priming for both the primary- and secondary-related probes compared to their respective controls at the acoustic onset of the ambiguous words; that is, both experimental groups were facilitated (reaction times: related probes < control probes). The author concluded that both meanings are activated even when the context is biased toward the secondary meaning, which supports the modular access hypothesis. However, relative meaning dominance was not expected to have any effect, since balanced ambiguous words were used. One possibility is that the experimental materials intermixed homonymy with polysemy, so that the related senses contributed to the facilitation of the contextually inappropriate meaning. Chen (2009) distinguished biased monosyllabic homonyms from polysemous words and embedded both in contexts biased toward the dominant meaning. The results indicated that only the dominant meaning of a homonymous word was activated, whereas both meanings of a polysemous word were activated. The author argued that the processing of homonymy was compatible with the selective access model, whereas the processing of polysemy was compatible with the modular access model. It seems that meaning dominance may not influence the processing of related senses. However, subordinate-biased contexts are crucial for differentiating the two processing models.

Previous studies of Chinese lexical ambiguity resolution are summarized in Table 4.

Table 4. Chinese studies of lexical ambiguity resolution

| Studies | Factors | | | |
|---|---|---|---|---|
| | Ambiguity type | Meaning dominance | Experimental paradigm | Supporting models |
| Ahrens (2001) | Disyllabic homograph and polysemy | Balanced | Cross-modal priming | Modular |
| Li et al. (2002) | Disyllabic homophone | Biased | Cross-modal priming | |

Researchers have investigated lexical ambiguity resolution in Chinese in cross-modal priming experiments. However, these studies have yielded inconsistent results and supported different theoretical hypotheses. Context clearly influences word processing, but we are still far from reaching a consensus on the processing mechanisms. Furthermore, a comprehensive understanding of lexical ambiguity resolution in Chinese is not yet available for the natural situation of sentence reading. The goal of the present study is to examine the role of contextual information in the processing of Chinese two-character homographs and to explore the dynamics of semantic activation and integration. Experiment 1 is analogous to that in Sereno et al. (2006), manipulating three word types: low-frequency ambiguous words (A), low-frequency unambiguous words (LF), and high-frequency unambiguous words (HF). The LF unambiguous controls are matched to the form frequency of the homographs. Because Chinese homographs are inherently low-frequency words, their frequency is close in terms of both word form and meaning. Experiment 1 aims to revisit the subordinate bias effect (SBE) and to test which theoretical account is more consistent with the empirical data in the course of reading for comprehension. Experiment 2 is designed to obtain a clear time course of contextual influence and of the activation of word meanings and, in particular, to examine the status of the dominant meaning that underlies the SBE when a subordinate-biased context is given.

Prior to the eye-tracking experiments, the biased homographs, the unambiguous words, and the contextual constraints of these words were determined by several norming studies. Four subjective-rating norming tasks were conducted to measure each word's meaning preference, meaning relatedness, contextual predictability, and contextual bias. First, in the interpretation preference task, a word's meaning dominance was determined by the proportion of participants' first-interpretation responses. The results were used to select the biased homographs and unambiguous control words for the experiments. Second, a meaning relatedness task was conducted to make sure that the selected ambiguous words were homographs with two unrelated meanings. Third, a cloze task was conducted to ensure that the targets' predictability values from the preceding context were below .5. The last norming task was to confirm that the context before the target was biased toward the subordinate meaning.

3.1 Norming study one: Interpretation Preference Task

This task was designed to determine the dominant and subordinate meanings of Chinese biased homographs.

3.1.1 Participants

Forty undergraduate and graduate students (8 males and 32 females) aged 18 to 28 years (mean age = 21.2) were paid to participate in the interpretation preference task. All of them were native speakers of Mandarin Chinese.

3.1.2 Materials

Fifty-eight disyllabic ambiguous words were selected from 現代漢語多義詞詞典 (袁暉, 2001) and from the free association norms of common ambiguous words (Hue et al., 1996). The meanings of each ambiguous word share either the noun category or the verb category (26 NN and 32 VV ambiguous words). Fifty-eight HF and fifty-eight LF unambiguous control words were selected from the Academia Sinica Balanced Corpus.

Ambiguous and unambiguous words were mixed and divided into two lists, each containing twenty-nine ambiguous words, twenty-nine unambiguous HF words, and twenty-nine unambiguous LF words. All the words were presented in a randomized order in each list.

3.1.3 Procedure

The participants were instructed to read each target word, note the meaning that first came to mind, and then use the target to generate a complete sentence. Specifically, the sentence had to contain a preceding disambiguating or supporting context that clearly indicated the specific meaning of the target word. For example, “我聽到了風聲” (“I heard the 風聲”, where 風聲 can mean either the sound of the wind or a rumor) was an ambiguous sentence, because the preceding context cannot disambiguate the ambiguous word “風聲”. Three examples were given before the task began. The entire questionnaire took about one hour to complete. The meaning preferences expressed in the sentences were used to confirm the word types, namely ambiguous and unambiguous words, and to determine the relative meaning frequency of the ambiguous words.

3.1.4 Results

We classified participants' meaning preferences for the targets on the basis of the dictionary definitions in Chinese Wordnet (CWN) (Academia Sinica, 2008) and the MOE Revised Chinese Dictionary (教育部國語推行委員會, 1998[2007]). Each ambiguous word elicited at least two different interpretations, whereas the corresponding HF and LF control words were each given only one interpretation. An ambiguous word was classified as biased when at least 70% of the participants gave the same meaning preference. On this basis, forty-six biased ambiguous words met the criterion for having a dominant meaning, with a mean bias of 90% (range: 80%-100%) for the dominant meaning and 10% (range: 0%-20%) for the subordinate meaning.
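
The selection rule described above can be illustrated with a minimal Python sketch. It assumes that each participant's sentence has already been hand-coded with a meaning label; the function names, the labels `meaning_1` and `meaning_2`, and the response counts are hypothetical and are not taken from the norming data.

```python
from collections import Counter


def meaning_dominance(responses):
    """Return (most frequent meaning, its proportion) for one target word.

    `responses` holds one coded meaning label per participant.
    """
    counts = Counter(responses)
    meaning, n = counts.most_common(1)[0]
    return meaning, n / len(responses)


def is_biased(responses, threshold=0.70):
    """A word counts as a biased ambiguous word when it elicited at least
    two interpretations and one of them reached the threshold proportion."""
    _, proportion = meaning_dominance(responses)
    return len(set(responses)) >= 2 and proportion >= threshold


# Hypothetical coded responses from 20 raters for a single word.
coded = ["meaning_1"] * 18 + ["meaning_2"] * 2
print(meaning_dominance(coded), is_biased(coded))
```

Under this sketch, a word that elicited only one interpretation would be treated as unambiguous regardless of its proportion, which mirrors how the control words are described.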

3.2 Norming study two: Meaning Relatedness Task

The task adopted the rating procedure in Rodd et al. (2002) to ensure that the selected ambiguous words were homographs with two unrelated meanings.

3.2.1 Participants

Twenty undergraduate and graduate students (6 males and 14 females) aged 18 to 30 years (mean age = 21.7) were paid to participate in the meaning relatedness task. All of them were native speakers of Mandarin Chinese. None of them had participated in the prior norming study.

3.2.2 Materials

For each of the forty-six biased ambiguous words from norming study one, two short sentences were constructed: one conformed to the dominant meaning and the other to the subordinate meaning. The questionnaire therefore consisted of the ambiguous words, their sentences, and short definitions of their two meanings.

3.2.3 Procedure

Four lists were created with randomized order, and each list was read by five participants. Participants were given each ambiguous word together with short definitions of its two meanings. They were asked to read the definitions and sentences first and were then instructed to rate, on a 7-point scale (1 = not related, 7 = highly related), how related they thought the two meanings described by the sentences were. Two practice items were given before the task began. The entire questionnaire took about 20 minutes to complete. Examples of the questionnaire are provided in Appendix A.

3.2.4 Results

The average relatedness between the two meanings of a biased homograph was 2.19. Twenty-four ambiguous words with a mean relatedness rating of 1.73 (range = 1.1-2.75) were retained as homographs with two distinct meanings for the experiments.

3.3 Norming study three: Cloze Task

Contextual constraint has typically been operationalized as the predictability of a word from the preceding contextual information. Empirical evidence has shown that predictability tends to affect both the location and the duration of fixations (K. S. Binder, Pollatsek, & Rayner, 1999), which are considered the two main components of readers' eye movements. Predictability is usually assessed via a cloze task, in which raters write down a word to complete a sentence fragment; here it was used to control the predictability scores for the homographs and the HF and LF control words. This task was conducted to ensure that the preceding sentential context was equally unpredictive of the succeeding target word across conditions.

3.3.1 Participants

Forty undergraduate and graduate students (10 males and 30 females) aged 18 to 32 years (mean age = 24.6) were paid to participate in the cloze task. All of them were native speakers of Mandarin Chinese. None of them had participated in any of the prior norming studies.

3.3.2 Materials

For the seventy-two target words, we constructed preceding and succeeding disambiguating contexts biased toward the subordinate interpretation of each homograph, and a sentential context for each unambiguous word. The questionnaire contained the seventy-two sentence fragments preceding the targets.

3.3.3 Procedure

There were four lists, and the order of the sentence fragments in each list was randomized. Participants were presented with the sentence fragments and were asked to write down the word that first came to mind as a continuation of each fragment. Instructions and four practice items were given to make sure that they clearly understood the procedure. The entire questionnaire took about 30 minutes to complete.

3.3.4 Results

The predictability value of each target word was the proportion of the 20 participants who filled in the exact target. The mean predictability values for homographs, LF words, and HF words were 2.08%, 3.12%, and 4.20%, respectively (F < 1). Because participants often produced words with very similar meanings, the contexts were considered not predictive of, but supportive of, the targets. For instance, the target word “風聲” was generated by only 3 participants when the preceding context was “由於颱風肆虐，外頭傳來猛烈____” (“Because the typhoon was raging, a fierce ____ came from outside”). The responses given by other participants (e.g., 巨響, 風吹, 風雨, 颶風) were semantically congruent with the preceding context.
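
As a worked illustration of how such a predictability value is obtained, the following Python sketch computes the cloze probability of a target as the proportion of exact matches. Only the count of three “風聲” responses out of 20 comes from the text; the function name and the way the remaining 17 responses are divided among the other words are hypothetical.

```python
def cloze_probability(responses, target):
    """Proportion of responses that exactly match the target word."""
    return sum(r == target for r in responses) / len(responses)


# 3 of 20 raters produced the target; the split of the other 17
# responses across the listed alternatives is made up for illustration.
responses = ["風聲"] * 3 + ["巨響"] * 6 + ["風吹"] * 4 + ["風雨"] * 4 + ["颶風"] * 3

p = cloze_probability(responses, "風聲")
print(p)        # predictability of the target in this context
print(p < 0.5)  # checks the "below .5" criterion stated for the targets
```

Because only exact matches are counted, near-synonyms such as 風雨 do not raise the predictability value, which is why a context can be supportive without being predictive.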

3.4 Norming study four: Contextual Bias Task

The task was conducted to ensure that 75% or more of native speakers agreed that both the preceding and the succeeding contexts were biased toward the intended meaning. Tabossi et al. (1987) suggested that this level of context can be considered “strongly biasing.”

3.4.1 Participants

Ten undergraduate and graduate students (1 male and 9 females) aged 20 to 27 years (mean age = 21.9) were paid to participate in the contextual bias task. All of them were native speakers of Mandarin Chinese. None of them had participated in any of the prior norming studies.

3.4.2 Materials

Twenty-four complete sentences containing the homographs, determined in the preceding cloze task, were used in the questionnaire.

3.4.3 Procedure

Prior to the task, instructions, examples, and practice items were provided to familiarize the participants with the procedure. Participants first saw the preceding context up to the highlighted homograph and were asked to judge which meaning of the homograph the prior context supported. Once the participant had selected the appropriate meaning, the succeeding context was presented, and the participant was asked to judge the meaning again. Two practice items were given before the task began. The entire questionnaire took about 15 minutes to complete. Examples of the questionnaire are provided in Appendix B.

3.4.4 Results

Contextual bias was computed as the proportion of participants who selected the intended meaning in both the preceding and the succeeding context. The average contextual bias toward the subordinate meaning was .99, indicating that 99% of native speakers agreed on the intended meaning of the context; the linguistic contexts were thus strongly biased.
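
One possible reading of this measure is sketched below in Python: a participant counts toward the bias score of a sentence only if both judgments (after the preceding context and after the succeeding context) match the intended meaning. This both-judgments rule, the function names, the labels, and the judgments themselves are assumptions made for illustration; only the 75% criterion comes from the task description.

```python
def contextual_bias(judgments, intended):
    """Proportion of raters whose two judgments both match `intended`.

    `judgments` is a list of (after_preceding, after_succeeding) labels,
    one pair per rater.
    """
    hits = sum(pre == intended and post == intended for pre, post in judgments)
    return hits / len(judgments)


def is_strongly_biasing(judgments, intended, criterion=0.75):
    """Apply the 75% agreement criterion to one sentence."""
    return contextual_bias(judgments, intended) >= criterion


# Hypothetical judgments from 10 raters for one sentence.
ratings = [("subordinate", "subordinate")] * 9 + [("dominant", "subordinate")]
print(contextual_bias(ratings, "subordinate"))
print(is_strongly_biasing(ratings, "subordinate"))
```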

3.4.5 Interim summary

Twenty-four sets of experimental stimuli, each comprising a target and its sentential context, were selected on the basis of the subjective ratings and met the requirements of the research purpose. Table 5 summarizes the number of participants and the rating results of the four norming studies.

Table 5. The norming data summarized from four norming studies

| Norming study | Participants | A (low frequency) | LF | HF |
|---|---|---|---|---|
| 1. Interpretation preference | 40 | 92.9% (dominant-biased; range = 80%-100%) | 100% | 100% |
| 2. Meaning relatedness | 20 | 1.72 (range = 1.1-2.75) | - | - |
| 3. Cloze task | 40 | 0.02 (range = 0-0.25) | 0.03 (range = 0-0.4) | 0.04 (range = 0-0.5) |
| 4. Contextual bias | 10 | 99% (subordinate-biased) | - | - |

Experiment One: The interaction between meaning dominance and linguistic context

The aim of this thesis is threefold. First, we examine the interaction between meaning dominance and contextual information and revisit the subordinate bias effect (SBE) in Chinese lexical ambiguity resolution. Second, although context undoubtedly facilitates the selection of a homograph's meaning, we ask how early this contextual effect can be observed. Finally, we attempt to differentiate the competition and frequency accounts of the SBE and, more generally, to distinguish between the reordered access and selective access models of lexical ambiguity resolution. The homographs used in this experiment were inherently low in frequency in terms of both word form and meaning. The LF unambiguous control words were thus matched to the word-form frequency of the homographs. We predicted that if the dominant meaning was activated, a typical SBE (A > LF) would be found in both the target region and the post-target region (the two characters following the target), which would be consistent with the competition account. In contrast, if only the subordinate meaning was activated, the SBE would be eliminated (A = LF), which would support the frequency account. In addition, HF control words were included to obtain a word frequency effect (e.g., LF > HF), which can provide additional evidence for separating the two accounts. If the results supported the competition account, the SBE would be similar to or larger than the word frequency effect observed in the unambiguous case (A-LF ≥ LF-HF); otherwise, the word frequency effect would be larger than the SBE (A-LF < LF-HF), a result in accord with the frequency account.
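
The logic of these predictions can be made concrete with a small Python sketch that compares the two difference scores. It is purely descriptive: the function name and the reading times (in ms) are hypothetical and are not results of this study, and an actual test of the accounts would rest on inferential statistics rather than on a comparison of raw means.

```python
def compare_accounts(mean_a, mean_lf, mean_hf):
    """Return (SBE, frequency effect, favored account) from condition means.

    SBE = A - LF; frequency effect = LF - HF.
    An SBE at least as large as the frequency effect is read as favoring
    the competition account, a smaller SBE as favoring the frequency account.
    """
    sbe = mean_a - mean_lf
    frequency_effect = mean_lf - mean_hf
    favored = "competition" if sbe >= frequency_effect else "frequency"
    return sbe, frequency_effect, favored


# Hypothetical mean reading times in ms for the A, LF, and HF conditions.
print(compare_accounts(320, 280, 255))  # SBE larger than the frequency effect
print(compare_accounts(282, 280, 250))  # SBE close to zero (A roughly equal to LF)
```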

4.1 Method

4.1.1 Participants

Thirty participants (24 females and 6 males) were paid to participate in the experiment. Their mean age was 21.5 years (range: 19 to 28). All participants had normal or corrected-to-normal vision and were native speakers of Mandarin Chinese. None of them had taken part in the previous norming studies.

4.1.2 Materials and Design

There were three types of words in the experiment: LF homographs, LF controls, and HF controls. Twenty-four biased homographs were used in the present experiment, with a mean bias of 92% toward the dominant meaning and 7% toward the subordinate meaning. The average word-form frequency, obtained from the Academia Sinica Balanced Corpus (ASBC, 2004), was 6.03 per million for homographs, 7.48 per million for LF words, and 188.77 per million for HF words. The targets were all disyllabic words; in addition, the two meanings of each homograph shared the same syntactic category (NN or VV). Each homograph and its LF and HF controls were matched in stroke count, in the neighborhood size of the first constituent character (NS1), and in syntactic category. The average stroke counts for the homograph, LF, and HF words were 19.92, 20.17, and 20.33, respectively, and the average NS1 values were 40.92, 40.67, and 37. A one-way analysis of variance on word-form frequency showed a significant main effect of condition [F(2, 69) > 1, p < .01] and no significant difference between A and LF [F < 1]. The ANOVAs on stroke count and on NS1 revealed no significant differences across the three conditions [Fs < 1]. The means of the word properties and example sentences are presented in Table 6.
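
To show where the degrees of freedom in F(2, 69) come from, the following Python sketch computes a one-way between-items ANOVA by hand for three groups of 24 items. The function name and the frequency values are invented for illustration (they only roughly echo the reported condition means), and the sketch makes no claim about the software actually used for the analyses.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of the items around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up word-form frequencies (per million) for 24 items per condition.
a_items = [5.8 + 0.1 * (i % 5) for i in range(24)]
lf_items = [7.3 + 0.1 * (i % 5) for i in range(24)]
hf_items = [186.0 + 1.0 * (i % 5) for i in range(24)]

f_value, df1, df2 = one_way_anova([a_items, lf_items, hf_items])
print(df1, df2)     # 3 groups and 72 items give df = (2, 69)
print(f_value > 1)
```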

Homographs were embedded in sentences in which the preceding and succeeding contexts were semantically consistent with the subordinate interpretation. Targets were located between the 14th and 16th characters of a sentence, and each sentence contained 25 to 27 characters. The entire experiment consisted of 104 sentences in total, including 72 experimental sentences, 24 filler sentences, and 8 practice sentences. The filler sentences and practice trials were not included in the analysis. The experimental and filler sentences were mixed and randomly distributed into three lists. In each list, the number of items per condition was equal, namely 8 items in each condition. Each sentence spanned one line and was presented in the middle of the PC screen. A participant saw each item only once, and about one-third of the trials were followed by untimed true-or-false questions, which were designed to ensure that participants read for comprehension. There were four blocks of 24 trials, with block order counterbalanced across participants, for a total of 96 trials. The experimental sentences are listed in Appendix C.

Table 6. Means of word frequency, strokes, and neighborhood size of first constituent character for the target words on each condition and example of materials used in each condition

Condition Means of Frequency

Note. A = ambiguous words; LF = low-frequency controls; HF = high-frequency controls; Means of Frequency = per million words; the targets are shown in bold italics in the example sentences.

4.1.3 Apparatus

Eye movements were recorded with an SR Research EyeLink 1000 Desktop Mount eye-tracking system. Viewing was binocular, but eye movements were recorded from the dominant eye only. The eye tracker sampled gaze position every millisecond. Each sentence was presented in black on a grey background and displayed on a single line of up to 27 characters. Each character was 34 × 34 pixels in size. Participants were seated with their eyes 70 cm from the screen, and the width of one character together with the space before it subtended about one degree of visual angle.
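
The relation between physical size, viewing distance, and visual angle can be checked with the standard formula θ = 2·arctan(w / 2d). The Python sketch below applies it to the 70 cm viewing distance; the function names are illustrative, and the physical width of a character on the screen is not reported in the text, so the sketch solves for the width that would subtend one degree rather than assuming one.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg, distance_cm):
    """Physical size (in cm) that subtends a given visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Width that would subtend one degree at the reported 70 cm distance.
width = size_for_angle_cm(1.0, 70)
print(round(width, 2))                       # roughly 1.22 cm
print(round(visual_angle_deg(width, 70), 6)) # recovers 1 degree
```

On this calculation, a character plus its preceding space would need to occupy roughly 1.22 cm on the screen to subtend one degree at 70 cm.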

4.1.4 Procedure

When participants arrived for the experiment, they were given a consent form and tested for their dominant eye. Participants were tested individually in a dimly lit, noise-attenuated room. They were seated in front of the monitor with their heads stabilized by a forehead and chin rest to minimize head movement during the experiment. At the beginning, participants were instructed to read the sentences for comprehension without deliberately memorizing them. A five-point or three-point calibration and validation were performed at the first trial of each block (four blocks in total). After the calibration was checked, participants were asked to fixate on a cross located at the position of the first character of the sentence. Once they had accurately fixated on the assigned area, the cross disappeared and the sentence
