
4.3 Pretest Three—Familiarity Rating Task for Ambiguous Words

4.3.5 Verification of the Ambiguous Words

To keep the experimental conditions for the two groups (homonymous words and

polysemous words) equated and to avoid other potential confounding factors, these

two groups were also controlled for a number of variables (in addition to word

frequency and experiential familiarity), including percentage of verb usage, meaning

frequency distribution, percentage among confounding homophones, number of

senses, number of syntactic categories, phonological neighborhood density, and

number of participant roles. The homonymy group and polysemy group were verified

to have no significant differences with regard to these variables so that the

experimental results would not be confounded. Through the verification process, the

potential outliers were screened out and thus 16 homonyms and 16 polysemes were

selected as the finalized stimuli (as primes). The potential confounding factors

were verified as follows:

(1) Percentage of verb usage. We compared how often the two groups of

ambiguous words were used as verbs in order to avoid cross-categorical problems, and

to ascertain that the two groups did not differ regarding the percentage of verb usage

[t (30) = .684, p = .249].

(2) Meaning frequency distribution (Ranking percentage of the primary and

secondary meaning). According to the data collected from the sense ranking task, we

compared the strength of the two meanings of each ambiguous word in the two groups,

in order to make sure that each selected ambiguous word was unbalanced (polarized) in its meaning distribution. That is, the frequency of the primary meaning should be greater than that of the secondary meaning [t (30) = 4.946, p < .001 for the homonymy group; t (30) = 3.359, p = .003 for the polysemy group]. The two groups were also compared, and they did not differ in the ranking percentage of either the primary or the secondary meaning [primary meaning: t (30) = .898, p = .188; secondary meaning: t (30) = -.578, p = .284]. In addition, the difference in ranking percentage between the primary and secondary meanings in the homonymy group did not differ from that in the polysemy group [t (30) = 1.138, p = .264].

(3) Influence of confounding homophones. Since the stimulus words would be

presented auditorily in a cross-modal experiment, it would be necessary to avoid the

possibility that the stimulus words might be confounded by their homophones. The

frequency of each homophone word (i.e., the total frequency of those words with the

same sound) was counted (according to the data from Sinica Corpus 5.0 retrieved by

Chinese Wordsketch), and the proportion of the character frequency of each stimulus

word among the total frequency of all its homophones was then calculated. A higher percentage indicates that a word is less likely to be confounded with other homophones, and a lower percentage indicates that it is more likely to be confounded. For example, the frequency of the prime word “送” was 1363 (based on Sinica Corpus 5.0), and its homophones (song4) included “送”, “宋”, “頌”, “訟” and “誦”. The total frequency of these homophones was 1801. The frequency percentage of “送” among its confounding homophones was thus 1363/1801 = 75.68%. By comparing this percentage across the two ambiguity groups, we made sure that the groups were not significantly different in terms of their confounding homophones [t (30) = -.194, p = .424].

(4) Number of meanings/senses. To avoid the NOM (number-of-meaning)/NOS (number-of-sense) effect pointed out by Borowsky and Masson (1996), Lin (1999), Pexman and Lupker (1999), and Piercey and Joordens (2000), whereby the number of senses a word has influences lexical processing, the number of senses of each word in both ambiguity groups was counted (based on Chinese Wordnet and the MOE Revised Chinese Dictionary) and compared to make sure that the two groups did not differ significantly in their numbers of senses [t (30) = .164, p = .436].

(5) Number of syntactic categories. To remove the influence of the NOC (number-of-category) effect (Huang & Chang, 2004; Huang et al., 2002; Tsai, 2005), whereby the number of parts of speech a word has affects lexical access, the number of syntactic categories of each word in both ambiguity groups was counted (according to Chinese Wordnet and the MOE Revised Chinese Dictionary) and compared to ascertain that the two groups did not differ in their numbers of syntactic categories [t (30) = .745, p = .231].

(6) Phonological neighborhood density (size). Neighborhood density refers to the number of word representations that sound like a given word. Words with few similar-sounding neighbors (a sparse neighborhood) and words with many similar-sounding neighbors (a dense neighborhood) may produce significantly different effects in word recognition (Luce & Pisoni, 1998; Vitevitch & Rodríguez, 2005; Vitevitch & Stamer, 2006). In Chinese, the phonological neighborhood density of a word is defined as the number of disyllabic (two-character) words sharing the sound of the initial constituent character (Tsai, Lee, Lin, Tzeng, & Hung, 2006). For example, disyllabic words such as ji4zhe3 ‘reporter’ (記者), ji4yi4 ‘memory’ (記憶), ji4hua4 ‘plan’ (計畫), and ji4nian4 ‘memorial’ (紀念) are all neighbors of the word ji4. We counted the neighbors of each stimulus word, based on data from the system SouCiXunZi ‘Search for Word and Character’ (搜詞尋字),¹⁶ and compared the two groups in order to validate that they did not differ significantly in phonological neighborhood density [t (30) = .657, p = .258].

(7) The argument structure of verbs. Li, Shu, Liu, and Li (2006) indicated that information about a verb’s arguments is an integral part of the mental representation of verbs, and that this information is accessed on-line during sentence processing. Similarly, Ahrens and Swinney (1995) and Ahrens (2003) suggested that the number of participant roles (or thematic roles) associated with the central sense of the verb is crucial information for lexical access in sentence processing. Using a cross-modal lexical decision task, they found that reaction times following verbs with three participant roles were longer than those following verbs with one or two participant roles. For example, the two-role verb kick was integrated faster into the sentence “It was to Robert that the football with a logo was KICKED” than the three-role verb give was into the sentence “It was to Jen that the rabbit from Mike was GIVEN”. It was thus suggested that the number of participant roles associated with a verb influences the verb’s rate of integration into the sentence (Ahrens, 2003). Accordingly, we counted the number of participant roles associated with the central sense of each stimulus word and compared the two ambiguity groups, in order to make sure that they were not significantly different in terms of this variable [t (30) = .591, p = .279]. Moreover, all of the stimulus words were checked (on the basis of Chinese Wordnet and the MOE Revised Chinese Dictionary) to be transitive verbs in order to keep the stimulus words homogeneous.

16 “搜詞尋字” is an on-line retrieval system (http://words.sinica.edu.tw/) run by the Institute of Linguistics, Academia Sinica. This system consists of a five-million-word corpus for users
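Two of the item-level measures in the list above, the homophone percentage in (3) and the phonological neighborhood density in (6), can be illustrated with a minimal Python sketch. The function names and the toy lexicon are hypothetical and were not part of the original procedure; the counts in the study came from Sinica Corpus 5.0 and SouCiXunZi, not from code like this. Only the figures 1363 and 1801 and the four ji4 words are taken from the text.

```python
def homophone_share(word_freq: int, total_homophone_freq: int) -> float:
    """Percentage of a homophone set's total frequency contributed by one word."""
    if total_homophone_freq <= 0:
        raise ValueError("total homophone frequency must be positive")
    return 100.0 * word_freq / total_homophone_freq


def neighborhood_density(first_syllable: str, lexicon: dict[str, tuple[str, ...]]) -> int:
    """Count disyllabic words whose first syllable (including tone) matches."""
    return sum(
        1
        for syllables in lexicon.values()
        if len(syllables) == 2 and syllables[0] == first_syllable
    )


# Frequencies for the prime word 送 (song4) as reported in the text.
song_share = homophone_share(1363, 1801)
print(f"{song_share:.2f}%")  # 75.68%

# Toy lexicon (an assumption for illustration): the four ji4 examples
# from the text plus two entries that should not be counted.
TOY_LEXICON = {
    "記者": ("ji4", "zhe3"),
    "記憶": ("ji4", "yi4"),
    "計畫": ("ji4", "hua4"),
    "紀念": ("ji4", "nian4"),
    "機會": ("ji1", "hui4"),  # different tone, so not a neighbor of ji4
    "送": ("song4",),         # monosyllabic, so never counted
}
ji4_density = neighborhood_density("ji4", TOY_LEXICON)
print(ji4_density)  # 4
```

A higher value from `homophone_share` corresponds to a word that is less likely to be confounded with its homophones, as described in (3).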

In sum, the homonymy group and the polysemy group were compared and confirmed not to differ regarding the above seven factors, nor regarding word frequency [t (30) = -.435, p = .333] or word familiarity [t (30) = .646, p = .202]. In other words, the conditions for these two groups of ambiguity were not significantly different except for their sense relatedness rating scores [t (30) = -14.048, p < .001], which was an important variable manipulated in the present study.

Please refer to Appendix 11 for the items and complete statistical data.
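The two-group comparisons reported above all have 30 degrees of freedom, which is what an independent-samples t-test on two groups of 16 items yields (16 + 16 - 2). The following is a minimal sketch of such a test statistic, assuming the pooled-variance (Student) form; the source does not state which variant or software was used, and the scores below are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances weighted by df).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical scores for 16 homonyms and 16 polysemes (not the study's data).
homonym_scores = [1, 2, 3, 4] * 4
polyseme_scores = [1, 2, 3, 5] * 4

t_value, df = pooled_t(homonym_scores, polyseme_scores)
print(f"t({df}) = {t_value:.3f}")
```

Obtaining a p-value additionally requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would typically be used for that step.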

In addition to the 16 homonyms and 16 polysemes, a list of 16 unambiguous words

was constructed and added as filler items. The homonymy, polysemy, and

unambiguity groups were also compared, and they did not differ with regard to word

frequency [F(2, 45) = .428, p = .654], percentage of verb usage [F(2, 45) = 1.028, p

= .366], influence of homophones [F(2, 45) = .041, p = .96], number of syntactic

categories [F(2, 45) = 1.642, p = .205], number of participant roles [F(2, 45) = 1.047,

p = .36] and neighborhood density [F(2, 45) = 1.576, p = .218].
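The degrees of freedom F(2, 45) follow from a one-way ANOVA over three groups of 16 items each (3 - 1 = 2 between groups, 48 - 3 = 45 within groups). A minimal sketch of the F statistic is given below; it is an illustration under that assumption, not the analysis script used in the study, and the scores are invented.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """F statistic with between- and within-group degrees of freedom."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual items around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = len(all_values) - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for the three prime groups (not the study's data).
homonymy = [1, 2, 3, 4] * 4
polysemy = [2, 3, 4, 5] * 4
unambiguity = [3, 4, 5, 6] * 4

f_value, df1, df2 = one_way_anova_f(homonymy, polysemy, unambiguity)
print(f"F({df1}, {df2}) = {f_value:.3f}")
```

The toy groups here differ on purpose so that the statistic is easy to check by hand; in the study, the non-significant F values indicate that the three groups were matched on each variable.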

Table 4.1 summarizes the statistical data of the experimental stimuli with respect

to each of the variables. The complete list of the experimental items (48 words in total)

is provided in Appendix 11.

Table 4.1. The statistical data of the three groups of prime words

Variables     Prime groups    N     Mean     SD       T-test    ANOVA

homophones    Unambiguity     16    60.51    40.78              F(2, 45) = .041,
