UTTERANCE LENGTH AND THE DEVELOPMENT OF MANDARIN CHINESE
Hintat Cheung National Taiwan University
Abstract
This study examined the application of MLU (Mean Length of Utterance) in Chinese. Three questions were addressed: (1) Is MLU a valid measure for the acquisition of Mandarin? (2) What is its effective range of application? (3) What should the counting unit of MLU be in Mandarin? We followed five children (age range: 1;6 to 3;6) for two years and collected 67 one-hour spontaneous speech samples. Two MLU measures were computed, one counted by word and the other by syllable. Both MLU measures correlated significantly with age. Based on the developmental patterns of these children, we suggest that MLU 3.5 be taken as the upper limit of application. In addition, MLU in word (MLUw) and MLU in syllable (MLUs) correlated significantly with each other. In conclusion, MLU is a valid measure for assessing preschool children's expressive ability. With older subjects and higher MLU scores, however, the results should be interpreted more conservatively.
1. Introduction
MLU (Brown, 1973) is a language index widely used in North America for screening developmental language disorders and for matching children's language ability in experimental studies or longitudinal reports. Although MLU was designed to assess the language development of English-speaking children, its application has been extended to other languages, such as Irish (Hickey, 1991), Spanish (Linares, 1983) and Hebrew (Dromi & Berman, 1982). MLU has also been used as an index of Chinese children's language development (Cheng, 1988; Erbaugh, 1993; Wang, Lillo-Martin, Best & Levitt, 1992). However, the use of MLU in these developmental psycholinguistic studies has two deficits. First, it has been demonstrated for English that MLU is no longer a valid measure once its value exceeds a certain level. For example, Brown (1973) set the upper limit at 4.5, and Scarborough, Rescorla, Tager-Flusberg, Fowler and Sudhalter (1991) suggested a more conservative level of 3.0. The upper bound of MLU in Chinese, however, is not known, which leaves its scope of application in question. Second, these Chinese studies reported MLU values but did not describe the computing procedures. It is unclear whether they employed the same length unit and the same computing scheme. Without such information, observations from these studies cannot be compared or synthesized on the basis of their MLU values, which defeats the purpose of reporting MLU. MLU studies in languages other than English have also shown that the computing scheme is crucial to the validity of the measure, and a language-specific computing procedure often had to be developed. Given these technical concerns, an evaluation of the validity of MLU and of its computing procedures is urgently needed.
There are two possible methods for examining the upper bound of MLU. One is to collect longitudinal child language data and compare changes in MLU with the children's grammatical development; Brown's (1973) upper bound is based on this method. The other is to examine a large pool of speech samples on a cross-sectional basis and compare MLU with other indices of linguistic development. Scarborough et al.'s (1991) study represents this approach.
This research employs both methods to examine MLU in Chinese. In Study One, we followed Brown's paradigm and examined the spontaneous speech samples of five children, each covering a span of twelve months. The validity of MLU is evaluated by (1) its correlation with chronological age and (2) the children's grammatical development at several MLU levels. In Study Two, we elicited speech samples from eighty four- to seven-year-old children in a story-telling format. Their scores on the Oral Expressive Abilities Test (a locally developed production test) were also obtained.
Computation of MLU
The evaluation of MLU computing procedures focuses on the unit of counting. MLU in English is often thought of as an index that uses the morpheme as its unit. However, a closer examination shows that in most cases the word is the unit, and only a small class of inflectional morphemes (i.e. Brown's 14 morphemes) enjoys a special status. For our purposes, word counting is adopted. A second measure, MLU in syllable, is also used because of the close correspondence between the syllable and the written character in Chinese. The two MLU variants are compared in terms of their correlations with age and their correlation with each other.
MLU Variants
Two MLU variants are computed in order to evaluate their effectiveness as the basic unit of utterance length. They are:
a. MLU in syllable (MLUs)
b. MLU in word (MLUw)
a. MLUs
MLUs is self-explanatory. Every syllable is counted, regardless of whether it is part of a reduplication, a place name or a transliteration. For example, 媽 媽 來 is counted as a 3-unit utterance while 她 來 is two; a place name like 國父紀念館 contains five units while 麥當勞 is three.
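The syllable count can be sketched in a few lines of Python. Because each Chinese character is written for one syllable, the sketch simply counts characters; the function name `count_syllables` and the restriction to the basic CJK Unified Ideographs block are assumptions made for illustration and are not part of the original procedure.

```python
def count_syllables(utterance: str) -> int:
    """Count one unit per Chinese character (assumed: one character = one syllable)."""
    # Assumption: only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF)
    # is counted; spaces, punctuation and Latin letters are ignored.
    return sum(1 for ch in utterance if "\u4e00" <= ch <= "\u9fff")


print(count_syllables("媽 媽 來"))    # 3
print(count_syllables("她 來"))       # 2
print(count_syllables("國父紀念館"))  # 5
print(count_syllables("麥當勞"))      # 3
```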
b. MLUw
MLUw (MLU in word) counts one unit for a reduplication or for the name of a person or a place. For example, 2 units are counted for an utterance like 媽媽 跑跑跑, and one unit each for 麥當勞 and 動物園. The word unit used here follows the definitions of Chao (1968) and Chu (1982). In brief, a word has the following three structural properties:
a. Minimal free form
It is the smallest unit that can form an utterance. Chao (1968) also suggested a pause-insertion test: if a pause can be inserted within a bisyllabic item, the item consists of two words.
b. Expandability
If another lexical item can be inserted within a bisyllabic item, two words are counted. For example, 不 can be inserted into 看 完 to give 看 不 完, so 看 完 is two words. 新 衣 服 is counted as two words because 的 can be inserted to form 新 的 衣 服 without changing the meaning. However, 鐵 路 is one word because 鐵 的 路 means something different.
c. Versatility
Constituents that enter into only a limited set of combinations are not granted word status. For example, 睡 in 睡 覺 shows a restricted combination; therefore 睡 覺 is one word. 散 步 and 理 髮 are examples of the same kind. In these examples the principle of versatility overrules expandability. In addition, because of the isolating character of Chinese, a group of bound morphemes, such as 了, 的, 著 and 們, would not be counted as words under the above principles. These morphemes in some sense parallel Brown's 14 morphemes and may mark important developmental changes in child language data. They are therefore counted as separate words. The differences between MLUs and MLUw are shown in Table 2.
Table 2. Differences between MLUs and MLUw
Type                        Example                       MLUs   MLUw
1. V+V (free)               進 來, 出 去                  2      1
2. V+V (bound)              忘 記, 知 道                  2      1
3. V+N (productive)         看 書, 買 菜                  2      2
4. V+N (restricted)         跳 舞, 跑 步                  2      1
5. Noun (names)             長褲, 火車, 茶杯, 動物園      2/3    1
6. Noun (Location A)        桌子 上, 房間 裡              3      1
7. Noun (Location B)        上 面, 外 面, 這 裡           2      1
8. Number + classifier      一 個, 兩 隻, 這 個           2      1
9. Determiners + Nouns      這 隻 牛, 那 本 書            3      2
10. Pronoun (I)             我, 你, 他, 自己              1      1
11. Pronoun (II)            我們, 他們                    2      2
12. Adjective               漂亮, 黑黑, 好                1      1
13. Negation                不, 沒, 不要, 沒有            1/2    1
14. Adverbs                 很, 非常, 已經                1/2    1
15. Time                    今天, 昨天, 天天              2      1
16. Conjunction             可是, 因為, 所以, 跟          2/1    1
17. Grammatical Part.       的, 了, 著, 過                1      1
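Once a transcript has been segmented by hand according to the word criteria, both length measures can be read off an utterance mechanically. The sketch below assumes, hypothetically, that word units are separated by whitespace in the transcript (so that 媽媽 跑跑跑 is entered as two units); `utterance_lengths` is an illustrative helper and not the author's program.

```python
from typing import Tuple


def utterance_lengths(segmented: str) -> Tuple[int, int]:
    """Return (word units, syllable units) for one pre-segmented utterance.

    Assumption: whitespace separates word units, and every Chinese
    character inside a unit stands for one syllable.
    """
    words = segmented.split()
    syllables = sum(
        1 for word in words for ch in word if "\u4e00" <= ch <= "\u9fff"
    )
    return len(words), syllables


print(utterance_lengths("媽媽 跑跑跑"))  # (2, 5)
print(utterance_lengths("麥當勞"))       # (1, 3)
print(utterance_lengths("看 書"))        # (2, 2)
```

The segmentation itself still has to be done by a transcriber applying the three structural properties; the code only counts what has already been marked.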
Rules for calculating MLU
The first 100 utterances are counted. The length units of these utterances are summed and the total is divided by 100. However, the following utterances are excluded:
a. Immediate repetition of adults' speech. For example:
Adult 我 們 去 拿 飛 機.
b. Recitation of nursery rhymes. For example:
Child 小 老 鼠 上 燈 台 , 偷 油 吃 下 不 來 .
c. A list of numbers or objects. For example:
Adult 我 們 來 數 數 看
Child 1234567.
d. Interrupted utterances.
e. Partially or totally unintelligible utterances.
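The counting rules above can be sketched as follows. The sketch assumes that each utterance has already been given a length and an exclusion flag by the transcriber, that the 100 counted utterances are the first 100 that survive exclusion, and that a shorter sample is averaged over however many utterances it has; the `Utterance` class and `mean_length_of_utterance` are hypothetical names, not the author's program.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    units: int              # length in the chosen unit (words or syllables)
    excluded: bool = False  # repetition, rhyme, list, interrupted or unintelligible


def mean_length_of_utterance(
    utterances: List[Utterance], sample_size: int = 100
) -> Optional[float]:
    """Average length of the first `sample_size` utterances that are not excluded."""
    counted = [u.units for u in utterances if not u.excluded][:sample_size]
    if not counted:
        return None  # nothing left to count
    return sum(counted) / len(counted)


sample = [Utterance(3), Utterance(7, excluded=True), Utterance(2), Utterance(4)]
print(mean_length_of_utterance(sample))  # 3.0
```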
Study I
Subjects
Five children were recruited. They all live in the Greater Taipei metropolitan area and speak Mandarin Chinese as their first language. Their demographic information is shown in Table 1.
Table 1. Demographic Information of the Five Subjects
ID      Age (by Oct. 1994)   Gender
XU      1;6                  Male
CHOU    2;0                  Male
LIN     2;2                  Female
WANG    2;5                  Female
CHEN    3;6                  Male
Speech Samples
All five children were visited once a month. Each visiting session lasted at least one hour. Researchers played with the children while audio-recording their speech. Most of the time, parents were present, attending to domestic duties or playing with their children. In total, 58 one-hour speech samples have been [...] after the visit. Transcriptions were edited directly on personal computers.
Transcription Format
Computer files are edited in a format conforming to PAL (Pye 1987). PAL is a set of computer programs that provides a preliminary analysis of a child's language sample. It can provide word frequencies and lexical and syntactic lexicons. A computer program that meets the technical requirements of PAL was written by the principal investigator of this project to count MLU in Chinese. After the speech samples were properly segmented, MLU scores and lexical concordances were generated by these programs. A sample PAL file and an MLU output file are presented in Appendix I and Appendix II.
3. Results
MLU values from each data set are reported individually first, followed by the results of the correlational analyses. Finally, all five data sets are pooled to examine the general growth pattern of MLU.
Individual data set
(A) CHOU
Chou's first MLUw is 1.93 and his highest MLUw is 3.19 (at 30 months). Neither MLUw nor MLUs correlates significantly with age.
Table 3. Chou's MLUw and MLUs
Age in Months   MLUw   MLUs
25              1.93   2.86
26              1.82   2.89
27              1.98   3.2
28              2.63   3.89
29              2.12   3.35
30              3.19   4.7
31              2.29   3.35
32              2.01   3.2
33              2.54   4.09
MLUw and Age: r = .5234, p = .148
MLUs and Age: r = .4358, p = .241
(B) CHEN
Chen's first MLUw is 2.95 and his highest MLUw is 3.71 (at 49 months). His MLUw increases steadily between 42 months and 46 months. Then there is a drop. The correlation between MLUw and age approaches significance.
Table 4. Chen's MLUw and MLUs
Age in Months   MLUw   MLUs
42              2.95   3.98
43              2.92   3.73
44              3.46   4.32
45              3.42   4.23
46              3.69   4.99
47              3.06   3.64
48              3.16   4.36
49              3.71   4.9
50              3.52   4.42
51              3.68   4.71
MLUw and Age: r = .6281, p = .052
MLUs and Age: r = .5352, p = .111
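One way to see why r = .6281 falls just short of significance with ten samples is to convert it to a t statistic. The sketch below assumes that the reported p values come from the usual two-tailed t test on a Pearson correlation with n - 2 degrees of freedom; the text does not state which test was used, and `t_from_r` is an illustrative name.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r from n paired observations (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Chen: MLUw and age, ten monthly samples.
t = t_from_r(0.6281, 10)
print(round(t, 2))  # 2.28
```

With 8 degrees of freedom the two-tailed critical value at the .05 level is 2.306, so a t of about 2.28 sits just below it, in line with the reported p = .052.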
(C) WANG
Wang's first MLUw is 3.48 and it fluctuates between 3.30 and 4.60. Neither MLUw nor MLUs correlates significantly with age.
Table 5. Wang's MLUw and MLUs
Age in Months   MLUw   MLUs
29              3.48   4.72
31              3.66   4.58
32              3.53   5.08
33              3.74   5.12
34              3.53   4.97
35              4.43   5.78
36              3.49   4.86
37              4.51   6.01
38              3.30   4.61
39              3.57   4.52
40              4.59   6.17
MLUw and Age: r = .3034, p = .336
MLUs and Age: r = .3414, p = .277
(D) XU
Xu's first MLUw is 1.11 and his highest observed MLUw is 2.52 (at 28 months). Both MLUw and MLUs correlate significantly with age.
Table 6. Xu's MLUw and MLUs
Age in Months   MLUw   MLUs
18              1.11   1.97
19              1.08   1.71
20              1.18   1.71
22              1.18   2.1
23              1.73   2.43
24              1.71   2.82
25              1.77   2.57
26              2.01   2.96
27              2.36   3.29
28              2.52   3.85
29              2.43   3.59
MLUw and Age: r = .9610, p = .000
MLUs and Age: r = .9469, p = .000
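The correlation reported for Xu can be reproduced from the values in Table 6. The sketch below assumes that the reported r is an ordinary Pearson product-moment coefficient, which the text does not state explicitly; `pearson_r` is an illustrative helper written out in full so that no statistics package is needed.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Xu's ages (in months) and MLUw values, copied from Table 6.
age = [18, 19, 20, 22, 23, 24, 25, 26, 27, 28, 29]
mluw = [1.11, 1.08, 1.18, 1.18, 1.73, 1.71, 1.77, 2.01, 2.36, 2.52, 2.43]
print(round(pearson_r(age, mluw), 2))  # 0.96
```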
(E) LIN
Lin's first MLUw is 1.6 and it goes up steadily, with a sudden rise at 30 months (MLUw = 2.37). Only MLUw correlates with age significantly.
Table 7. Lin's MLUw and MLUs
Age in Months   MLUw   MLUs
26              1.6    2.59
27              1.53   2.45
28              1.84   2.78
29              1.87   2.61
30              2.37   3.8
31              1.99   3.01
32              2.13   2.91
33              2.03   2.53
35              2.25   2.81
MLUw and Age: r = .7770, p = .014
MLUs and Age: r = .635, p = .635
Pooled Data
Of the five data sets, two show a significant correlation with age. Each data set covers only a limited range, which may obscure the correlation between MLUw and age. We therefore pooled all the samples and ran another correlation analysis. The results showed that both MLUw and MLUs correlate significantly with age (MLUs: r = .6340, p < .001; MLUw: r = .7125, p < .001). In addition, MLUs correlates highly with MLUw (r = .9766, p < .001).
Figure 1. Pooled MLUw and MLUs [plot titled "MLUW and Age"; AGE axis from 10 to 60, MLUW axis from 1.0 to 5.0]
5. Discussion
Fifty-eight speech samples from five children have been analyzed. Results from the individual data sets indicate that MLUw correlates with age when MLUw is in a lower range. Xu's and Lin's MLUw values are below 3.00. Wang's MLUw was already above 3.00 when we started the research, and her MLUw does not correlate with age. Chen's first MLUw was just below 3.00, and his MLUw-age correlation falls marginally short of significance. Chou's data set is harder to interpret: his MLUw starts at 1.93 (at 25 months), rises to 3.19 (at 30 months), and then falls to 2.01 (at 32 months). Data from the first four children suggest that MLUw 3.0 is the point beyond which utterance length no longer increases hand in hand with age. From Wang's data, we can also hypothesize that there is a period of fluctuation in MLUw, probably between MLUw 3.0 and 5.0, during which MLUw may become more sensitive to contextual factors such as the activities in the recording session and the child's emotional state.
Figure 1 suggests that there is a plateau in the growth of MLUw. There are two possible accounts of this flattening of the MLU growth curve. First, children may use more specific words and more complex constructions after MLUw 3.0. Their language development may then be reflected not in utterance length but in the semantic and syntactic complexity of phrasal constructions. They may learn many specific names for things or places, and they can construct more complex noun phrases, such as 我 的 杯 杯 in place of the earlier 佳 錚 杯 杯 (the girl's name is 佳 錚). The overall utterance length may remain the same while the richness of their syntax and semantics has advanced to a higher level. Second, capacity limitations could also halt the growth of MLU. However, we have no information on these children's memory span at the time of recording and cannot evaluate this account.
Since our data stop at 51 months (from Chen), we do not know whether MLU will increase after this point. Based on our results, three hypothetical growth curves can be generated, as depicted in Figure 2.
Figure 2. Hypothetical MLU Growth Curves [three curves, A, B and C, showing MLU against age, with MLU 3.0 and the adult stage marked]
If MLU increased as line A shows, there would have to be errors in the computing procedure. Line B indicates that MLU continues to rise after a plateau. The plateau means that MLU values vary within a certain range with no observable growth. In our data, such a plateau is found for Wang and Chen. Line C shows no substantial increase in MLU beyond a certain critical value (estimated to be 3.0 from our data). These three growth curves will be examined against the second-year data.
The most interesting finding is that MLUs correlates highly with MLUw. One interpretation is that the data contain a high percentage of monosyllabic, monomorphemic words. The finding also indicates that the MLU values reported in previous studies can be compared with each other, even if one study used the word as the length unit and another the syllable.
6. Conclusion
This study has examined the validity of MLU. The results reported here cover data from five Mandarin-speaking children growing up in Taiwan. A total of 58 speech samples were collected, from which two MLU variants, MLUw and MLUs, were computed. Correlational analyses show that MLUw correlates significantly with age. Interestingly, MLUw and MLUs also correlate highly with each other. Results from the individual data sets indicate that MLUw is a good index at the early stage of language development. From our data, we tentatively suggest that MLU 3.0 is the upper limit of application. This, of course, has to be verified by future studies.
References
Brown, R. 1973. A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Chao, Y-R. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
Cheng, S. 1988. Beginning negative sentences among Mandarin speaking toddlers: With special reference to the differentiation between "BU" and "MEI YOU". Chinese Journal of Psychology, 30, 47-63.
Chu, D. X. 1982. Yufa Jiangyi. (朱德熙。語法講義。北京:商務印書館)
Dromi, E. & Berman, R. 1982. A morphemic measure of early language development: Data from Hebrew. Journal of Child Language, 9, 403-24.
Erbaugh, M. 1982. Coming to order: Natural selection and the origin of syntax in the Mandarin speaking child. Unpublished doctoral dissertation, University of California, Berkeley.
Hickey, T. 1991. Mean length of utterance and the acquisition of Irish. Journal of Child Language, 18, 553-69.
Hsu, H. 1986. A study of the various stages of development and acquisition of Mandarin Chinese by children in Chinese milieu. NSC report.
Linares, N. 1983. Rules for calculating Mean Length of Utterances in morphemes for Spanish. In D. R. Omark & J. R. Erickson (Eds.) The Bilingual Exceptional Child. San Diego: College-Hill Press.
Scarborough, H. S., Rescorla, L., Tager-Flusberg, H., Fowler, A. E. & Sudhalter, V. 1991. The relation of utterance length to grammatical complexity in normal and language-disordered groups. Applied Psycholinguistics, 12, 23-45.
Wang, Q., Lillo-Martin, D., Best, C., & Levitt, A. 1992. Null subjects versus null object: Some evidence from the acquisition of Chinese and English. Language Acquisition, 2, 221-254.
Appendix I
A Sample Speech File in PAL Format
$ Child, Examiner, Another examiner, Mother
+ Child 吳永秀 2;8
E 你告訴阿姨哦, 中間是哪一個?
E 這個.
E 哪一個?
E 哪一個是中間?
C 中間 的 這 個 這 個.
E (S) Hou, 三個哦.
E 前面是哪一個?
C 這 個 啊.
E 阿後面是哪一個?
C 這 個.
E 嘿.
E 阿中間是哪一個?
E 哪一個是中間?
E 中間.
+OVERLAP
C <XXX>.
E 那個洞那裡是中間.
E (跟媽媽講話).
E 來.
E 來, 現在阿姨阿姨做哦哼.
E 好, 現在阿姨說你把小汽車放在中間.
E 三個哦, 那小汽車要排在中間, 你排給阿姨看.
E 哪一個排前面啊?
E 好.
E 它要去哪裡?
E 它要去哪裡?
E 那這個勒, 這個誰開啊, 這個誰開?
C 這 個 叔叔 開 這 個 也 是 加 這 個 的.
E 阿那阿姨坐哪裡啊?
C 坐 這 裡.
E 後面啊, 你讓阿姨坐後面啊.
E 蛤, 阿姨坐後面啊.
+C (笑).
E 是不是?
C 坐 這 裡 啦.
C 我 坐 這 裡.
Appendix II
Output File: MLU Computing Program
***** MLU of c:\ryang\ryang22.dat***** 08-03-1994
5 C 從 這 裡 關 ++units counted: 4
7 C 關 起 來 ++units counted: 3
9 C 巴比 關 起 來 就 沒有 聲音 了 ++units counted: 8
11 C 巴比 一 關 就 沒有 聲音 了 ++units counted: 7
15 C 要 去 找 媽媽 ++units counted: 4
17 C 不 要 嘛 ++units counted: 3
27 C 不 要 ++units counted: 2
29 C 我 要 阿姨 抱 ++units counted: 4
37 C 坐 好 了 ++units counted: 3
52 C 這 個 ++units counted: 2
54 C 開 ++units counted: 1
79 C 這 個 ++units counted: 2
81 C 放大鏡 啦 ++units counted: 2
83 C 要 關 起 來 啦 ++units counted: 5
94 C 拉 不 開 ++units counted: 3
101 C 什麼 ++units counted: 1
108 C 姨嬤 ++units counted: 1
110 C 姨嬤 ++units counted: 1
112 C 巴比 姨嬤 ++units counted: 2
115 C 媽媽 ++units counted: 1
118 C 巴比 要 看 媽媽 ++units counted: 4
121 C 媽媽 在 這 裡 ++units counted: 4
123 C 媽媽 在 笑 ++units counted: 3
125 C 媽媽 ++units counted: 1
127 C 媽媽 回 來 了 ++units counted: 4
130 C 這 個 媽媽 回 來 了 ++units counted: 6
137 C 爸爸 勒 ++units counted: 2
142 C 姨嬤 媽媽 ++units counted: 2
144 C 這 個 是 媽媽 ++units counted: 4
146 C 失蹤 了 ++units counted: 2
148 C 媽媽 ++units counted: 1
150 C 媽媽 ++units counted: 1
153 C 媽媽 ++units counted: 1
155 C 我 要 找 媽媽 啦 ++units counted: 5
162 C 巴比 找 媽媽 ++units counted: 3
165 C 巴比 找 媽媽 ++units counted: 3
169 C 不 要 嘛 ++units counted: 3
171 C 要 關 起 來 ++units counted: 4
173 C 要 這 個 ++units counted: 3
175 C 關 不 起 來 ++units counted: 4
184 C 媽媽 ++units counted: 1
191 C 媽媽 ++units counted: 1
194 C 好 ++units counted: 1
199 C 阿姨 背 去 找 媽媽 ++units counted: 5
222 C 媽媽 ++units counted: 1
224 C 媽媽 ++units counted: 1
228 C 媽媽 呢 ++units counted: 2
230 C 媽媽 ++units counted: 1
232 C 媽媽 ++units counted: 1
236 C 巴比 這 個 小紅帽 睡覺 ++units counted: 5
238 C 巴比 這 個 小紅帽 睡 在 這 裡 ++units counted: 8
247 C 這 個 不 好 ++units counted: 4
248 C 這 個 巴比 把 他 拿 開 來 了 ++units counted: 9
254 C 找 媽媽 ++units counted: 2
256 C 要 去 找 媽媽 ++units counted: 4
258 C 要 找 媽媽 ++units counted: 3
260 C 不 要 嘛 ++units counted: 3
262 C 接 好 了 ++units counted: 3
265 C 不 要 嘛 ++units counted: 3
269 C 阿姨 去 救 他 ++units counted: 4
273 C 嘿 ++units counted: 1
276 C 媽媽 ++units counted: 1
288 C 躲 好 了 ++units counted: 3
291 C 下 去 去 找 媽媽 ++units counted: 5
293 C 下 去 去 找 媽媽 ++units counted: 5
299 C 阿姨 抱 ++units counted: 2
301 C 阿姨 抱 ++units counted: 2
305 C 要 去 找 媽媽 ++units counted: 4
311 C 錄音機 壞 掉 ++units counted: 3
313 C 伯伯 要 拿 去 修理 了 ++units counted: 6
320 C 好 ++units counted: 1
323 C 修 就 &會 會 打 破 掉 呢 ++units counted: 7
324 C 修 都 會 破 掉 呢 ++units counted: 6
328 C 阿姨 把 錄音機 放 好 ++units counted: 5
331 C 要 去 找 媽媽 ++units counted: 4
341 C 阿姨 收 在 那 裡 ++units counted: 5
354 C 小偷 呢 ++units counted: 2
356 C 小偷 ++units counted: 1
363 C (T) Pa ++units counted: 1
367 C 砰 ++units counted: 1
374 C 巴比 就 砰 小偷 ++units counted: 4
377 C 巴比 就 拿 槍 砰 小偷 ++units counted: 6
379 C 對 ++units counted: 1
386 C 要 打 電話 呢 ++units counted: 4
388 C 對 ++units counted: 1
394 C 剪刀 ++units counted: 1
396 C 紅 色 的 剪刀 ++units counted: 4
398 C 紅 色 的 剪刀 ++units counted: 4
414 C 剪刀 可以 剪 &剪 &剪 &剪 ++units counted: 3
419 C 對 ++units counted: 1
425 C 要 找 媽媽 嘛 ++units counted: 4
432 C 巴比 脖子 上 面 癢 啦 ++units counted: 6
434 C 巴比 脖子 上 面 好 癢 ++units counted: 6
442 C 不 好 嘛 ++units counted: 3
451 C 要 去 找 媽媽 ++units counted: 4
453 C 不 要 ++units counted: 2
455 C 不 要 救 爸爸 嘛 ++units counted: 5
461 C 阿姨 去 救 爸爸, 好 不好 ++units counted: 6
470 C 媽媽 開 自己 的 車 ++units counted: 5
475 C 不 要 嘛 ++units counted: 3
Total lines: 100
Total units: 319
**************************************************************
The MLU of c:\ryang\ryang22.dat is: 3.19
**************************************************************
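The summary figures at the end of the listing can be recomputed from the per-utterance counts. The sketch below assumes that every counted utterance carries a trailing `++units counted: N` marker, as in the listing; the regular expression and the function names are illustrative and are not taken from the original program.

```python
import re
from typing import List

# Matches the per-utterance marker printed in the output listing.
UNITS_PATTERN = re.compile(r"\+\+units counted:\s*(\d+)")


def units_per_utterance(output_text: str) -> List[int]:
    """Extract the unit count of every utterance in the listing."""
    return [int(n) for n in UNITS_PATTERN.findall(output_text)]


def mlu_from_output(output_text: str) -> float:
    """Sum the unit counts and divide by the number of counted utterances."""
    counts = units_per_utterance(output_text)
    if not counts:
        raise ValueError("no counted utterances found")
    return sum(counts) / len(counts)


excerpt = """5 C 從 這 裡 關 ++units counted: 4
7 C 關 起 來 ++units counted: 3
9 C 巴比 關 起 來 就 沒有 聲音 了 ++units counted: 8"""
print(units_per_utterance(excerpt))  # [4, 3, 8]
print(mlu_from_output(excerpt))      # 5.0
```

For the full listing, the same sum-and-divide step corresponds to the printed totals: 319 units over 100 lines gives the reported MLU of 3.19.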
NATIONAL TAIWAN UNIVERSITY
GRADUATE INSTITUTE OF LINGUISTICS
TAIPEI 106, TAIWAN, REPUBLIC OF CHINA
(02) 23661381
February 6, 1998
Department of Speech and Hearing Science
The University of Hong Kong
Hong Kong
Dear Ms Poon,
I am writing to submit an abstract for presentation at the First Asia-Pacific Conference on Speech, Language and Hearing. Enclosed are copies of the abstract and a disk. Please let me know if any other information is needed.
Yours Sincerely,
Hintat Cheung, Ph.D.
Associate Professor
Graduate Institute of Linguistics
National Taiwan University
Taipei 106, Taiwan
Phone (886) 02-23660231 ext. 3477
Fax (886) 02-23635358