
Chapter 4 The Profile of Children’s Early Lexicon

4.1. Analysis of children’s vocabulary growth

4.1.1. Vocabulary size in all stages

Children’s vocabulary size is reported in this section. The frequency distribution was examined along two dimensions, POS and age stage. The following table provides the word frequency and proportion of the main POS in the analyzed data: Noun, Verb, ADV, CL, QN, ADJ, CONJ, PREP, WH, and MOD. The remaining parts of speech were grouped into a category called others. Some of them, such as idioms (IDM), sentence final particles (SFP), interjections (INT), and onomatopoeic words (ONM), may not reflect children’s language development. POS with fewer than 10 types were also classified as others, such as determiners (DT), aspect markers (ASP), nominal markers (DE), and adverbs of negation (NEG).

Vocabulary size of each POS

Vocabulary sizes of all parts of speech are illustrated here. Table 4.1 presents the word frequency and proportion of each POS in the language samples and shows that nouns and verbs are the main POS. In terms of word tokens, nouns and verbs together make up about 58% of the database, followed by classifiers (4.8%) and adverbs (4.1%). Quantifiers, WH-words, and modals account for about 5.2% of all tokens. ADJ, CONJ, and PREP each have fewer than 1000 tokens. Others account for about 25% of all tokens, mainly from SFP, INT, DT, ASP, DE, and NEG. In terms of word types, however, 1892 noun types and 1444 verb types contribute about 83% of all word types.

The third main POS is the adverb, with 109 different types. Each of the remaining 7 parts of speech contains fewer than 100 different types, and together these 7 word classes account for about 6.0% of all word types. The POS in the group of others account for 8.1% of all word types.

Table 4.1: The total number of main POS.

POS      Word Token   Percentage   Word Type   Percentage   TTR
Noun     20543        27.1%        1892        47.0%        0.092
Verb     23513        31.0%        1444        35.9%        0.061
ADV      3098         4.1%         109         2.7%         0.035
CL       3616         4.8%         64          1.6%         0.018
QN       1344         1.8%         60          1.5%         0.045
ADJ      554          0.7%         49          1.2%         0.088
CONJ     811          1.1%         28          0.6%         0.035
PREP     790          1.0%         21          0.4%         0.027
WH       1453         1.9%         17          0.4%         0.012
MOD      1121         1.5%         14          0.3%         0.012
Others   18896        24.9%        328         8.1%         0.017
SUM      75739        100%         4026        100%         0.053

The high TTR reveals that vocabulary diversity is higher in nouns (0.092), adjectives (0.088), and verbs (0.061). Given the numbers of types and tokens, this high diversity means that children know many different nouns, verbs, and adjectives. By contrast, the low TTR indicates that diversity is lower in WH-words (0.012), modals (0.012), the POS grouped as others, such as determiners, aspect markers, nominal markers, and adverbs of negation (0.017), and classifiers (0.018). Given the numbers of types and tokens, this low diversity means that these classes contain few types, almost all of which are used frequently in children’s speech.
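The TTR column of Table 4.1 follows directly from the type and token counts. The sketch below recomputes it, assuming TTR is simply the number of types divided by the number of tokens; the helper name `ttr` and the dictionary layout are illustrative and not part of the original analysis.

```python
# (types, tokens) per POS, copied from Table 4.1.
COUNTS = {
    "Noun": (1892, 20543),
    "Verb": (1444, 23513),
    "ADV": (109, 3098),
    "CL": (64, 3616),
    "QN": (60, 1344),
    "ADJ": (49, 554),
    "CONJ": (28, 811),
    "PREP": (21, 790),
    "WH": (17, 1453),
    "MOD": (14, 1121),
    "Others": (328, 18896),
}


def ttr(types: int, tokens: int) -> float:
    """Type-token ratio: distinct word types divided by running tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return types / tokens


# TTR per POS, rounded to three decimals as in the table.
ratios = {pos: round(ttr(t, k), 3) for pos, (t, k) in COUNTS.items()}

# The overall TTR uses the summed types and tokens, not the mean of the ratios.
total_types = sum(t for t, _ in COUNTS.values())
total_tokens = sum(k for _, k in COUNTS.values())
overall = round(ttr(total_types, total_tokens), 3)

for pos, value in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{pos:<7}{value:.3f}")
print(f"{'SUM':<7}{overall:.3f}")
```

Sorting by TTR places nouns, adjectives, and verbs at the top and WH-words and modals at the bottom, which is the ordering the paragraph above describes.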

The vocabulary size gives some indication of the spontaneous speech of the 10 children from 1.5 to 4 years old. First, nouns and verbs are the major types in children’s vocabulary, accounting for more than 80% of word types. Children also use nouns and verbs frequently, accounting for about 58% of tokens. They also frequently used adverbs and classifiers in their speech, about 9% of tokens. In addition, the token proportion of others, 25%, indicates that children frequently used sentence final particles, interjections, determiners, aspect markers, nominal markers, and adverbs of negation.

Word frequency and TTR in each age stage

The frequency distribution in different age stages was also examined and is presented in Table 4.2, including type frequency, token frequency, and TTR in each age group. The data contain 5657 word tokens in 19-24 months, 19726 in 25-30 months, 16669 in 31-36 months, 23431 in 37-42 months, and 10256 in 43-48 months. In terms of word types, 815 types were collected in 19-24 months, 1774 in 25-30 months, 1668 in 31-36 months, 2028 in 37-42 months, and 1138 in 43-48 months. Of the 4026 word types in the language samples, the data in 19-24 months cover 20.2%, the data in 25-30 months and 31-36 months each cover more than 40%, the data in 37-42 months cover more than 50%, and the data in 43-48 months cover 28.3%.

Table 4.2: The total word frequency and TTR in each age stage.

Frequency   19-24 months   25-30 months   31-36 months   37-42 months   43-48 months
Type        815            1774           1668           2028           1138
Type %      20.2%          44.1%          41.4%          50.4%          28.3%
Token       5657           19726          16669          23431          10256
Token %     7.5%           26.0%          22.0%          30.9%          13.5%
TTR         0.144          0.090          0.100          0.087          0.111

As for TTR, the ratio in 19-24 months is the highest, 0.144, which means that vocabulary diversity is comparatively higher in the earliest stage, when children are still acquiring new words. Thus, the possibility of children producing a new

The frequency distribution of each POS and the corresponding TTR in every age group are provided in Table 2, Table 3, and Table 4 in Appendix 3. In general, the total frequency of POS across the age groups also follows an M-shaped distribution. However, some parts of speech, such as ADV, CL, QN, ADJ, CONJ, PREP, and WH, show increasing types and tokens rather than an M-shaped distribution, which may imply a developmental tendency with age.
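Whether a series of stage values is M-shaped can be checked mechanically. The sketch below assumes that "M-shaped" means a rise, a fall, a rise, and a fall across the five age stages, which is an interpretation of the term rather than a definition given in the text; the function names are illustrative. The input values are the type and token totals from Table 4.2.

```python
def direction_pattern(values):
    """Return '+' for a rise and '-' for a fall (or tie) between neighbouring stages."""
    return "".join("+" if later > earlier else "-"
                   for earlier, later in zip(values, values[1:]))


def is_m_shaped(values):
    # Assumption: an M shape over five stages is rise, fall, rise, fall.
    return direction_pattern(values) == "+-+-"


# Totals per age stage (19-24, 25-30, 31-36, 37-42, 43-48 months), from Table 4.2.
TYPES_BY_STAGE = [815, 1774, 1668, 2028, 1138]
TOKENS_BY_STAGE = [5657, 19726, 16669, 23431, 10256]

print(direction_pattern(TYPES_BY_STAGE), is_m_shaped(TYPES_BY_STAGE))
print(direction_pattern(TOKENS_BY_STAGE), is_m_shaped(TOKENS_BY_STAGE))
```

A series that rises at every step, such as a POS whose frequency keeps growing with age, yields the pattern `++++` and is therefore not M-shaped under this assumption.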

Cumulative vocabulary

Cumulative vocabulary of each child and the mean vocabulary size in each age stage are provided in Table 4.3. As seen in Table 4.3, children’s mean cumulative vocabulary is 290 words in 19-24 months, 552 in 25-30 months, 741 in 31-36 months, 868 in 37-42 months, and 882 in 43-48 months. Vocabulary grows faster in the stages before 42 months and slows down in 43-48 months. Figure 4.2 shows that the growth curve is steep before 42 months and flat after 42 months.

Table 4.3: Children’s cumulative vocabulary.

Child   Age        Session   19-24 months   25-30 months   31-36 months   37-42 months   43-48 months
XU      1;7-2;5    10        284            474            -              -              -
YANG    1;7-2;9    11        222            444            623            -              -
WU      1;7-2;10   12        420            811            912            -              -
CHOU    2;1-3;4    16        -              772            1058           1369           -
WANG    2;5-3;4    12        -              472            871            1085           -
JC      2;2-3;5    14        -              444            604            951            -
CHENG   3;1-3;11   11        -              -              -              776            1146
CHW     3;6-4;0    6         -              -              -              352            646
Pan     1;7-3;9    19        233            444            694            916            982
WUYS    2;7-3;10   10        -              -              425            625            752
Mean                         290            552            741            868            882

Note: The symbol “-” means no data was recorded.
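The Mean row of Table 4.3 averages only the children who were recorded in a given stage. The sketch below reproduces that calculation and the stage-to-stage growth, assuming that "-" entries are simply skipped; `None` stands in for "-", and the names `stage_means` and `growth` are illustrative.

```python
from statistics import mean

STAGES = ["19-24", "25-30", "31-36", "37-42", "43-48"]

# Cumulative vocabulary per child from Table 4.3; None marks a stage with no recording.
CUMULATIVE = {
    "XU":    [284, 474, None, None, None],
    "YANG":  [222, 444, 623, None, None],
    "WU":    [420, 811, 912, None, None],
    "CHOU":  [None, 772, 1058, 1369, None],
    "WANG":  [None, 472, 871, 1085, None],
    "JC":    [None, 444, 604, 951, None],
    "CHENG": [None, None, None, 776, 1146],
    "CHW":   [None, None, None, 352, 646],
    "Pan":   [233, 444, 694, 916, 982],
    "WUYS":  [None, None, 425, 625, 752],
}


def stage_means(table):
    """Mean cumulative vocabulary per stage over the children observed in that stage."""
    means = []
    for i in range(len(STAGES)):
        observed = [row[i] for row in table.values() if row[i] is not None]
        means.append(mean(observed))
    return means


means = stage_means(CUMULATIVE)

# Difference between consecutive stage means, as a rough growth rate per six months.
growth = [later - earlier for earlier, later in zip(means, means[1:])]

for stage, value in zip(STAGES, means):
    print(f"{stage} months: {value:.1f}")
print("growth between stages:", [round(g, 1) for g in growth])
```

Because different children contribute to different stages, the differences between stage means mix within-child growth with changes in who was sampled, so they are only a rough indication of the growth rate.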

4.1.2

show

The number of word tokens of each POS was also computed. Table 4.5 gives the average word token frequency of each POS in each age stage. As seen in Table 4.5, the mean total is 297.7 tokens in 19-24 months, 636.3 tokens in 25-30 months, 595.3 tokens in 31-36 months, 808.0 tokens in 37-42 months, and 732.6 tokens in 43-48 months. The frequency distribution is M-shaped: word tokens increase sharply in 25-30 months and 37-42 months, but decrease in 31-36 months and 43-48 months. Because the word frequency is an average value, the M-shaped pattern is unlikely to result from different sample sizes in the stages; instead, it may reflect children’s real performance before they were 4 years old.

Table 4.5: The mean token frequency of POS in each stage.

POS      19-24 months   25-30 months   31-36 months   37-42 months   43-48 months
Noun     93.0           188.6          154.3          207.3          185.6
Verb     97.7           217.1          183.1          237.3          208.6
ADV      5.3            21.3           25.6           41.0           30.6
CL       11.7           24.3           31.9           39.9           42.1
QN       5.4            7.7            13.9           15.4           11.9
ADJ      1.8            5.9            3.9            5.8            4.1
CONJ     1.0            3.8            8.0            12.2           6.9
PREP     1.6            5.9            7.6            9.3            6.8
WH       4.4            10.9           13.4           14.3           17.4
MOD      1.1            9.1            8.0            14.3           12.6
Others   74.7           141.8          145.6          211.1          205.9
Total    297.7          636.3          595.3          808.0          732.6

As seen in the above table, children produced more nouns and verbs than other POS in all stages. They produced a mean of about 93 nouns and 98 verbs in 19-24 months, and more than 150 of each in later stages. The frequency distributions of nouns, verbs, adjectives, and modals are also M-shaped. The decline in token frequency in 31-36 months and 43-48 months may imply that children learn to use other means to express their ideas, such as using more vocabulary from other word classes. This seems to be the case that the number of word

token

fewest number of words are in 19-24 months, but the number increases with aging in all POS. Besides, words in 37-42 months have the highest number of words for most POS.

4.1.3. TTR in all stages

The diversity of children’s vocabulary is reported in this section. If children acquire new types, the TTR will increase; if they acquire no new types and repeatedly use words they already know, the TTR will decrease. Table 4.6 below offers the mean TTR of each POS in each age group.

Table 4.6: The mean TTR of each POS in each age stage.

POS       19-24 months   25-30 months   31-36 months   37-42 months   43-48 months
Noun      0.37           0.33           0.35           0.35           0.33
Verb      0.34           0.31           0.37           0.34           0.34
ADV       0.51           0.41           0.44           0.38           0.42
CL        0.22           0.21           0.15           0.10           0.11
QN        0.27           0.45           0.38           0.39           0.46
ADJ       0.36           0.41           0.48           0.51           0.49
CONJ      0.13           0.47           0.43           0.43           0.69
PREP      0.33           0.40           0.45           0.55           0.59
WH        0.32           0.51           0.45           0.33           0.39
MOD       0.47           0.37           0.37           0.24           0.25
Others    0.24           0.18           0.17           0.14           0.15
All POS   0.37           0.33           0.35           0.35           0.33

The stage with the highest TTR was identified for each POS. Nouns, ADV, CL, MOD, and others have their highest TTR in the earliest age group, 19-24 months. WH-words have the highest TTR in 25-30 months, verbs in 31-36 months, and adjectives in 37-42 months. QN, CONJ, and PREP have the highest TTR in 43-48 months. The TTR results show that content words like nouns and verbs have a mid TTR (0.3 < TTR < 0.4). Function words and abstract content words (WH, CONJ, QN, PREP, ADV, ADJ, MOD) reach a high TTR (> 0.4) in at least one stage. Function words like classifiers and others have a low TTR (< 0.3).

Figure 4.5 is the line graph of the TTR of POS in each age stage. Nouns, verbs, and

other

4.1.4. Development of POS across month stages

The mean proportion of POS in each age stage is reported here. Table 4.7 and Table 4.8 provide the mean proportion of POS types and tokens, respectively, in each age stage. Word lists for all POS except nouns and verbs are in Appendix 4.

Table 4.7: The mean proportion of POS types in each age stage.

Table 4.8: The mean proportion of POS tokens in each age stage.

Nouns: As seen in Table 4.7, children produced 36.6% noun types in 19-24 months, 34.2% in 25-30 months, 30.2% in 31-36 months, 32.1% in 37-42 months, and 30.8% in 43-48 months. The noun proportion decreases gradually until 36 months, increases in the following six months up to 42 months, and decreases again before 48 months. The tokens show a similar trend. The proportion of noun tokens is highest in 19-24 months, 31.5%, and falls below 30% in later stages. The proportion is about 25.5% after 31 months.

Verbs: Children produced 35.8% verb types in 19-24 months, 38.2% in 25-30 months, 38.1% in 31-36 months, 36.1% in 37-42 months, and 35.6% in 43-48 months. The verb proportion increases in 25-30 months and hardly changes in 31-36 months. In later stages, the proportion of verb types decreases gradually. By contrast, the token proportions of verbs before 30 months, 32.8% and 34.2%, are higher than those in the later stages.

Adverbs: Children produced 2.4% adverb types in 19-24 months, 4.0% in 25-30 months, 5.7% in 31-36 months, 6.5% in 37-42 months, and 5.9% in 43-48 months. The proportion of adverbs increases gradually until 42 months and decreases after 42 months. The token proportion of ADV shows a similar trend to the type proportion. Children started at 1.5% ADV tokens; the proportion increased gradually to 5% in 37-42 months and then decreased in 43-48 months.

Classifiers: Children produced 1.5% classifier types in both 19-24 months and 25-30 months, 2.3% in 31-36 months, 1.8% in 37-42 months, and 1.7% in 43-48 months. The proportion is the same in the first two stages, increases in 31-36 months, and decreases gradually in the following two stages. By contrast, the token proportion of CL is higher than the type proportion: 4.5% in 19-24 months, 3.3% in 25-30 months, 5.4% in 31-36 months, 5% in 37-42 months, and 5.5% in 43-48 months. Although children produced few types of CL, they used CL frequently.

Quantifiers: Children produced 1.9% quantifier types in 19-24 months, 1.5% in

The proportion of quantifier types decreases in 25-30 months, but it increases gradually before 42 months. The proportion decreases slightly again in 43-48 months. The proportion of quantifier tokens shows a similar trend, but the highest proportion is in 31-36 months, 2.5%. In the later stages, the proportion decreases to 1.8%.

Adjectives: Children produced 1.0% adjective types in 19-24 months, 1.1% in 25-30 months, 1.0% in 31-36 months, 1.1% in 37-42 months, and 1.2% in 43-48 months. The proportion of adjective types does not change much across the five age stages. The token proportions of adjectives are lower than 1%. The highest token proportion is in 25-30 months, and the remaining age stages do not differ from one another.

Conjunctions: Children produced 0.3% conjunction types in 19-24 months, 0.9% in 25-30 months, 1.7% in 31-36 months, 1.8% in 37-42 months, and 1.8% in 43-48 months. The proportion of conjunctions increases gradually until 42 months and remains unchanged in 43-48 months. The token proportion of conjunctions shows a similar trend. Children produced only 0.2% conjunction tokens in 19-24 months and used conjunctions more frequently with increasing age, up to 1.4% in 37-42 months. The token proportion of conjunctions then decreases in 43-48 months.

Prepositions: Children produced 0.7% preposition types in 19-24 months, 1.4% in 25-30 months, 1.6% in 31-36 months, 1.9% in 37-42 months, and 1.7% in 43-48 months. The proportion of prepositions is smallest in the first stage and increases stage by stage until 37-42 months. After 42 months, the proportion decreases again. The token proportion of prepositions shows a similar trend. Children produced only 0.5% preposition tokens in 19-24 months and used prepositions more frequently with increasing age, up to 1.4% in 37-42 months. The token proportion of prepositions then decreases in 43-48 months.

WH-words: Children produced 1.3% WH-word types in 19-24 months, 1.8% in 25-30 months, 2.0% in 31-36 months, 1.8% in 37-42 months, and 2.1% in 43-48 months. The proportion of WH-word types is smallest in the first age stage, 19-24 months, and increases in the next two stages up to 36 months. After 36 months, the proportion drops slightly in 37-42 months but increases again in 43-48 months, where it is highest. The token proportion of WH-words shows a similar trend. It is lowest in 19-24 months, 1.3%, and increases to 2.4% in 31-36 months. It then decreases in 37-42 months and increases again in 43-48 months.

Modals: Children produced 0.7% modal types in 19-24 months, 1.2% in 25-30 months, and 1.3% in each of 31-36 months, 37-42 months, and 43-48 months. The proportion of modals is smallest in the first age stage, 19-24 months, and increases in the following two stages up to 36 months. After 36 months, the proportion remains unchanged. In terms of modal tokens, children produced few modals in 19-24 months, 0.3%. In later stages, the proportion increases to 1.3% and then to 1.9% in 37-42 months. It decreases slightly to 1.8% in 43-48 months.

Others: Other POS types account for 17.8% in 19-24 months, 14.1% in 25-30 months, 13.9% in 31-36 months, 13.1% in 37-42 months, and 15.5% in 43-48 months. The proportion of other POS types is highest in the first age stage, 19-24 months, and decreases in the next three stages up to 42 months. It increases again in 43-48 months, to 15.5%. The proportion of tokens shows a different trend. It is 25% in 19-24 months and falls to 23.1% in 25-30 months. After 30 months, the token proportion increases gradually, reaching its highest value, 27.4%, in 43-48 months.

Therefore, proportions of word frequency can be used to present the developmental trend of children’s early Mandarin vocabulary. In the early stage, 19-24 months, nouns and verbs make up the highest proportion of children’s vocabulary. After 30 months, the proportion of nouns and verbs decreases. The 8 POS (ADV, CL, QN, ADJ, CONJ, PREP, WH, and MOD) make up a small proportion, but it increases after children are 2 years old. The other POS behave differently: the proportion of types decreases after 24 months, whereas the proportion of tokens first decreases in 25-30 months and then increases after 30 months. The POS distribution indicates that children acquire content words first and use them frequently at early ages, whereas they acquire function words later and use them more as they grow older.

4.1.5. N/V ratio in all stages

Noun/verb ratio (NVR) in each age stage is reported in this section. Table 4.9 provides the means of the four types of noun/verb ratio in each age stage. As shown in Table 4.9, the mean NVR1 is greatest in 19-24 months (M = 1.06, SD = 0.31), indicating a noun bias. NVR1 in the following stages is below 1, which implies a verb bias. The standard deviation shows that the within-group difference is biggest, 0.31, in 19-24 months and fluctuates in the later stages. It seems that children’s individual differences are not stable.

NVR2 indicates a noun bias in all stages. The values are 1.4 in 19-24 months, 1.25 in 25-30 months, 1.02 in 31-36 months, 1.19 in 37-42 months, and 1.16 in 43-48 months. The lowest value, 1.02 in 31-36 months, reveals nearly no bias. The standard deviation shows that the within-group difference is biggest, 0.45, in 19-24 months. The difference decreases gradually to 0.27 in 37-42 months and increases again to 0.35 in 43-48 months. This suggests that children’s individual differences generally get smaller with increasing age.

NVR3 indicates a verb bias throughout the five age stages. The values are 0.84 in 19-24 months, 0.68 in 25-30 months, 0.59 in 31-36 months, 0.67 in 37-42 months, and 0.64 in 43-48 months. The NVR3 in 19-24 months is the highest, indicating a weak verb bias, while the lowest value, in 31-36 months, indicates a strong verb bias. The standard deviation shows that the within-group difference is biggest, 0.30, in 19-24 months. In the later stages, the individual differences fluctuate.

NVR4 shows a similar tendency to NVR1. It indicates a noun bias in 19-24 months and a verb bias in later age stages. The standard deviation shows that the within-group difference is biggest, 0.40, in 19-24 months and decreases gradually to 0.22 in 37-42 months. The difference increases again to 0.34 in 43-48 months. It seems that children’s individual differences generally get smaller over time.

Table 4.9: Children’s NVR.

              19-24 months   25-30 months   31-36 months   37-42 months   43-48 months
NVR1   Mean   1.06           0.92           0.82           0.91           0.89
       SD     0.31           0.22           0.27           0.21           0.26
NVR2   Mean   1.40           1.25           1.02           1.19           1.16
       SD     0.45           0.42           0.31           0.27           0.35
NVR3   Mean   0.84           0.68           0.59           0.67           0.64
       SD     0.30           0.19           0.23           0.16           0.25
NVR4   Mean   1.10           0.92           0.74           0.88           0.83
       SD     0.40           0.33           0.27           0.22           0.34
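The reading of Table 4.9 rests on a simple rule: a ratio above 1 points to a noun bias and a ratio below 1 to a verb bias. The sketch below applies that rule to the mean values in the table. The tolerance band of 0.05 around 1, used to label a ratio as showing nearly no bias, is a hypothetical choice made for illustration and does not come from the study; the function names are likewise illustrative, and the definitions of the four NVR measures are not reproduced here.

```python
STAGES = ["19-24", "25-30", "31-36", "37-42", "43-48"]

# Mean noun/verb ratios per age stage, copied from Table 4.9.
NVR_MEANS = {
    "NVR1": [1.06, 0.92, 0.82, 0.91, 0.89],
    "NVR2": [1.40, 1.25, 1.02, 1.19, 1.16],
    "NVR3": [0.84, 0.68, 0.59, 0.67, 0.64],
    "NVR4": [1.10, 0.92, 0.74, 0.88, 0.83],
}


def bias(ratio, tolerance=0.05):
    """Label a noun/verb ratio; `tolerance` is a hypothetical band around 1."""
    if ratio > 1 + tolerance:
        return "noun"
    if ratio < 1 - tolerance:
        return "verb"
    return "none"


def lowest_stage(values):
    """Stage label at which the ratio is smallest, i.e. the turning point."""
    return STAGES[values.index(min(values))]


for name, values in NVR_MEANS.items():
    labels = [bias(v) for v in values]
    print(name, labels, "lowest in", lowest_stage(values), "months")
```

Under these assumptions, NVR1 and NVR4 switch from a noun bias to a verb bias after 19-24 months, NVR2 stays on the noun side except for 31-36 months, NVR3 stays on the verb side, and all four ratios reach their minimum in 31-36 months.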

Figure 4.6 is the line graph of all NVRs across the five age stages. It shows that the four types of NVR have a similar developmental trend. The NVRs are highest in the earliest stage, and 31-36 months is the turning point: the NVRs decrease sharply up to 31-36 months and rise afterwards. As seen in the figure, NVR1 and NVR4 are extremely close, which implies that using the broad or the strict definition of nouns and verbs makes little difference. A weak noun bias is found in 19-24 months, and a verb bias is found in later stages.

D

To determine whether age affects lexical diversity, a series of one-way ANOVAs was conducted on D of all words, D of nouns, and D of verbs, with age group as the independent variable. The ANOVA on D of all words shows a significant difference between age groups [F(4,116) = 7.905, p < .05]. The Scheffe post hoc test reveals that the mean D value of all words in 19-24 months is significantly lower than that in the other age groups (p < .05). There is no significant difference among the remaining four age groups. As for D values of nouns, no significant difference between age groups was found.

As for D values of broad verbs, the test of homogeneity of variances is significant [F(4,116) = 3.115, p < .05], which means that equal variances cannot be assumed, so the Brown-Forsythe test was used. The result shows a significant difference between age groups [F(4, 110.736) = 12.115, p < .001]. A Dunnett’s T3 post hoc test indicates that the mean D value of broad verbs in 19-24 months is significantly lower than that in the other age groups (p < .05). The mean D value in 25-30 months is significantly lower than that in 37-42 months (p < .05). No significant difference was found between the other groups.

The ANOVA on D of strict verbs reveals a significant difference between age groups [F(4,116) = 12.017, p < .05]. The Scheffe post hoc test shows that the mean D values of strict verbs in 19-24 months and 25-30 months are significantly lower than those in the other groups. There is no significant difference between 19-24 months and 25-30 months, and no significant difference among the remaining age groups either.

To sum up, D values of all words, of nouns, and of verbs all indicate the developmental trajectory of children’s increasing lexical diversity over time. A statistically significant age effect on lexical diversity is found in D values of all words and of verbs. D values in 19-24 months and 25-30 months are significantly lower than D values in later stages. No significant difference is found between other stages.
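The F values reported in this section come from one-way ANOVAs with age group as the factor. The sketch below shows how such an F statistic and its degrees of freedom are obtained from between-group and within-group sums of squares. The D values in it are invented for illustration and are not the study's data, and the sketch covers neither the p-value nor the Scheffe, Brown-Forsythe, and Dunnett's T3 procedures. Under the standard formula, the reported F(4,116) corresponds to five age groups and 121 D values in total.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical D values for five age groups (illustration only, not the study's data).
hypothetical_d = [
    [20.1, 24.3, 22.8],   # 19-24 months
    [35.2, 31.9, 38.4],   # 25-30 months
    [40.7, 44.1, 39.5],   # 31-36 months
    [45.0, 41.8, 47.3],   # 37-42 months
    [43.6, 46.9, 42.2],   # 43-48 months
]

f_stat, df_between, df_within = one_way_anova(hypothetical_d)
print(f"F({df_between},{df_within}) = {f_stat:.3f}")
```

A large F means that the group means differ by more than the spread within the groups would suggest; whether it is significant depends on the F distribution with the two degrees of freedom shown.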

4.2. Analysis of children’s vocabulary organization

4.2.1. Distribution of semantic categories and conceptual levels

The distribution of semantic categories and conceptual levels of children’s nouns is illustrated in the following sections. Unidentifiable and incomplete words (64 types, 115 tokens) were excluded from the analyses. Table 4.11 provides the total numbers and proportions of types and tokens in each semantic category.

As seen in Table 4.11, the top five categories of noun types are people, tools, food, animals, and locations. These five categories account for 61.4% of all noun types in the samples. By contrast, the categories with fewer than 20 types are pronouns, furniture, numerals, and shapes, which account for only 3.3% of noun types. The results may indicate that people, tools, food, animals, and locations appear most frequently in children’s early life, so children have more chances to acquire nouns of these categories. Pronouns, furniture, numerals, and shapes, on the other hand, are categories that contain few types in the first place, so children acquire fewer words of these categories than of categories with many types.

The number of tokens was calculated to show how frequently children used the words in their speech. As shown in Table 4.11, the five most frequently used categories of nouns are pronouns, people, animals, spatial words, and tools. These five categories account for 74.8% of all noun tokens. Pronouns are the most frequently used category in children’s speech, accounting for 29.5%. Although there are a few pronoun

Table 4.11: The total numbers of nouns in semantic categories.

Category         Type   Type %   Token   Token %
People           386    21.5%    4942    24.2%
Spatial          90     5.0%     1367    6.7%
Pronouns         18     1.0%     6020    29.5%
Tools            239    13.3%    1086    5.3%
Vehicles         120    6.7%     863     4.2%
Locations        147    8.2%     644     3.2%
Natural          62     3.5%     351     1.7%
Clothing         54     3.0%     285     1.4%
Shapes           11     0.6%     24      0.1%
Body parts       71     4.0%     663     3.2%
Abstract nouns   129    7.2%     526     2.6%
Toys             82     4.6%     462     2.3%
Food             172    9.6%     918     4.5%
Animals          158    8.8%     1871    9.2%
Furniture        17     0.9%     106     0.5%
Numerals         14     0.8%     19      0.1%
Colors           25     1.4%     281     1.4%
SUM              1795   100%     20428   100%
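The top-five rankings and their combined shares can be recomputed from the counts in Table 4.11. The sketch below does this for both types and tokens; the function name `top_share` and the data layout are illustrative, not part of the original analysis.

```python
# (types, tokens) per semantic category, copied from Table 4.11.
NOUN_CATEGORIES = {
    "People": (386, 4942),
    "Spatial": (90, 1367),
    "Pronouns": (18, 6020),
    "Tools": (239, 1086),
    "Vehicles": (120, 863),
    "Locations": (147, 644),
    "Natural": (62, 351),
    "Clothing": (54, 285),
    "Shapes": (11, 24),
    "Body parts": (71, 663),
    "Abstract nouns": (129, 526),
    "Toys": (82, 462),
    "Food": (172, 918),
    "Animals": (158, 1871),
    "Furniture": (17, 106),
    "Numerals": (14, 19),
    "Colors": (25, 281),
}

TYPES, TOKENS = 0, 1  # positions inside each (types, tokens) pair


def top_share(index, n=5):
    """Return the n largest categories by the chosen count and their combined share in %."""
    ranked = sorted(NOUN_CATEGORIES.items(),
                    key=lambda item: item[1][index], reverse=True)
    total = sum(counts[index] for counts in NOUN_CATEGORIES.values())
    top = ranked[:n]
    share = 100 * sum(counts[index] for _, counts in top) / total
    return [name for name, _ in top], round(share, 1)


top_types, type_share = top_share(TYPES)
top_tokens, token_share = top_share(TOKENS)
print(top_types, type_share)
print(top_tokens, token_share)
```

Pronouns illustrate why the two rankings differ: with only 18 types they rank near the bottom by type, yet they are the most frequent category by token.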

Spatial words like 這裡 (zhè lǐ, here, 533 tokens) and 這邊 (zhè biān, here, 205 tokens) have a high frequency in children’s speech, resulting in the high frequency of spatial words. 這裡 (zhè lǐ, here) and 這邊 (zhè biān, here) can be used to replace body parts or locations, as illustrated in the following examples. 這邊 (zhè biān, here)
