Academic year: 2021


Chapter Two Literature Review

This chapter reviews previous studies from two perspectives. The first, in section 2.1, covers cross-linguistic studies on the acquisition of voicing or aspiration in Cantonese, Thai, and English. Most of these studies asked when the child acquires the contrast and at which place of articulation the child first produces contrastive phonemes. Accordingly, these studies concluded once statistical analyses revealed the emergence of the voicing contrast. Some of them remarked that, although the children had produced a temporal contrast between homorganic stop phonemes, they did not sound like adults.

The second perspective, in section 2.2, reviews previous research on gradient development in child language acquisition. These views do not treat the time at which the child first produces a contrast as the completion of development. Rather, development ends when the child has mature phonetic skills for realizing abstract phonological units in a way that approaches the adult model.

2.1 Cross-linguistic studies on the acquisition of voicing

2.1.1 Cantonese

Clumeck et al. (1981) investigated the acquisition of the phonological contrast of aspiration in Cantonese by conducting a longitudinal and a cross-sectional study. The longitudinal study reported that the subject, aged 1;7 at the beginning of the study, did not contrast aspirated and unaspirated stops until he was about 2;2. The cross-sectional study showed that the four children under investigation, aged 2;5 to 4;0, had acquired the aspiration contrast, but their mean VOT values were not adult-like.

In Cantonese, the feature of aspiration distinguishes three pairs of stop consonants at three places of articulation: the labial stops /p/ and /pʰ/, the dental stops /t/ and /tʰ/, and the velar stops /k/ and /kʰ/. The first member of each pair is a voiceless unaspirated stop and the second a voiceless aspirated stop. To define the VOT values of these stop consonants, Clumeck et al. recruited eight adult native speakers of Cantonese, four men and four women, to read a word list on index cards, and measured their VOT values. The values for unaspirated stops almost all fell within 30 milliseconds (ms), apart from occasional tokens, and the values for aspirated stops fell between 60 and 100 ms. No instance of voicing occurred, nor did the VOT values of aspirated and unaspirated homorganic stops overlap.

Having established the VOT ranges of the stop consonants, Clumeck et al. studied the acquisition of aspiration by comparing the VOT values for aspirated and unaspirated stops collected from children’s spontaneous speech while they played with toys and picture books. The speech of the subject in the longitudinal study was recorded in 15 separate sessions over eleven months, while each of the four subjects in the cross-sectional study was recorded only once. According to the statistical analysis, all four children in the cross-sectional study, aged 2;5 to 4;0, had acquired the aspiration contrast, but their VOT values differed from adults’ values because of considerable overlap and more variable VOT ranges. The longitudinal study identified three phases in the acquisition of aspiration: 1) there was no contrast between aspirated and unaspirated stops; 2) the contrast appeared between the dental homorganic stops; and finally 3) the aspiration contrast appeared at all three places of articulation. The child did not distinguish aspiration between homorganic stops when he was between 1;7 and 2;4. His VOT values spread from the long lead region (VOT < 0 ms) to the long lag region (VOT > 30 ms). Stops with voicing lead, which usually occurred at the labial place of articulation, made up approximately 40% of all tokens at 1;7, gradually decreased to 2.4% at 1;10, and were reduced to only one or two instances in the remaining sessions. Long lag stops increased from 0% at 1;7 to about 60% at 2;0, and dropped to 20% between 2;0 and 2;4. When the child turned 2;4, he first showed the aspiration contrast at the dental place of articulation. At 2;6 he also showed the aspiration contrast at the labial and velar places of articulation. In addition, he produced nearly 71% of his tokens in the long lag range at 2;6, up from 20% at 2;4. Although the child had acquired the aspiration contrast, his VOT values still overlapped greatly between homorganic stops and did not yet resemble adults’ values.
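The percentages reported for each session rest on sorting every token into a VOT region. The following is a minimal sketch of that bookkeeping, assuming the cut-offs quoted above (lead below 0 ms, long lag above 30 ms). The function names, the treatment of values of exactly 0 or 30 ms, and the token values are illustrative assumptions, not data or procedures from Clumeck et al.

```python
from collections import Counter

REGIONS = ("lead", "short lag", "long lag")

def vot_region(vot_ms):
    """Assign one VOT value (in ms) to a region using the cut-offs quoted in the text."""
    if vot_ms < 0:
        return "lead"
    if vot_ms > 30:
        return "long lag"
    # Assumption: values of exactly 0 ms or 30 ms are counted as short lag.
    return "short lag"

def region_shares(tokens_ms):
    """Return the percentage of tokens falling in each VOT region."""
    if not tokens_ms:
        raise ValueError("at least one token is required")
    counts = Counter(vot_region(v) for v in tokens_ms)
    return {region: 100.0 * counts[region] / len(tokens_ms) for region in REGIONS}

# Hypothetical tokens from one recording session (ms); not taken from the study.
session = [-85, -40, 5, 12, 18, 25, 45, 70, 10, 22]
shares = region_shares(session)
print(shares)
```

Tabulating these shares session by session would give the kind of rise and fall in lead and long lag tokens that the longitudinal study describes.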

Clumeck et al. provided a thorough account of the acquisition of the aspiration contrast. However, the study has some limitations. First, the children’s speech tokens were elicited differently from the adults’. The children’s tokens came from spontaneous speech, while the adults’ came from reading isolated word lists. In continuous speech, sounds can be influenced by neighboring phonetic contexts (Cho and Ladefoged 1999, Davis 1995, Smith and Kenney 1999). Thus the aspiration length of the children’s stop consonants may have been affected to some extent. Furthermore, the authors noted that although the children had all acquired aspiration, their VOT values were not adult-like. Despite this evident difference between adults’ and children’s VOT values, the authors did not offer any explanation for it. This leaves room for further discussion.

2.1.2 Thai

Gandour et al. (1986) investigated three age groups of children to observe their acquisition of word-initial stop consonants. The authors compared the children’s mean VOT values with adults’ and found that one of the three groups, the seven-year-olds, had almost completed the acquisition of the voicing contrast, while the other two groups, the three-year-olds and five-year-olds, had not acquired the voicing contrast of stops at the labial and alveolar places of articulation.

The stop consonants in Thai are produced at three places of articulation: labial, alveolar and velar. Both labial and alveolar stops are distinguished by voicing and aspiration, but velar stops are distinguished solely by aspiration. Thus the labial and alveolar places each have three phonemes: voiced unaspirated /b/ and /d/, voiceless unaspirated /p/ and /t/, and voiceless aspirated /pʰ/ and /tʰ/. The velar place has only two stops: voiceless unaspirated /k/ and voiceless aspirated /kʰ/. These eight sounds were the targets for the investigation of the acquisition of the voicing contrast in this experiment.

To obtain children’s productions of stop consonants for analysis, Gandour et al. chose a total of eight words, each beginning with one of the above target stops, as the speech stimuli. Three words formed a near-minimal triplet for the labial stops and three for the alveolar stops, and two words formed a minimal pair for the velar stops. In addition, the falling tone was chosen for the labial triplet, the mid tone for the alveolar triplet, and the low tone for the velar minimal pair.

Gandour et al. obtained the child subjects’ speech from their interaction with the experimenter, while the adult subjects were recorded as they read the target words from index cards. The subjects’ productions were subsequently analyzed with a sound spectrograph to measure the VOT values of the word-initial stops. The VOT values were then analyzed statistically.

The results showed that adults pronounced the eight word-initial stops with distinct and non-overlapping VOT values. The same result was observed in the seven-year-old children’s speech, although there was occasional overlap in VOT at all three places of articulation. Gandour et al. pointed out that the overlap was more obvious in the productions of the three- and five-year-old children, especially between the VOT values of the voiced and the voiceless unaspirated stops. The three-year-olds had generally acquired the voicing contrast of stops, except for the distinctions between /b/ and /p/ and between /d/ and /t/. The five-year-olds could pronounce all stops contrastively, apart from /b/ and /d/, which were not adult-like.

Gandour et al. adopted a quantitative approach to investigate the development of the voicing contrast in Thai. However, they had only seven subjects in each age group, which falls short of the minimum number of subjects required for a quantitative analysis. Moreover, Gandour et al. did not control the way the subjects produced the speech stimuli. The children’s speech data were collected from spontaneous interaction with the experimenter, but the adults’ data came from reading index cards. Since spontaneous speech is often subject to the influence of surrounding sounds and of various interactional strategies, the pronunciation of the words tends to be affected to some extent. By contrast, reading words in isolation may yield clearer pronunciation (Davis 1995, Cho and Ladefoged 1999). Had the subjects’ data been recorded under the same conditions, the VOT values might have differed. The authors paid much attention to comparing mean VOT values but did not comment on the distribution of VOT across the three places of articulation and across the age groups. The distribution of VOT seems to change with age and place, which deserves some comment from the authors.

2.1.3 English

Macken and Barton (1979) reported a longitudinal study of the acquisition of voicing in word-initial stop consonants in English. They recorded the utterances of four English-speaking children from the time they were aged 1;4 to 1;7. After eight months of observation, they identified three sequential stages in the acquisition of voicing: I. the children showed no voicing contrast; II. the children contrasted voicing between cognate stops, but their VOT values fell predominantly in the short lag region; III. the children produced a voicing contrast with VOT values similar to adults’.

The authors collected the speech data from four monolingual children, who were recorded at two-week intervals over an eight-month period while playing with picture books, puzzles and toys. The data were then analyzed to obtain VOT values, and the mean VOT values were submitted to statistical tests of significance. If the means for a voiced and voiceless homorganic pair differed significantly, the child was said to have acquired the voicing contrast. In addition to charting the children’s acquisition process, Macken and Barton compared the children’s VOT values with adults’ values reported in several previous studies.
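The acquisition criterion described above can be illustrated with a small significance test on two sets of tokens. The text does not state which test Macken and Barton used, so the sketch below substitutes a two-sided permutation test on the difference between means, chosen only because it needs nothing beyond the Python standard library. The function name, the number of resamples, the 0.05 threshold, and the token values are all assumptions made for illustration.

```python
import random
from statistics import mean

def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference between two group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical VOT tokens (ms) for one homorganic pair; not data from the study.
voiced_tokens = [5, 10, 12, 8, 15, 9, 11, 7]
voiceless_tokens = [60, 75, 82, 68, 90, 71, 66, 79]

p_separated = permutation_p_value(voiced_tokens, voiceless_tokens)
contrast_acquired = p_separated < 0.05  # assumed significance threshold
print(p_separated, contrast_acquired)
```

A criterion of this kind only says that the two means differ. It says nothing about whether the tokens overlap or whether the values fall where adult listeners expect them, which is the gap the later sections take up.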

The comparison between children’s and adults’ VOT values revealed a general three-stage development in children’s acquisition of voicing. In stage I, most of the children’s tokens for both voiced and voiceless stops fell within the short lag range (VOT < 30 ms) and occasionally within the voicing lead range (VOT < 0 ms). The mean VOT values of the cognate stops were not significantly different from each other. This type of data was also commonly found in Cantonese-, English- and Spanish-speaking children before they were two years old (Clumeck et al. 1981, Eilers et al. 1984, Macken and Barton 1980). In stage II, the mean VOT values for the cognates showed a contrast, which often emerged after children turned two years old (Clumeck et al. 1981, Davis 1995, Eilers et al. 1984, Macken et al. 1980, Snow 1997, Xu 1990). However, the contrast was not adult-like in that the VOT values for both categories mainly fell within adults’ perceptual boundary for voiced stops. It was therefore difficult for adult listeners to tell whether the child had distinguished voicing, because both voiced and voiceless stops sounded voiced to them. Children’s VOT values in stage III resembled adult values more closely than in stage II but were not yet adult-like. The VOT means for voiced stops usually fell within the short lag region and the means for voiceless stops in the long lag region. However, the VOT values of the homorganic pairs often overlapped. In addition, VOT means for voiceless stops were often longer and more variable than adults’. Thus the children’s means did not resemble adults’, even though the widely spread voiceless VOT values would later gradually shorten and tokens with extreme VOT values would become fewer (Gandour et al. 1986, Koenig 2001, Macken et al. 1980). The authors also observed that the transition from stage II to stage III was abrupt and that the age of transition varied among individuals.

Macken and Barton offered a three-stage developmental sequence for the acquisition of the voicing contrast, with detailed descriptions of VOT variation in each stage. However, the comparisons between children and adults could only be made through visual inspection of the VOT values, not through inferential statistics, because the adults’ VOT values were taken from previous investigations. By contrast, Davis (1995) statistically compared the VOT values of English velars between four groups of children aged two to six and adults. The ANOVA indicated no significant effect of age × consonant type. Such statistical results may imply, more specifically than visual inspection, that once the voicing contrast is acquired, children’s mean VOT values are not significantly different from adults’. Moreover, it was reported that children’s VOT values in stage III were not yet as stable as adults’, owing to a more variable distribution and overlap between homorganic pairs. The authors presumed that VOT duration would become more consistent later, but they did not specify when this would happen. Further investigation could observe when and how children regulate VOT duration as they grow older.
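Variability and overlap, the two properties said to separate stage III children from adults, can each be reduced to a simple number. The sketch below uses the coefficient of variation for variability and the width of the shared VOT interval for overlap. Both measures, the function names, and the token values are assumptions chosen for illustration; none of the reviewed studies is claimed to have used them.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean.

    Only meaningful for positive means, so it suits lag VOT but not voicing lead.
    """
    return stdev(values) / mean(values)

def range_overlap_ms(unaspirated, aspirated):
    """Width (ms) of the interval shared by the two VOT ranges; 0 if they are disjoint."""
    low = max(min(unaspirated), min(aspirated))
    high = min(max(unaspirated), max(aspirated))
    return max(0, high - low)

# Hypothetical VOT tokens (ms); not data from any study reviewed here.
adult_t, adult_th = [10, 15, 20, 12, 18], [65, 80, 75, 90, 70]
child_t, child_th = [5, 20, 35, 50, 15], [40, 95, 60, 130, 75]

print(range_overlap_ms(adult_t, adult_th), range_overlap_ms(child_t, child_th))
print(coefficient_of_variation(adult_th), coefficient_of_variation(child_th))
```

With these made-up tokens the adult ranges are disjoint while the child ranges share a 10 ms band, and the child’s aspirated tokens are relatively more variable. This is the kind of pattern that a comparison of means alone would not reveal.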

2.1.4 Summary of cross-linguistic studies

The above cross-linguistic studies, whether longitudinal or cross-sectional, aimed to establish the age of acquisition and the order of acquisition by place of articulation, using the mean VOT values of cognate stops. Having answered these two questions, the researchers left open the observation that older children did not produce the stop consonants in an adult-like manner. Even where children aged three to seven were investigated, their speech production was examined only to support the claim that the voicing contrast had been acquired at a younger age.

Some other experiments have found that the VOT values of stop phonemes uttered by older children may continue to fluctuate, sometimes to a statistically significant degree (Hewlett and Waters 2004, Koenig 2001, Smith and Kenney 1999, Snow 1997, Studdert-Kennedy and Goodell 1995). The temporal features of stop consonants could therefore still be developing after the child shows a contrast. Along this line, the results of the above cross-linguistic studies may record a single point in time rather than the onset and completion of the development of the aspiration contrast. As Macken and Barton’s (1979) three stages of voicing acquisition show, children in stage II had acquired a covert contrast, yet their VOT values in stage III still did not resemble adults’ values. Although the phonological contrast was established earlier, in stage II, the phonetic skills for producing VOT seem to be still developing in stage III (Scobbie et al. 2000). The covert contrast may perhaps be the result of extremely long VOT values, which appeared to influence the statistical comparison between homorganic plosives. Since the phonetic values continue to change, the imperceptible voicing contrast displayed around age two is too immature to be regarded as the end of acquisition. After all, the age at which children show a phonemic contrast in statistical analysis does not necessarily correspond to the time at which they master the appropriate pronunciation of the target sounds of the language. It is therefore assumed that follow-up investigation may provide some clues to this time lag and so draw a fuller picture of the development of aspiration or voicing.


2.2 Gradient development in acquisition

According to the cross-linguistic longitudinal studies reviewed above, children’s pronunciation gradually approaches the adult form over time, with noticeable gradient changes along the way (Hewlett and Waters 2004, Smith and Kenney 1999, Scobbie et al. 2000). For example, Clumeck and his colleagues (1981) described how their Cantonese-speaking subject reduced instances of voicing lead from 40% of all tokens to 2.4% over a period of about three months. They also reported that the child’s speech contained no long lag stops at the beginning of the study; these then increased to 60% and dropped to 20% within a seven-month period. This VOT variation shows that gradient change in speech development is not entirely smooth. Rather, the child’s VOT values in the production of stop consonants may remain constant for some time, decrease, or undergo various other changes. These varying phonetic values seem to reflect the child’s attempts to produce target sounds according to the adult norm. VOT variation could thus be evidence that children are still learning to master phonetic skills that have phonological status in their languages.

Although gradient change in VOT values can occur before a child acquires the aspiration contrast, VOT fluctuation is expected to continue even after acquisition, since previous studies have reported that children’s productions did not quite resemble adults’ (Clumeck et al. 1981, Gandour et al. 1986, Macken and Barton 1979, 1980). VOT variation appears to receive no further attention once the child has produced contrastive stops. If an investigation of the acquisition of voicing ends as soon as the difference in mean VOT between cognate stops becomes statistically significant, the research can provide only fragmentary results along the developmental continuum of the phonological voicing contrast (Hewlett and Waters 2004, Koenig 2001, Scobbie et al. 2000, Smith and Kenney 1999). Even in longitudinal research, the time span of an investigation can be short, i.e. less than one year, so that the peculiarities of children’s speech in VOT duration and variability are usually left unexamined. It is assumed that the same pattern of VOT variation would be found at older ages if a longitudinal study were conducted over a longer period. Given the continuing fluctuation in VOT, a child’s statistically contrastive phones do not necessarily mean that the child has mastered the pronunciation required by the speech community. Figure 1 (adapted from the model in Scobbie et al. 2000) displays the developmental model by which a child masters the adult language. As shown in Figure 1, young learners at first do not distinguish the two sounds. When a child acquires the aspiration contrast between aspirated and unaspirated stops, the child’s VOT values would match those of adults. After the contrast has been attained, the child’s VOT values for the homorganic pairs go on fluctuating for a period of time before stabilizing, as indicated by “immature contrast”. The younger the child is, the more the VOT diverges from the adult norm. The maximum differentiation of one token from the other is followed by a period of gradual approximation to the adult target, labelled “mature contrast”. Although children’s output does not entirely match the adult form, their production should be very close to adults’, and as a result the aspiration contrast in children’s speech is categorized as mature.

[Figure 1 plot: VOT values against time, showing the child value relative to the adult target value, with phases labelled immature contrast and mature contrast.]

Figure 1 A possible model of the acquisition and development of aspiration contrast.

To collect data from older children, Smith and Kenney (1999) followed four children for periods ranging from four to six years to study the development of temporal properties of speech production in English. Their results showed that temporal duration and variability tend to decrease consistently with age. The greatest decrease occurred between 18 months and 3-3.5 years of age, followed by an increase between four and six years of age.

The authors’ investigation primarily concerned the general temporal development that might occur in children’s speech as they grew older. The target words were two-syllable utterances in which vowels alternated with singleton voiceless stops or voiceless/voiced fricatives, elicited through a picture-naming procedure. Of the four subjects, three were 18-20 months old and one was four years old at the beginning of the study. All four children were recorded at intervals of approximately seven to ten months over a span of four to six years.

Regarding VOT for word-initial voiceless /p/, Smith and Kenney found that the VOT changes across the first six sessions of the three younger subjects were not statistically significant, but the three showed a similar pattern. The three children were first observed to develop long lag /p/ between 2;2 and 3;0 for N, between 1;7 and 2;2 for S, and between 1;7 and 3;0 for A. This period is broadly comparable to the ages at which the voicing contrast appeared in previous literature (Clumeck et al. 1981, Davis 1995, Gandour et al. 1986, Macken and Barton 1979, 1980). These previous studies claimed that the voicing contrast was attained primarily by lengthening the VOT values of the voiceless aspirated stops so as to separate the values of cognate stops. Afterwards, each of the three children showed noticeable VOT variation at approximately three and five years of age. At 3;8 (A), 4;6 (N), and between 3;0 and 3;8 (S), they produced VOT values of 35 to 45 milliseconds, shorter than the values they produced at older ages and than the averages of two adult control subjects. Later, A at 4;6 and N and S at 5;2 produced longer VOT values, lengthening them by 15 to 25 milliseconds. Luinge and her colleagues (2006) likewise found that children in their third (36-47 months) and fifth (60-72 months) age groups were less scalable when the authors attempted to construct a scale ordering the milestones of children’s language development. A possible reason for the more peculiar speech behavior of these two age groups is that children acquire language in the first years of their lives and then carry on fine-tuning in the following years. As this tuning process regulates more detailed articulatory gestures, it is better observed through measurement. For the development of aspiration, VOT may be the most appropriate measure.

Smith and Kenney provided detailed information about the temporal development of a small number of children across four to six years of observation. As the results showed, the children’s VOT for voiceless stops collectively tended to decrease in duration but displayed the same pattern of changes and modifications across the first six sessions. In general, the children first lengthened their VOT values at around 1;7 to 3;0, then shortened them between 3;0 and 4;6, and lengthened them again at 4;6 to 5;2. This variation illustrates that VOT development involves a process of gradient change rather than a smooth decrease on the way to adult values.

Despite the comprehensive documentation, only one phoneme, /p/, was discussed in the study. Whether the developmental pattern applies to other stop phonemes such as /b/, /t/, /d/, /k/ and /g/ is worth further research. Moreover, the results of longitudinal studies are commonly said to be susceptible to individual differences because of the small subject pool. This potential inadequacy can often be offset by group-oriented studies such as those reviewed in section 2.1. Since cross-sectional research is group-oriented in nature, statistical analysis usually requires a minimum of fifteen subjects per group in order to furnish a representative description. However, the papers in section 2.1 regularly fall short of this requirement (Clumeck et al. 1981, Davis 1995, Gandour et al. 1986). In addition, in some previous investigations the age interval between successive groups varied or exceeded three years (Davis 1995, Gandour et al. 1986), so that the results could only show overall group trends across several years of development and could not address what might happen during the years skipped between one age group and the next.


The present study thus adopted a cross-sectional design with age groups only one year apart, in order to trace more closely the general patterns of VOT variability in later development, after the aspiration contrast has been acquired.
