Chapter Five Discussion
This chapter discusses the findings of the longitudinal and the cross-sectional
studies separately. Section 5.1 addresses the possible reasons for the sequence of
acquisition. Section 5.2 relates the results to those in previous literature and
discusses the possible reasons for these findings.
5.1 Longitudinal study
This section first relates the VOT variation of the two young children to previous
studies concerning the chronological order of acquisition with regard to age, aspiration,
and place of articulation. Possible explanations for the VOT changes across the
five-month observation period are then discussed.
5.1.1 Sequential order of acquisition
The results of the longitudinal study generally confirmed that the age range at
which children produce aspiration-contrastive stop consonants across three places of
articulation is around 1;4 to 2;8 (Clumeck et al. 1981, Jakobson 1968, Davis 1995,
Eilers et al. 1984, Gandour et al. 1986, Macken and Barton 1979, 1980, Smith and
Kenney 1999, Xu 1990). One of our subjects, Hsiao, produced contrastive homorganic
cognates at the labial and dental places of articulation when she was about 1;8 and 1;11
respectively. What remained for her to distinguish was the aspiration contrast between
the pair of stops produced at the velar place of articulation. On the other hand, the
other subject, Lin, had not yet produced an aspiration contrast
across labial, dental, and velar places of articulation when she was last seen at 1;11. The
rate of learning revealed by the two girls’ speech data is thus as subject to
idiosyncratic individual factors as has been shown in previous longitudinal
research.
As for the chronological order of acquisition in terms of aspiration, it was found that
at the beginning of the recording period both Hsiao and Lin tended to produce both
aspirated and unaspirated stops predominantly with short VOT durations, i.e. as
voiceless unaspirated sounds. It has been established repeatedly that voiceless
unaspirated stops seem to be the universally first-acquired stop type regardless of the
phonological opposition a language uses, whether it separates a voiceless unaspirated stop
from a voiceless aspirated one (as in Cantonese and Thai) or from a voiced unaspirated stop
(as in Hindi and Spanish) (Clumeck et al. 1981, Eilers et al. 1984, Gandour et al. 1986,
Grigos et al. 2005, Hua and Dodd 2000, Jakobson 1968, Macken 1980, Macken and
Barton 1979, 1980, Menn 1983, Menn and Stoel-Gammon 1995, Smith and Kenney
1999, Wong and Stokes 2001). It also appeared that the targets with long-lag VOT
developed somewhat later than the short-lag sounds. In addition, there was other
evidence suggesting that both Hsiao and Lin had better productive control of
unaspirated stops. Both were observed to substitute an unaspirated stop for an
aspirated one but not vice versa. For instance, Hsiao, aged 1;7:11, produced the /pʰ/ in
/pʰin tʰu/ “puzzles” with a VOT of approximately 4 ms. At 1;10:23, Lin pronounced the
initial /pʰ/ in /pʰai ou/ “to clap hands” like /p/, with a VOT of about 18 ms. Such
substitutions became fewer as the children gradually gained a better command of aspiration.
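The three broad VOT categories used in this chapter (voicing lead, short lag, long lag) can be illustrated with a small sketch. The 30 ms. boundary between short and long lag is an assumed value chosen only for illustration and is not a threshold reported in this study; the function name classify_vot is likewise hypothetical. The two VOT values are those of the substituted tokens cited above.

```python
# Illustrative sketch only. The 30 ms boundary is an assumption,
# not a threshold reported in this study.
SHORT_LAG_MAX_MS = 30.0  # hypothetical upper bound for short-lag VOT


def classify_vot(vot_ms: float, short_lag_max: float = SHORT_LAG_MAX_MS) -> str:
    """Assign a VOT value (in ms) to one of three broad categories."""
    if vot_ms < 0:
        return "voicing lead"  # voicing begins before the stop release
    if vot_ms <= short_lag_max:
        return "short lag"     # typical of voiceless unaspirated stops
    return "long lag"          # typical of voiceless aspirated stops


# Tokens cited in the text: the target was aspirated /pʰ/ in both cases.
tokens = {
    "Hsiao 1;7:11 /pʰin tʰu/": 4.0,
    "Lin 1;10:23 /pʰai ou/": 18.0,
}

labels = {name: classify_vot(vot) for name, vot in tokens.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under this assumed boundary both tokens fall into the short-lag category, which is consistent with their description as aspirated targets realized like unaspirated stops.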
With regard to the acquisition sequence for place of articulation, it has often been
reported that either labials or dentals are the first to be distinguished by the child,
while it is less often noted that velars are acquired earlier than labials or dentals
(Clumeck et al. 1981, Eilers et al. 1984, Fromkin and Rodman 1997, Gandour et al. 1986,
Jakobson 1968, Macken 1980, Macken and Barton 1979, O’Grady et al. 1997, Wong and
Stokes 2001). In the present experiment, Hsiao initially demonstrated the aspiration
contrast at the labial place of articulation, followed by the dental. Both Hsiao and Lin
were also observed sometimes either to substitute labial or dental stops for velars, or
simply to omit the word-initial velar and move on to the following vocalic segment. For
example, Hsiao uttered /ku tu/ “a princess” as /tu tu/, replacing /k/ with /t/, when she
was 1;7:11. At the same time the token /kua niou/ “a snail” was produced as /ua niou/,
without the initial /k/. Lin also uttered /t/ in place of the word-initial /kʰ/ in /kʰu/
“to cry” at 1;11:7. These data generally conform to the finding that velars tend to
be learned later. However, some studies have reported that velars may sometimes
replace labials or dentals in Mandarin-learning children’s speech. It is speculated that
the absence of such speech output in the present experiment may be a result of the
elicitation method used during a short-term investigation.
5.1.2 The correlation between VOT duration and the larynx
In the first two or three sessions, Hsiao’s and Lin’s target words were very
variable with respect to both place and manner of articulation, yielding many
unintelligible tokens. As they grew older, their speech became more comprehensible
owing to better control of aspiration and a more accurate point of constriction in the
oral cavity for the three pairs of stops.
Since the aspiration feature is mostly controlled by the activity of the larynx, the
relationship between aspiration and the larynx is discussed in this section to explain
the possible reasons for the earlier development of unaspirated stops. It is often
acknowledged that the state of the larynx is responsible for aspiration (Blevins 2004,
Borden et al. 1994, Grigos et al. 2005, Johnson 2003, Menn 1983, Ladefoged 2001,
Ladefoged and Maddieson 1996, Stokes et al. 2002, Whiteside et al. 2003). A child
therefore has to learn to control the timing of glottal opening before he or she can
produce acceptable VOT values for the aspirated and unaspirated cognates. In the
production of a stop which is characterized by aspiration, the vocal folds are apart for
a period of time, allowing subglottal air to flow out without generating vibration. But
when the two subjects in this experiment articulated a token word beginning with an
unaspirated stop like /pa/ “eight”, an open glottis was sometimes hard to maintain
because the subsequent vowel requires the vocal folds to vibrate. Prevoiced and
short-lag tokens were therefore often found in their speech data. Later, Hsiao and Lin
may have learned to better coordinate the timing of the laryngeal activities for
aspiration and for vibration, so that prevoiced tokens gradually disappeared to some
extent. It is believed that once the glottal state for the production of short-lag VOT
was better controlled, the two children then attempted to adjust the laryngeal movement
to create long-lag VOTs. Though the above description of the growth and decline of
voicing-lead, short-lag, and long-lag VOT values seems to suggest a chronological
order, the order is assumed to be relative rather than absolute, and the occurrence of
any one type of VOT does not exclude the other two.
5.1.3 The correlation between VOT duration and the articulation
The preceding section explored the acquisition of the aspiration contrast through
the influence of the larynx on VOT values. This section attempts to provide some
explanations for why the contrast was acquired earlier for the two pairs of front stops
than for velars.
The point of constriction for the production of the three pairs of oral stops is
formed at three different places in the vocal tract (Blevins 2004, Borden et al. 1994,
Johnson 2003, Ladefoged 2001, Ladefoged and Maddieson 1996). Producing labial stops
involves the upper and lower lips joining together, sealing in and then releasing the
air pressure behind them. For dental stops the air is compressed and released at the
point where the tip of the tongue meets the area behind the teeth. To make a velar stop,
the air pressure is built up and blocked at the region where the body of the tongue
touches the soft palate. It is widely acknowledged that both the lips and the tongue tip
are highly agile articulators. The comparatively mobile lips and tongue tip may
be easier for children to manipulate, so that the aspiration contrast is distinguished
at the labial and dental places of articulation at an earlier time. On the other hand,
it is considered more difficult for children to handle the articulatory activities for
uttering velar stops, in that the body of the tongue is less flexible than the tip of
the tongue. Furthermore, the air compressed at the front part of the oral cavity can
flow out immediately after it is released, whereas the point of constriction for velars
is located at the velum, so that the air may need more time to travel through the vocal
tract, producing longer and more variable VOT. Along this line of reasoning, it is more
likely that labials and dentals develop in advance of velars.
In sum, one-year-old children may have acquired the aspiration contrast between
cognate stops, as in the case of Hsiao, who had distinguished labials and dentals.
However, there exist individual differences on this issue, such as the case of Lin, who
had not acquired any contrast among the three pairs of stops. In spite of the difference
in age of acquisition, both of them seemed to produce short-lag VOT earlier than
long-lag VOT, owing to the coordination of laryngeal gestures. In addition, VOT for
velars appears to have the least explicable patterning, and this may be attributed to
the relatively immobile articulators.
It is acknowledged that other potential factors may also have contributed to the
results, such as the unrepresentative findings from a small number of subjects, the
learning effect from multiple exposures to an identical experiment, and the problem of
word frequency. The use of different elicitors may also have had some effect. The
elicitor in the longitudinal study was the subject’s mother, while the experimenter was
responsible for speech elicitation in the cross-sectional study and the adult group. It
was observed that the subjects’ mothers tended to repeat and exaggerate their
pronunciation, but this was rarely found in the experimenter’s practice. Therefore the
different manner of speech elicitation may also have had some influence on the results.
5.2 Cross-sectional study
In the cross-sectional study, the results revealed that every group of children had
mastered the aspiration contrast between cognate stops. The mean VOT values in each age
group were not significantly different from those of the adults (Davis 1995, Eilers et al.
1984, Smith and Kenney 1999). However, there were significant differences in mean
VOT, mostly between the three-year-old and five-year-old groups, and for the three
aspirated stops and unaspirated /k/.
As described above, even though the aspiration contrast has been acquired,
variability on the VOT continuum can be observed in the proportion of overlap
between cognate stops and in the range of VOT, as shown in Figures 8.1~8.6, 9.1~9.6,
and 10.1~10.6. It was found that the younger subjects’ VOT values for aspirated and
unaspirated stops overlapped more than those of the older subjects and adults. They
also tended to produce more instances of extreme VOT values than the older subjects. In
addition, the younger children’s VOT values seemed to spread over a wider range on the
time axis. In spite of the fluctuation in their VOT distribution, when the children’s
VOT values for the stop consonants were compared with the adults’ values, the children’s
and adults’ productions of aspiration appeared not to be significantly different from
each other (Davis 1995, Eilers et al. 1984, Smith and Kenney 1999), except for aspirated
labial /pʰ/, dental /tʰ/ and velar /kʰ/, and unaspirated /k/, where the three-year-olds
and the five-year-olds respectively produced the shortest and the longest VOT. Generally
speaking, the gradual change in the variation of VOT values from the younger to the
older groups seems to provide positive evidence of gradient development in speech
production. The VOT change among the groups can be explained in terms of the
relation of VOT values to aspiration and to place of articulation.
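One simple way to quantify the overlap between cognate stops described above is to compare the VOT ranges of the two categories. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the range-based measure is only one of several possible overlap measures and is not necessarily the one used for the figures in this study, and the VOT values are made up for illustration.

```python
def range_overlap_ms(unaspirated, aspirated):
    """Length (ms) of the interval shared by the two VOT ranges; 0 if disjoint."""
    low = max(min(unaspirated), min(aspirated))
    high = min(max(unaspirated), max(aspirated))
    return max(0.0, high - low)


def overlap_proportion(unaspirated, aspirated):
    """Shared interval as a fraction of the total span covered by both categories."""
    total_span = (max(max(unaspirated), max(aspirated))
                  - min(min(unaspirated), min(aspirated)))
    if total_span == 0:
        return 0.0  # degenerate case: no spread to compare
    return range_overlap_ms(unaspirated, aspirated) / total_span


# Made-up VOT values (ms) for illustration only; not data from this study.
younger_p = [5, 12, 20, 45]
younger_ph = [30, 55, 70, 95]
older_p = [8, 12, 15, 20]
older_ph = [50, 60, 70, 80]

print(overlap_proportion(younger_p, younger_ph))  # ranges share 30 to 45 ms
print(overlap_proportion(older_p, older_ph))      # ranges are disjoint
```

With these invented values the younger speaker's two ranges share a 15 ms. interval, whereas the older speaker's ranges do not overlap at all, which mirrors the qualitative pattern described in the text.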
5.2.1 The correlation between VOT values and the larynx
It was found that children’s VOT values for unaspirated stop consonants were
generally more stable, except for /k/, while VOT for aspirated ones was often distinctly
different among the child groups. Since the physiological basis for controlling VOT for
stop phonemes resides in the laryngeal gesture, the states of the glottis are assumed to
be the major factor responsible for VOT variation (Ladefoged 2001, Ladefoged and
Maddieson 1996, Stokes et al. 2002). For the aspirated stops, the vocal folds are
separated for a longer period than for the unaspirated stops. This relatively
protracted glottal opening may be harder for young children to master,
especially when the following vocalic sound requires the vocal folds to vibrate. The
conflict between the laryngeal gestures in a stop-vowel sequence is
usually greater if the stop is aspirated. This additional timing requirement for
laryngeal coordination may also explain why aspirated targets appeared later in the
longitudinal study. Thus it is speculated that laryngeal setting may contribute to the
ANOVA results, which showed that the mean VOT values for all three aspirated stops were
significantly different among the groups of children, while only one of the three
unaspirated stops, i.e. /k/, showed significant differences among the children.
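For readers unfamiliar with the statistic behind the ANOVA results mentioned above, the following sketch shows how the F ratio of a one-way ANOVA is computed from group data. It is not a reproduction of the analysis in this study: the function name is hypothetical, the three groups and their VOT values are invented, and the original analysis was presumably carried out with a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F ratio: between-group variance over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented VOT values (ms) for an aspirated stop in three hypothetical groups.
group_a = [50.0, 55.0, 60.0]
group_b = [60.0, 65.0, 70.0]
group_c = [80.0, 85.0, 90.0]

f_ratio = one_way_anova_f([group_a, group_b, group_c])
print(f"F(2, 6) = {f_ratio:.1f}")
```

A large F ratio indicates that the group means differ by more than would be expected from the variation within groups; whether it is significant depends on comparing it with the F distribution for the given degrees of freedom.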
VOT for unaspirated sounds across these ages turns out to be more
stable than that for their aspirated counterparts. It is suggested that from two years
of age children have a better grasp of unaspirated stop phonemes, in that the VOT values
of unaspirated stops were found to be close to the adult norm. In their investigation of
VOT in 18 languages, Cho and Ladefoged (1999) concluded that VOT for
unaspirated bilabials and coronals (including dentals and alveolars) generally ranges
from about 5 to 30 ms. and for aspirated ones from around 50 to 120 ms., while VOT
values for unaspirated velars mostly range from around 20 to 40 ms. and for aspirated
velars from 40 to 160 ms. Mandarin-speaking adults’ mean VOT is about 13 ms. for /p/,
about 18 ms. for /t/, and about 27 ms. for /k/, and about 62 ms. for /pʰ/, 71 ms. for
/tʰ/, and 82 ms. for /kʰ/. In the case of Mandarin and the 18 languages investigated by
Cho and Ladefoged, the range of VOT for unaspirated sounds is quite limited when
compared with that of the aspirated ones. It is assumed that the relatively well-defined
VOT range for the unaspirated stops helps children locate the required VOT values in
their language. By contrast, the VOT duration for aspirated stops can last from 50 ms.
to 160 ms. Such a wide distribution may make it more difficult for children to
identify the target region in the ambient language.
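The contrast in range width can be made concrete with the figures quoted above. The sketch below simply tabulates the ranges attributed to Cho and Ladefoged (1999) and the Mandarin adult means as they are cited in this section; the dictionary layout, the variable names, and the grouping of Mandarin dentals under the bilabial/coronal row are assumptions made for illustration.

```python
# VOT ranges (ms) as summarized in the text from Cho and Ladefoged (1999).
RANGES_MS = {
    ("bilabial/coronal", "unaspirated"): (5, 30),
    ("bilabial/coronal", "aspirated"): (50, 120),
    ("velar", "unaspirated"): (20, 40),
    ("velar", "aspirated"): (40, 160),
}

# Mandarin adult mean VOT (ms) as cited in the text.
ADULT_MEANS_MS = {"p": 13, "t": 18, "k": 27, "pʰ": 62, "tʰ": 71, "kʰ": 82}

# Assumed mapping from each stop to its row in RANGES_MS.
STOP_CLASS = {
    "p": ("bilabial/coronal", "unaspirated"),
    "t": ("bilabial/coronal", "unaspirated"),
    "k": ("velar", "unaspirated"),
    "pʰ": ("bilabial/coronal", "aspirated"),
    "tʰ": ("bilabial/coronal", "aspirated"),
    "kʰ": ("velar", "aspirated"),
}

# Width of each range: upper bound minus lower bound.
widths = {key: high - low for key, (low, high) in RANGES_MS.items()}

# Does each Mandarin adult mean fall inside the corresponding range?
within = {
    stop: RANGES_MS[STOP_CLASS[stop]][0] <= value <= RANGES_MS[STOP_CLASS[stop]][1]
    for stop, value in ADULT_MEANS_MS.items()
}

for key, width in widths.items():
    print(key, f"{width} ms wide")
print(within)
```

On these figures the unaspirated ranges are 25 and 20 ms. wide, against 70 and 120 ms. for the aspirated ranges, and each Mandarin adult mean lies inside its corresponding range.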
5.2.2 The correlation between VOT values and the articulation
The preceding section discussed the possible influence of the larynx on the
generally less stable VOT values for aspirated stop phonemes. This section
focuses on the relationship between VOT and place of articulation.
As mentioned above, VOT values for velar stops seem to fluctuate to a larger
extent than those for labial or dental stops. More specifically, the fact that /k/,
but not /p/ and /t/, differed distinctly among the child groups can also be accounted
for by place of articulation in addition to laryngeal movement. The point of
constriction for velar stops is formed between the back of the tongue
and the soft palate. The lips and the tip and blade of the tongue, which are employed
in the pronunciation of labials and dentals, are normally thought to be more mobile than
the back of the tongue. It may be the degree of mobility of the active articulators that
leads to the relatively unstable production of velars, in spite of the fact that there
was an aspiration contrast between the two velar stops in each age group. In addition,
the point of constriction may also be responsible for the more variable VOT values
observed in velars. Since velars are produced further back in the vocal tract, the air
has to flow through a longer passage to leave the oral cavity after it is released from
the stop closure. The longer path to the opening of the mouth may be associated with the
normally longer VOT for velars. In addition, more strength would be required to push
the air along the oral cavity. It is thus assumed that the muscle strength used to expel
the air out of the mouth from the velar place of articulation may be more difficult to
learn to control. Therefore the greater variability of VOT for velars may be ascribed to
the more posterior point of articulation and the greater demand on strength control.
5.2.3 The correlation between VOT values and age
With respect to the development of aspiration in speech production across the
age groups, a comparison of VOT duration among the five groups of children was
performed. The results indicated that the VOT values of /pʰ/, /tʰ/, /k/ and /kʰ/
were significantly different among some groups of children. Figures 11.1~11.3
provide the mean VOT duration for the consonants in question at the three places of
articulation in each age group. As shown in Figures 11.1 and 11.2, ages three and five
appear to be the two opposite ends of the fluctuation in aspiration length for /pʰ/ and
/tʰ/: three-year-olds produced the shortest VOT duration while five-year-olds produced
the longest. As shown in Figure 11.3, five-year-olds had the longest duration when
producing /k/ while four-year-olds had the shortest. On the other hand, the shortest
duration of /kʰ/ was produced by the three-year-olds, followed by the two- and
four-year-olds, while the longest duration of /kʰ/ was produced by the six-year-olds. It
seems that children aged three tend to curtail the aspiration length of labial and
dental stops and that children aged five or six then prolong aspiration again. In
addition, the VOT values of velar stops appear to fluctuate more than those of labials
and dentals among children aged two, three, four and six.
The general pattern of VOT variation presented in Figures 11.1~11.3 can also be
seen in the results of a longitudinal study conducted by Smith and Kenney (1999), who
investigated the overall temporal aspects of speech production in four children over a
period of four to six years. Two of the four subjects produced the shortest labial VOT
when they were about three years old, and another child produced the shortest duration
at around four years old. On the other hand, the longest VOT was produced when three
of the children were between five and six years old.
As reported in the present experiment and in Smith and Kenney’s study, VOT
variation can still be noticeable before six years of age. The remarkable decrease in
mean VOT at about three years of age might be regarded as a type of
over-performance of aspiration. It is therefore conjectured that this
over-performance may reflect children’s initial understanding of the characteristics
that underlie the production of aspirated and unaspirated stops. Children’s capability
in the realization of the aspiration feature is observed to improve with time, as the
mean VOT gradually approaches the adult norm in the older groups. Since the ability for
temporal control requires time to mature (Smith and Kenney 1999, Koenig 2001), the VOT
shown in Figures 11.1~11.3 often exceeds the adults’ values when children turn five or
six years old. At this point the children’s prolonged VOT does not
diverge from the adult norm as far as the shortened VOT occurring at younger ages does.
This relatively decreased magnitude of divergence may suggest better control of
articulation, as illustrated in Figure 1: during a period of immature contrast,
greater variability of VOT is expected in younger children and more adult-like
VOT in older children. Since the articulatory gestures of velars seem to be more
difficult to manipulate, the VOT of velars often seems to approach the adult norm
later than that of labials and dentals. Generally, the
VOT of the three pairs of stops appears to demonstrate a similar pattern of progression,
and the discrepancies between the children’s mean VOT and the adults’ appear to
diminish, presenting very adult-like values in the five- and six-year-olds’ production.
However, owing to the limited scope of this study, the developmental tendency reported
here is confined to the progress found in pre-school children, even though Koenig (2000)
pointed out that VOT variation can continue until the pre-adolescent years. Moreover,
whether this pattern of VOT variation applies to other languages also needs
further investigation.
The findings of Smith and Kenney’s longitudinal study and of the present
group-oriented study demonstrate a similar pattern of VOT variation across a longer
period of investigation. This pattern of variation may support the idea of gradient
change, in which the development of temporal control of the stop consonants may undergo
several stages involving plateaus, reversals or other changes before slowly
approximating adult-like production (Hewlett and Waters 2004, Smith and Kenney 1999). It
is suggested that the VOT variation displayed in the two studies thus exemplifies one
aspect of the gradient development of speech production.
Though our findings are partially supported by a previous study, it is possible
that other factors had some impact on the results. For example, since this
experiment included subjects from a wide range of ages, the set of stimulus tokens
may have been more difficult or less familiar for the younger subjects, thus
contributing to the observed pattern of progress. In addition, the recording session was
conducted only once in the cross-sectional study, while the subjects of the longitudinal
study were recorded many times. It is speculated that the children’s first and only
encounter with the experiment may have influenced their linguistic behavior to some
extent.
In short, the results of the cross-sectional study show that although the children
produced contrastive stop consonants, their VOT for aspirated stops and for velars
continued to fluctuate at older ages, especially between three and five years old.
The fluctuation of VOT may first be ascribed to three-year-olds’ emerging
awareness of the different productive skills required in the pronunciation of the
consonants in question. It was also observed that the VOT produced by children at four
years of age further approximated the adult norm. Nevertheless, the five- and
six-year-olds continued to aspirate the front and back stops with VOT
somewhat surpassing the adult value. It is suggested that the overall VOT variation
presents further evidence of gradient development in speech production.
Figure 11.1 The duration of labials of the child groups. (Plot “Duration of labials”: duration in ms., 0 to 80, by age group 2 to 6; series /p/, /pʰ/, adult /p/, adult /pʰ/.)
Figure 11.2 The duration of dentals of the child groups. (Plot “Duration of dentals”: duration in ms., 0 to 90, by age group 2 to 6; series /t/, /tʰ/, adult /t/, adult /tʰ/.)
(Plot “Duration of velars”: duration in ms., 0 to 100, by age group 2 to 6; series /k/, /kʰ/, adult /k/, adult /kʰ/.)