Chapter One Introduction

This study investigated the acquisition and development of the aspiration contrast in

Mandarin Chinese by one- to six-year-old children. The one-year-olds were

observed longitudinally for a five-month period while the two- to six-year-olds were

examined in a cross-sectional study. Children’s speech samples collected in a

picture-naming task were analyzed instrumentally and then compared with adults’ speech

production in order to sketch a general trend in the development of aspiration. After

the developmental trend was established, it was related to similar research on other

languages for a cross-linguistic comparison, which is expected to reveal a common

course of development in the acquisition of aspiration.

As mentioned above, instrumental analysis was employed to explore the speech

data in the present experiment. Specifically, voice onset time (VOT) was selected

from among various acoustic parameters to measure the aspiration duration of stop

consonants. Although the present study obtained and explained its results by means of

instrumental analysis, there is a long tradition of investigating child phonological

development by applying generative phonological rules to child speech. To explain

why this study opts for experimental research rather than phonological derivation,

we first discuss the one-to-many correspondence between phonological contrasts and

phonetic values, and then the inadequacy of applying adult-based phonological theory to

child language acquisition. Finally, we show how voice onset time can supplement

the traditional approach to the study of language acquisition.

1.1 The correlation between phonological contrast and phonetic values

The phonological feature of aspiration or voicing is primarily used to characterize

stop consonants in the languages of the world, but the phonetic realization of the feature

may be different from language to language (Allen 1985, Cho and Ladefoged 1999,

Goodluck 1991, Johnson 2003, Ladefoged and Maddieson 1996, Ladefoged 2001,

Menn 1983, Macken 1980, Öğüt et al. 2006, Smith and Kenney 1999, Snow 1997, Wei

1997). Cross-linguistic comparison is therefore likely to cause confusion if it relies

only on the orthographic label. To address this problem, a growing number of studies rely on

phonetic values obtained through instrumental analysis. Because the

relationship between phonological terms and their corresponding phonetic output

varies among languages, instrumental analysis can serve as an impartial tool,

independent of the conventional feature classifications of individual languages.

A pair of stop consonants at the same place of articulation is traditionally

differentiated from each other by the feature of aspiration or voicing aside from other

acoustic parameters. At the phonological level, voiced unaspirated stops are

traditionally represented as /b/, /d/, and /g/, as in Thai and Turkish, while their voiceless

counterparts are represented as /p/, /t/, and /k/, as in English and French. Furthermore, voiceless

aspirated stop phonemes may be labeled /p^h/, /t^h/, and /k^h/, as in Cantonese and

Apache. However, at the phonetic level, the same orthographic representation is not

always realized in the same way in different languages. For example, with regard to the

voicing contrast, Öğüt et al. (2006) and Gandour et al. (1986) reported that /b/, /d/, and

/g/ in Turkish and Thai respectively are classified as voiced stop consonants according to

the manner of articulation. Both studies found that these three stops are

pronounced as voiced by adult native speakers. By contrast, the same three symbols

in English are usually produced as voiceless unaspirated, as are the Turkish and Thai

voiceless stops (Macken 1980, Wei 1997). The cross-linguistic results thus indicate that

there may be some discrepancy among languages of the world in phonetically realizing

the same phonemic labels.

The voicing contrast does not appear to be realized identically across languages,

nor does the aspiration contrast. Cho and Ladefoged (1999) found that both voiceless

aspirated and unaspirated stops may be realized with widely varying aspiration

durations across the eighteen languages they investigated. Taking the velar plosives /k^h/ and

/k/ as an example, the aspirated and unaspirated stops have mean VOTs

of 80 and 31 ms respectively in Apache, 154 and 45 ms in Navajo, and 128 and 28 ms in Tlingit.

The realizations of both the aspirated and the unaspirated stops thus differ considerably

across the three languages. Such divergence makes the category of aspiration

hard to delimit: it is unclear at what VOT a stop phoneme should count as aspirated or

unaspirated. In light of this cross-linguistic phonetic variation, Cho and Ladefoged

further divided the category of aspiration into four groups, namely unaspirated,

slightly aspirated, aspirated, and highly aspirated stops, on the basis of VOT values.

Even though four types of aspiration were proposed, a language with an aspiration

contrast normally characterizes its stop consonants simply as aspirated or unaspirated,

irrespective of which type a sound belongs to. Therefore the phonological

label of aspiration for a stop phoneme can be vague and

ambiguous in cross-linguistic research.
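
The spread in the velar means quoted above can be made concrete with a short calculation. The following Python sketch is illustrative only: it uses the six mean VOT values cited from Cho and Ladefoged (1999), while the dictionary layout and the function names are assumptions introduced here and are not part of that study.

```python
# Mean VOT (ms) of velar stops as quoted in the text from Cho and Ladefoged (1999).
# The dictionary layout and function names are illustrative assumptions.
MEAN_VELAR_VOT_MS = {
    "Apache": {"aspirated": 80, "unaspirated": 31},
    "Navajo": {"aspirated": 154, "unaspirated": 45},
    "Tlingit": {"aspirated": 128, "unaspirated": 28},
}


def category_range(category):
    """Return (min, max) of the mean VOT for one category across languages."""
    values = [means[category] for means in MEAN_VELAR_VOT_MS.values()]
    return min(values), max(values)


def contrast_size(language):
    """Return the aspirated-minus-unaspirated difference in mean VOT (ms)."""
    means = MEAN_VELAR_VOT_MS[language]
    return means["aspirated"] - means["unaspirated"]


if __name__ == "__main__":
    for category in ("aspirated", "unaspirated"):
        low, high = category_range(category)
        print(f"{category}: {low}-{high} ms across languages")
    for language in MEAN_VELAR_VOT_MS:
        print(f"{language}: contrast of {contrast_size(language)} ms")
```

For these figures the aspirated means span 80 to 154 ms and the unaspirated means 28 to 45 ms, while the within-language contrast ranges from 49 ms in Apache to 109 ms in Navajo.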

Given the language-specific relationship between the designation at the

phonological level and the realization at the phonetic level, misunderstanding is

likely to arise if stop consonants are described only in terms of

phonemic labels such as /b/ and /t/ or contrastive categories such as voiced or aspirated.

VOT is therefore widely regarded as a language-independent means of

characterizing stops.

1.2 Child phonology

In research on the development of speech production, a child’s speech has been

regarded as output derived by phonological rules (Fromkin and Rodman 1997,

O’Grady et al. 1997, Smith 1973). The derivation normally uses broad

transcription to represent a child’s pronunciation. However, broad transcription

represents the change from a child’s underlying representation to the surface

production as an abrupt categorical change rather than a progressive, gradient one.

Such a change from one phoneme category to another may misrepresent the

phonetic facts of speech development (Hewlett and Waters 2004, Menn and

Stoel-Gammon 1995, Scobbie et al. 2000, Stewart and Vaillette 2001). Some studies

of child speech production have therefore employed instrumental

analyses as an aid in investigating children’s phonological development.

In generative phonology, every surface form is

derived from an underlying form through phonological processes such as the substitution,

deletion, or insertion of a segment. It has been presumed that a child

follows the same processes in producing words from an underlying form, which is

constructed from the child’s comprehension of adult speech, even though the child’s speech

is not entirely adult-like. This notion implies that children’s ability to perceive

phonological contrasts is better than their ability to produce them. O’Grady et al. (1997)

illustrated that a child may be unable to pronounce cart and card, or jug and duck, distinctly but

can identify them correctly in a comprehension task. Under a derivational approach

based on the adult system, if a child pronounces tent as [det], the derivation would involve

voicing and nasal deletion, yielding a categorical change at the surface level.

However, the broad transcription used in this example may overlook

some phonetic facts of the surface form. Because children’s pronunciation tends to be

variable, they may produce the two phonemes /t/ and /d/ in ways that sound alike to a

listener but differ under instrumental analysis.

The subtle distinction between a child’s productions of two phonemes can be

viewed as a covert contrast, which may not be adequately accounted for by phonological

processes. Covert contrasts cannot be explicitly described by phonemic transcription

because each phonemic label represents a discrete unit, with no intermediate values

between units. Nevertheless, a child’s immature speech is often intermediate between

sounds, as in Smith’s (1973) finding that his young subject produced a [ts]-like sound

when he seemed to be first developing [s] out of [t]. It is plausible that before an adult can

distinguish a child’s two target phonemes, the child has already produced a covert

contrast between them. At first, the phonetic differences that make up the covert contrast are

normally too small to be perceived by the human ear. Not until the

contrasts are large enough to affect categorical perception can an adult detect the

differences between sounds. It is therefore helpful to apply instrumental analysis to the study

of child language development, as instruments can measure fine-grained properties

of speech sounds.

1.3 Voice onset time (VOT)

In stop systems, one of the most prominent acoustic features is the release burst

visible on the spectrogram (Borden et al. 1994, Clumeck et al. 1981, Davis 1995, Eilers

et al. 1984, Gandour et al. 1986, Johnson 2003, Ladefoged 2001, Ladefoged and

Maddieson 1996, Macken and Barton 1979, 1980). The interval between the

release burst of the stop and the onset of voicing in the following sound is called voice

onset time (VOT). VOT is also influenced by place of articulation: the further

back the place of articulation, the longer the VOT tends to be. Lisker and Abramson first identified

three categories of VOT commonly observed in many languages (cited in

Clumeck et al. 1981, Davis 1995, Macken et al. 1979, 1980). The first is called

voicing lead, in which voicing is detected before the release of the stop. Thai, for

example, has voiced stop consonants such as /d/ in /daaw/ “star”. The second category is

short lag, or voiceless unaspirated, in which voicing usually starts within 30

milliseconds (ms) after the release of the stop, as with the /t/ in Spanish. The last is

long lag, in which voicing starts more than about 30 ms after the stop is released, as in English

/p/ in pen, /t/ in tea, and /k/ in cool. According to the above categorization of

VOT, the unaspirated stops in Mandarin, /p/, /t/, and /k/, are assumed to

correspond to the short lag type. Presumably their aspirated counterparts /p^h/, /t^h/,

and /k^h/ correspond to long lag.
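
The three-way categorization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the measurement procedure of the present study: the 30 ms boundary is the approximate figure given in the text, the sign convention (negative VOT for voicing lead) is the customary one, and the function and constant names are hypothetical.

```python
# Illustrative sketch only. The 30 ms boundary is the approximate figure given
# in the text; actual boundaries vary by language and place of articulation.
SHORT_LAG_MAX_MS = 30.0  # assumed cut-off between short lag and long lag


def voice_onset_time_ms(burst_time_s, voicing_onset_s):
    """VOT in ms: voicing onset minus release burst (negative = voicing lead)."""
    return (voicing_onset_s - burst_time_s) * 1000.0


def vot_category(vot_ms):
    """Map a VOT value (ms) to one of the three categories described above."""
    if vot_ms < 0:
        return "voicing lead"
    if vot_ms <= SHORT_LAG_MAX_MS:
        return "short lag"
    return "long lag"


if __name__ == "__main__":
    # Hypothetical time stamps (seconds) read off a waveform or spectrogram.
    for burst, voicing in [(0.500, 0.420), (0.500, 0.512), (0.500, 0.585)]:
        vot = voice_onset_time_ms(burst, voicing)
        print(f"VOT = {vot:.0f} ms -> {vot_category(vot)}")
```

Under these assumptions a Mandarin unaspirated stop would be expected to fall in the short lag category and its aspirated counterpart in the long lag category, although a single fixed boundary is a simplification.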

The aspiration distinction in Mandarin Chinese serves to contrast

syllable-initial stop consonants (Cheng 1973, Li and Thompson 1981, Wei 1997, Xu

1990). The distinction between aspirated and unaspirated stops can be illustrated by the

following minimal pairs (Cheng 1973:35-36): ba /pa/ “father” vs. pa /p^ha/ “to fear”; da

/ta/ “big” vs. ta /t^ha/ “to step”; gu /ku/ “old” vs. ku /k^hu/ “bitter”.

As the above examples show, aspiration is one of the major distinctive

features, serving to contrast three pairs of cognate stops at three different places of

articulation. It distinguishes the two labial sounds /p/ from /p^h/, the two dentals /t/ from

/t^h/, and the two velars /k/ from /k^h/. In some languages, such as English, the contrast

realized phonetically by aspiration is conventionally labeled as a voicing contrast.

This study aims to investigate Mandarin-speaking children’s acquisition and

development of word-initial stops by means of VOT. The objective of this

experimental study is to identify the possible developmental stages in the acquisition

of the stops at different places of articulation and with different values of aspiration.

Furthermore, the results can be compared with

previous findings on other languages to identify a common developmental trend. The

following chapter, Chapter Two, reviews the literature on the acquisition

of stop consonants. Chapter Three outlines the methodology of the longitudinal and

cross-sectional studies in this experiment and the administration of a pretest. Chapters

Four and Five present and discuss the results respectively.