CHAPTER 1 INTRODUCTION
1.2 Motivation
When native speakers of Min learned to speak Mandarin, some linguistic features of
Min were inevitably carried over into their Mandarin speech. One of the most salient
phonological features identified at an early stage was the deretroflexion of
retroflex sibilants. There are four retroflex sibilants in Mandarin: three
voiceless sibilants [tʂ], [tʂʰ], [ʂ], and one voiced sibilant [ʐ]. Specifically, deretroflexion
refers to the process by which the voiceless retroflex sibilants [tʂ], [tʂʰ] and [ʂ], phones
that do not exist in Min, are replaced by the voiceless dental sibilants [ts], [tsʰ] and [s],
phones that exist in both Mandarin and Min. The substitution pattern for the voiced retroflex sibilant [ʐ], however, is more variable, mainly because neither Mandarin nor Min has a
directly corresponding voiced non-retroflex sibilant. Common substitutes include [z], [l] and [n] (Chan, 1984).
The realizations of retroflex sibilants, nevertheless, are much more
complicated than mere substitution. Deretroflexion is so notorious a feature that, in
Chinese-language education, students in Taiwan are explicitly taught the “standard”
pronunciation. Starting in elementary school, students are asked to pay extra
attention to retroflex sounds when learning phonetic symbols². It has always been
emphasized that retroflex sounds are produced with the tongue curled up. Similar
instructions can also be found in the Mandarin phonetics textbook used at the college level.
The textbook illustration of the “standard” retroflex articulation of [ʂ] is shown in
Figure 1.2(a), which, in comparison with the dental articulation of [s] in Figure 1.2(b), has a
clearly curled-up tongue blade and a more retracted place of articulation.
2 The phonetic symbols here refer to Zhuyinfuhao, the sound transcription system officially used in Taiwan. Getting to know these symbols and making use of them are emphasized in the first two years of
Figure 1.2 The production of (a) [ʂ] and (b) [s] from the Mandarin Phonetics Textbook (NTNU Mandarin phonetics committee, 2003).
Figure 1.3 The X-ray slides of the production of (a) [ʂ] and (b) [s] from Ladefoged et al. (1984).
However, midsagittal X-ray data provided by Ladefoged et al. (1984) showed that
even for Standard Beijing Mandarin, retroflex sounds are not produced with a curled-up
tongue (see Figure 1.3). Instead, retroflex sibilants in Mandarin, just like dental sibilants,
are produced with the upper surface of the tongue; in other words, the tongue is not
curled up. In effect, the two sets of sibilants differ more crucially in terms of
constriction position and tongue shape. With respect to such an official overcorrection
of retroflex pronunciation in Taiwan Mandarin, the realizations of retroflex sibilants
have drawn many researchers’ attention, particularly regarding how and when Taiwan Mandarin
speakers use retroflex sibilants and how the contrast between retroflex and
dental sibilants is made.
Although a great number of studies have investigated the
realizations of retroflex sounds in Taiwan Mandarin from various perspectives, several
gaps remain. First, there is a gap in the direction of merger that has been
studied. Because the substitution of dental sibilants for their retroflex counterparts
was the first feature to be recognized as salient, most studies have focused on the deretroflexion
process in Taiwan Mandarin (M.-C. Li, 1995; C. C. Lin, 1983; Rau & Li, 1994). Few,
however, have paid attention to the realizations of dental sibilants. Although the general
assumption is that dental sibilants are the unmarked segments and that the change from
marked to unmarked segments is linguistically universal, it is still worth investigating
when and how this process is reversed. In Taiwan Mandarin, the reverse
substitution, in which retroflex sibilants replace dental ones, has been observed from time to time
(e.g., Chung, 2006). The mechanism behind this phenomenon deserves study.
Second, there is a gap in research materials and research methods. Early studies are
mostly impressionistic, and their results are often derived from perceptual observations
(Chan, 1984; Kubler, 1985; M.-C. Li, 1995; C. C. Lin, 1983; Rau & Li, 1994). However,
sound perception is easily affected by various factors, such as neighboring segments,
individual voice quality, and suprasegmental effects. Later studies adopted a
more objective approach and measured retroflex and dental sibilants acoustically
(Jeng, 2006; Tse, 1988, 1998). Nonetheless, these measurements have so far been limited to
experimental data. Under reading and citation conditions, subjects tend to be highly aware
of their own pronunciation, so retroflex and dental sibilants are usually found to be
clearly distinguished. The resulting discrepancy creates a mismatch between
recent experimental studies and earlier impressionistic ones.
Third, there is a gap in the factors that have been examined. Even though the deretroflexion
phenomenon has long been recognized and discussed, most studies are sociolinguistic
and center on extra-linguistic factors such as gender, social class, and
education level (M.-C. Li, 1995; C. C. Lin, 1983; Rau & Li, 1994). Few of them
focus on linguistic variables. Because the interview is the most
common method in sociolinguistic studies, and because results are
mostly derived from spontaneous speech in the interview, much information
will be masked if linguistic factors are not taken into consideration. For
example, the copula verb shi is a very high-frequency word in Mandarin. Given its low
semantic content and high frequency of use, its onset
consonant is unlikely to be realized as the canonical retroflex sibilant [ʂ] in natural speech.
Results obtained by simply averaging over all tokens can therefore be
difficult to interpret.
Given the gaps noted above, this study investigates both dental and
retroflex sibilants more thoroughly by acoustically measuring
sibilant realizations in spontaneous speech, with several linguistic factors
controlled for and examined.