CHAPTER 2 LITERATURE REVIEW
2.4 Prosodic prominence
Research on prosodic prominence began early because prominence serves crucial
linguistic functions. In English, for example, prominence at the word level distinguishes
lexical meaning (e.g., INcline vs. inCLINE); at the sentence level, the location of prosodic
prominence enables speakers to convey different meanings or elements of focus (e.g., he
HIT John vs. HE hit John vs. he hit JOHN). To understand what constitutes the perceptual
salience of prosodically prominent elements, Lieberman (1960), as a first attempt,
measured a number of dimensions of stressed syllables and identified three
acoustic correlates: fundamental frequency, amplitude, and duration. While
these three cues are largely associated with
suprasegmental features, more recent studies have begun to examine the relationship
between prosodic structure and fine-grained phonetic details. Specifically, researchers
are interested in how prominence, defined at the suprasegmental level, influences
phonetic features at the segmental level.
One of the earliest identified cues at the segmental level is vowel quality. In
particular, vowels are articulated with greater gestural effort in stressed or accented
positions. This effect has been verified both from the articulatory perspective
(e.g., Beckman & Edwards, 1994; de Jong, 1995) and from the acoustic perspective
(e.g., Cho, 2005; van Bergem, 1993). Similarly, consonants have been found to show
greater distinctiveness in prosodically prominent conditions. For example,
in terms of voicing, voiced and voiceless stops are better contrasted in the accented
condition, as reflected in acoustic measurements such as VOT in
English (Cole, Kim, Choi, & Hasegawa-Johnson, 2007) or the degree of prevoicing in
Dutch (Cho & McQueen, 2005). Such a prosodic effect on segmental realization is
commonly referred to as prosodic strengthening, the phenomenon whereby
linguistic contrasts are maximized or maintained in prosodically stronger conditions.
Since stressed and accented syllables occupy prosodically strong positions, it is
plausible that the strengthening effect can be observed generally for both vocalic
and consonantal segments.
In contrast to prominence, reduction generally refers to the
diminishing of acoustic cues that results from articulatory economy in
prosodically non-prominent positions. One common manifestation of reduction is
assimilation. Specifically, in reduced conditions, segments usually become more
similar to their surrounding segments and are more likely to lose their original
distinctive phonetic features. Van Bergem (1993), for instance, suggested that vowels in
unstressed, unaccented syllables move toward a position similar to that of the preceding and
following consonants, rather than merely centralizing. As for consonants, the reduction
process is broadly comparable to that of vowels (van Son & Pols, 1999). On the
perceptual side, Duez (1995), in a study of voiced stops in French spontaneous speech,
showed that prosodic prominence had an effect on consonant identification. In
particular, voiced stops occurring in non-prominent syllables were recognized
significantly less successfully than those in prominent syllables.
As for prosodic prominence in Mandarin, Chao (1968) proposed three levels of
stress: contrastive stress, weak stress, and normal stress.
Contrastive stress refers to the condition in which speakers intend to contrast certain
elements in the sentence. For instance, in the sentence Bushi Huang xiansheng, shi
WANG xiansheng ‘It’s not Mr. Huang; it’s Mr. Wang.’, the speaker
places emphasis on the surname Wang. The syllable to be contrasted is thus said to
possess contrastive stress. Contrastive stress is generally realized with a wider pitch
range and longer duration, and usually with increased loudness. Chao associates weak
stress particularly with neutral tone syllables. A great number of
grammatical suffixes in Mandarin (e.g., -de ‘possessive marker’) carry the neutral tone.
Moreover, neutral tones also serve to distinguish lexical meaning (e.g.,
dong1xi1 ‘east and west’ vs. dong1xi0 ‘thing’). According to Chao, the name
“neutral tone” is given because the original tonal range is “flattened to practically zero” (p. 44),
and the pitch height of the neutral tone syllable depends on the tone of the
preceding syllable. In contrast to syllables with contrastive stress, neutral tone syllables
are relatively short in duration. Other identified acoustic features include low intensity,
vowel centralization, and consonant weakening, as noted by Chao and by other
researchers (e.g., Shi, 1994). As for normal stress, by Chao’s definition, syllables that
have neither contrastive stress nor weak stress belong to this category.
In addition to this three-way stratification of stress, Chao (1968) further stated that
stress in Mandarin is manifested primarily by pitch range enlargement and duration
lengthening, and only secondarily by loudness. Later studies took up Chao’s idea and
conducted a number of experiments to test his observations. For example, in his
acoustic study of sentence stress in Mandarin, Jin (1996) found that pitch and duration
are indeed the two most relevant correlates of sentence stress, with pitch ranked
higher than duration. A systematic classification of stress levels is found in Pan-Mandarin ToBI,
developed by Peng et al. (2007). Four levels of stress are identified. These four levels
and their descriptions are shown in Table 2.2.
Table 2.2 Relative levels of stress in Pan-Mandarin ToBI (Peng et al., 2007).
Stress    Description
S0        syllable with lexical neutral tone
S1        syllable that has lost its lexical tonal specification (e.g., in a weakly stressed position)
S2        syllable with substantial tone reduction (e.g., undershooting of tonal target with duration reduction)
S3        syllable with fully realized lexical tone
As can be seen, the stratification of stress in Pan-Mandarin ToBI is similar
to that of Chao (1968), except that weak stress is divided into two levels (S0 and S1),
depending on whether the syllable is lexically specified as a neutral tone syllable. In
addition, tonal realization is evidently taken as the sole criterion for
identifying stress in Pan-Mandarin ToBI. Nonetheless, some
studies indicate that although pitch is an important cue for stress, it is not a necessary
one. For instance, Shen (1993) modified the acoustic
parameters of natural speech to test whether stress perception is impaired in the absence of the F0 cue.
Results of perceptual experiments showed that without F0 information, listeners were still
able to identify stress locations. Shen therefore concluded that in
Mandarin, no single cue is indispensable; instead, stress prominence is marked by the
integration of all relevant correlates.