CHAPTER 4 RESULTS
4.2 Valid canonical sibilant tokens
Table 4.2 presents the detailed distribution of all valid canonical sibilant tokens,
further subdivided by region (Taipei/Kaohsiung), gender (male/female), word class
(content/function), stress level (S0/S1/S2/S3), sibilant type (aspirated affricate/
fricative/unaspirated affricate), and place (retroflex/dental). An additional factor
included in the analyses was vowel context, that is, whether the vowel following
the sibilant was rounded or unrounded. As shown in a number of previous studies (e.g.,
Jeng, 2006), a following rounded vowel lowers the frequency of the preceding
sibilant through coarticulation, and this lowering also influences
listeners’ perception (Mann & Repp, 1980). Although vowel context is not the focus of
the present study, its effect on voiceless sibilants is unavoidable, particularly in
acoustic studies, and should not be overlooked. Therefore, vowel context was included as a
factor in our analysis, and all sibilants were categorized by vowel
context as well.
Table 4.2 The overall valid token distribution of (a) Taipei male (b) Taipei female (c) Kaohsiung male and (d) Kaohsiung female groups (R: retroflex; D: dental; c: aspirated affricate; s: fricative; z: unaspirated affricate).
Each panel (a)-(d): unrounded vowel context (c, s, z) and rounded vowel context (c, s, z).
As can be seen from Table 4.2, the distribution of sibilant tokens was imbalanced.
As a result, our analyses were restricted to certain categories when different factors
were examined. In order to understand sibilant realizations as a whole, in our first
analysis, we investigated the effects of region, gender, stress, place, type and context.
Word class, unfortunately, could not be included in this analysis because many cells in the
function word condition lacked sufficient data. Therefore, the first analysis focused on
the content word condition. Furthermore, not all stress categories could
be analyzed because of insufficient sibilant tokens; in this analysis, only the S2
and S3 conditions were compared.
A six-way ANOVA was conducted, with region (Taipei/Kaohsiung), gender
(female/male), stress (S2/S3), place (retroflex/dental), type (c/s/z) and context
(rounded/unrounded) all as between-subject factors. Results showed significant main
effects of five factors [gender: F(1, 10564) = 16.17, p < .001; stress: F(1, 10564) =
62.93, p < .001; place: F(1, 10564) = 546.15, p < .001; type: F(2, 10564) = 4.95, p < .01;
context: F(1, 10564) = 1408.80, p < .001]. Ten two-way interactions were significant,
including region × gender [F(1, 10564) = 54.27, p < .001], region × stress [F(1, 10564)
= 5.02, p < .05], region × place [F(1, 10564) = 5.00, p < .05], region × context [F(1,
10564) = 27.20, p < .001], gender × place [F(1, 10564) = 139.74, p < .001], gender ×
type [F(2, 10564) = 27.20, p < .001], stress × place [F(1, 10564) = 27.72, p < .001],
place × type [F(2, 10564) = 7.43, p < .001], place × context [F(1, 10564) = 27.72, p
< .01], and type × context [F(2, 10564) = 4.73, p < .01]. Additionally, there were six
significant three-way interactions [region × gender × place: F(1, 10564) = 30.21, p
< .001; region × gender × type: F(2, 10564) = 6.97, p < .001; region × stress × place:
F(1, 10564) = 4.23, p < .05; region × type × context: F(2, 10564) = 3.70, p < .05;
gender × type × context: F(2, 10564) = 6.56, p < .005; place × type × context: F(2,
10564) = 7.02, p < .001] and three significant four-way interactions [region × gender ×
place × type: F(2, 10564) = 6.49, p < .005; gender × stress × place × type: F(2, 10564)
= 5.46, p < .005; gender × place × type × context: F(2, 10564) = 7.09, p < .001]. Finally,
one significant five-way interaction was reported: region × gender × stress × place ×
type [F(2, 10564) = 3.45, p < .05].
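The degrees of freedom reported above can be related to the size of the design with simple bookkeeping. The sketch below assumes a full factorial between-subject model in which every cell is populated (an assumption; the text does not state the model specification). Under that assumption the error df equals the token count minus the number of cells, and the df of any effect is the product of (levels - 1) over the factors involved. The function names are illustrative only.

```python
from math import prod

def n_cells(levels):
    """Number of cells in a full factorial design."""
    return prod(levels.values())

def effect_df(factors, levels):
    """Numerator df of a main effect or interaction: product of (levels - 1)."""
    return prod(levels[f] - 1 for f in factors)

def implied_n(df_error, levels):
    """Token count implied by the error df, ASSUMING a full factorial
    model with all cells populated: df_error = N - number of cells."""
    return df_error + n_cells(levels)

# Factor levels as listed in the text for the six-way ANOVA.
six_way = {"region": 2, "gender": 2, "stress": 2,
           "place": 2, "type": 3, "context": 2}

print("cells:", n_cells(six_way))
print("implied tokens:", implied_n(10564, six_way))
print("df for type:", effect_df(["type"], six_way))
print("df for the five-way interaction:",
      effect_df(["region", "gender", "stress", "place", "type"], six_way))
```

Because type is the only factor with three levels, every effect that involves type has 2 numerator df and every other effect has 1, which matches the F(2, 10564) and F(1, 10564) values reported above.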
Figure 4.9 presents the centroid frequency of all canonically realized sibilant
tokens in different conditions; error bars represent standard error. As can be
seen, the stress effect was realized differently across the three sibilant types. For aspirated
affricate sibilants (c type), Kaohsiung speakers generally distinguished
retroflex and dental sibilants better in the S3 condition by realizing dental sibilants with higher
frequency [t(240) = -4.42, p < .001]. Regardless of regional differences, a significant stress effect
was found for female speakers. Specifically, dental sibilants were also realized
higher in frequency [t(225) = -4.47, p < .001]. As for fricative sibilants (s type),
female speakers differed in how the retroflex/dental contrast was made. For Taipei female
speakers, the contrast between retroflex and dental sibilants did not differ across stress
conditions, whereas Kaohsiung female speakers contrasted retroflex and dental sibilants
better in the S3 condition; a post hoc independent t test showed that S3 dental sibilants
were realized significantly higher than S2 ones [t(130) = -3.33, p < .001]. The stress
patterns, on the other hand, were more similar for male speakers. In particular, dental
and retroflex sibilants were both realized higher in centroid frequency in the S3
condition [dental: t(420) = -4.99, p < .001; retroflex: t(2274) = -4.99, p < .05], but the
contrast was nonetheless larger in that condition. With regard to unaspirated affricate sibilants (z type), female
speakers in general exhibited greater distinction in S3 than in S2. In particular, dental
sibilants were realized higher and retroflex sibilants were realized lower in the S3
condition [dental: t(372) = -3.02, p < .01; retroflex: t(565) = -4.46, p < .001]. As for
male speakers, the stress effect was not significant.
Figure 4.9 The mean centroid frequency of three sibilant types (c: aspirated affricate; s:
fricative; z: unaspirated affricate) in the S2 and S3 conditions for speakers of both genders (F: females; M: males) from Taipei and Kaohsiung.
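The standard error shown by the error bars and the independent t tests reported above can be sketched as follows. The sketch assumes a pooled-variance (Student's) t test, which is consistent with the integer df values in the text but is not stated there; the centroid values are hypothetical and serve only to make the example runnable.

```python
from math import sqrt
from statistics import mean, stdev, variance

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def pooled_t(a, b):
    """Student's independent-samples t statistic and its df (n_a + n_b - 2).

    ASSUMPTION: equal variances are pooled; the source does not say
    which variant of the t test was used.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

# Hypothetical centroid frequencies in Hz (illustrative only, not study data).
s2_dental = [6100.0, 6300.0, 6250.0, 6400.0, 6150.0]
s3_dental = [6600.0, 6750.0, 6500.0, 6900.0, 6700.0]

t, df = pooled_t(s2_dental, s3_dental)
print(f"S2 mean = {mean(s2_dental):.0f} Hz, SE = {standard_error(s2_dental):.1f}")
print(f"S3 mean = {mean(s3_dental):.0f} Hz, SE = {standard_error(s3_dental):.1f}")
print(f"t({df}) = {t:.2f}")
```

With the S2 group passed first, a higher S3 mean yields a negative t. This is one possible reading of the negative t values reported for the stress comparisons, not a confirmed description of how they were computed.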
Cross-regional comparisons also revealed several interesting interaction effects. In
particular, for female speakers, Taipei female speakers had significantly lower retroflex
sibilants than Kaohsiung female speakers in the S2 condition [t(2671) = -12.45, p
< .001], but these two groups did not differ in sibilant realizations in the S3 condition.
The opposite pattern was observed for male speakers. Regardless of stress condition,
Kaohsiung male speakers made larger sibilant contrasts than Taipei male speakers by
having lower retroflex sibilants in all three sibilant types [c: t(1022) = -8.68, p < .001; s:
t(2274) = -11.44, p < .001; z: t(1214) = -11.34, p < .001]. The effect of sibilant type was
found particularly for Taipei male speakers. Specifically, dental sibilants were
significantly higher than retroflex ones for aspirated affricates (c type) [t(766) = -3.26, p
< .005]; retroflex and dental sibilants did not differ significantly for fricatives (s type);
and retroflex sibilants were slightly higher than dental ones for unaspirated
affricates (z type) [t(1015) = -1.98, p < .05].
Moreover, gender was also found to interact with the other
factors. For Taipei speakers, across sibilant types, female speakers had both higher
dental sibilants and lower retroflex sibilants than male speakers in the S2 condition
[dental: t(1148) = 2.52, p < .05; retroflex: t(3416) = -27.54, p < .001]. As for Kaohsiung
speakers, female speakers still made a larger contrast, but both their retroflex and dental
sibilants were higher in centroid frequency than male speakers’ [dental: t(1122) = 3.63,
p < .001; retroflex: t(3024) = 2.07, p < .05]. In the S3 condition, however, sibilant type
came into play, particularly for Taipei speakers. For aspirated affricates (c type) and
fricatives (s type), females differed from males only in having significantly lower
retroflex sibilants [c:
t(166) = -5.41, p < .001; s: t(335) = -8.85, p < .001]. As for unaspirated affricates (z
type), females made a larger contrast than males by having both higher dental sibilants
and lower retroflex sibilants [dental: t(112) = 3.24, p < .005; retroflex: t(227) = -6.32, p
< .001]. On the other hand, no gender interaction was reported for Kaohsiung speakers
in the S3 condition, indicating that the sibilant contrast made by Kaohsiung female
speakers did not differ significantly from that made by Kaohsiung male speakers.
In the six-way ANOVA, vowel context was shown to interact
with gender, place, and type. As shown in Figure 4.10, when followed by rounded
vowels, the centroid frequency of preceding sibilants was significantly lowered. Female
speakers showed a larger sibilant contrast in the unrounded condition than in
the rounded condition, marginally for aspirated affricates (p = .054) and significantly for
fricatives (p < .001).
The same trend was observed for male speakers (c: p < .001; s: p < .005). The
lack of a larger contrast in the unrounded vowel context for unaspirated affricates (z type)
was due mainly to the fact that, in the unrounded vowel context, retroflex
unaspirated affricates had higher centroid frequency than the other sibilant types, for
both female speakers (c: p < .005; s: p < .001) and male speakers (c: p < .001; s: p
< .01).
Moreover, female speakers significantly distinguished retroflex and dental sibilants
for all sibilant types and in both vowel contexts. Male speakers, by contrast,
made all sibilant contrasts except one: no
significant difference between retroflex and dental sibilants was reported for aspirated
affricates (c type) in the rounded vowel context.
Figure 4.10 The mean centroid frequency of three sibilant types (c: aspirated affricate;
s: fricative; z: unaspirated affricate) followed by rounded and unrounded vowels of both female and male speakers.
4.3 Word class
Our analyses so far were limited to content word data, and content words and
function words have not yet been compared. This
section therefore focuses on the word class effect. Again, due to the limitation of data
We first analyzed the data of S2 unaspirated affricates (z type) in the unrounded
vowel context. A four-way ANOVA with region (Taipei/Kaohsiung), gender
(female/male), word class (content/function) and place (retroflex/dental) as
between-subject factors was conducted. Results showed significant main effects of all
four factors [region: F(1, 3456) = 11.78, p = .001; gender: F(1, 3456) = 4.75, p < .05;
word class: F(1, 3456) = 23.52, p < .001; place: F(1, 3456) = 229.60, p < .001]. Four
two-way interactions were significant, including gender × place [F(1, 3456) = 80.75, p
< .001], region × gender [F(1, 3456) = 124.77, p < .001], region × word class [F(1,
3456) = 3.87, p < .05] and place × word class [F(1, 3456) = 4.48, p < .05]. One
three-way interaction was reported [region × gender × place: F(1, 3456) = 68.77, p
< .001].
As shown in Figure 4.11, retroflex and dental sibilants were distinguished in both
content word and function word conditions [content: t(1989) = -12.61, p < .001;
function: t(1479) = -7.98, p < .001]. Moreover, for speakers from both regions, it was a
general trend that both retroflex and dental sibilants in the content word condition were
realized higher in frequency than those in the function word condition [retroflex: t(2235)
= 4.25, p < .001; dental: t(1233) = 5.48, p < .001]. As can be observed, the distinction
between retroflex and dental sibilants was greater in the content word condition.
Figure 4.11 The mean centroid frequency of content and function word unaspirated
affricates (z type) in the unrounded vowel context of speakers from Taipei and Kaohsiung.
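The df values reported for this four-way ANOVA and for the follow-up t tests can be cross-checked against one another. The sketch below assumes that all 16 cells of the design are populated and that the t tests are two-group tests with df = n1 + n2 - 2; neither point is stated in the text. Under those assumptions, all three routes should imply the same total token count.

```python
def n_from_t_df(df):
    """Tokens in a two-group pooled t test: df = n1 + n2 - 2."""
    return df + 2

# Error df and t-test df values as reported in the text.
anova_error_df = 3456
cells = 2 * 2 * 2 * 2  # region x gender x word class x place

# ASSUMPTION: full factorial model with all 16 cells populated.
n_anova = anova_error_df + cells

# Retroflex vs. dental within each word class: content + function tokens.
n_by_word_class = n_from_t_df(1989) + n_from_t_df(1479)

# Content vs. function within each place: retroflex + dental tokens.
n_by_place = n_from_t_df(2235) + n_from_t_df(1233)

print(n_anova, n_by_word_class, n_by_place)  # prints: 3472 3472 3472
```

The three totals agree, so the reported df values are internally consistent with a data set of 3,472 tokens under the stated assumptions.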
Our second analysis examined the data of S2 fricatives (s type) in the rounded
vowel context. Figure 4.12 shows the realizations of retroflex and dental sibilants in the
content word and function word conditions of the four speaker groups. As can be seen,
particularly for Taipei male speakers, retroflex sibilants were realized considerably higher than
dental sibilants in the function word condition. A closer examination of the distribution
of Taipei males’ sibilant tokens revealed that such a phenomenon actually resulted from
speaker variability. Table 4.3 gives the token number and the percentage of S2 fricatives
(s type) in the rounded vowel context contributed by each Taipei male speaker.
Specifically, in the function word condition, retroflex sibilant tokens were contributed
mainly by speaker CZX, while dental sibilants were contributed mostly by speakers
HSK and YYS. Figure 4.13 shows the mean frequency range of each Taipei male
speaker. As can be seen, the frequency ranges of CZX and JXW are about 2500 Hz
condition were mostly contributed by CZX, the mean centroid frequency was thus high.
On the other hand, HSK and YYS contributed about 85% of dental sibilant tokens in the
function word condition, thus leading to low centroid frequency. In this regard, when
analyzing S2 fricatives (s type) in the rounded vowel context, we only included data of
HSK and YYS in the function word condition for Taipei male speakers, owing to the
fact that these two speakers had more comparable frequency ranges and the total
retroflex and dental sibilant tokens of these two speakers were sufficient for analyses.
Figure 4.12 The mean centroid frequency of content and function word fricatives (s
type) in the rounded vowel context of both female and male speakers from Taipei and Kaohsiung.
Table 4.3 The number and percentage (in parentheses) of S2 fricative (s type) sibilant tokens in the rounded context contributed by each Taipei male speaker.
                       Speaker
Word class  Place      CZX        HSK       JXW       YYS
Content     Retroflex  108 (38%)  33 (12%)  65 (23%)  76 (27%)
Content     Dental     24 (32%)   6 (8%)    15 (20%)  30 (40%)
Function    Retroflex  64 (65%)   1 (1%)    13 (13%)  21 (21%)
Function    Dental     9 (12%)    30 (41%)  2 (3%)    33 (45%)
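The percentages in Table 4.3 are each speaker's share of the row total, and the "about 85%" figure cited for HSK and YYS follows from the function-word dental row. A minimal sketch that recomputes these shares from the token counts in the table is given below; the variable and function names are illustrative.

```python
# Token counts transcribed from Table 4.3 (Taipei male speakers).
counts = {
    ("content", "retroflex"): {"CZX": 108, "HSK": 33, "JXW": 65, "YYS": 76},
    ("content", "dental"): {"CZX": 24, "HSK": 6, "JXW": 15, "YYS": 30},
    ("function", "retroflex"): {"CZX": 64, "HSK": 1, "JXW": 13, "YYS": 21},
    ("function", "dental"): {"CZX": 9, "HSK": 30, "JXW": 2, "YYS": 33},
}

def shares(row):
    """Each speaker's percentage of the row total, rounded to an integer."""
    total = sum(row.values())
    return {spk: round(100 * n / total) for spk, n in row.items()}

for (word_class, place), row in counts.items():
    print(word_class, place, shares(row))

# Combined share of HSK and YYS in the function-word dental row.
func_dental = counts[("function", "dental")]
hsk_yys = (func_dental["HSK"] + func_dental["YYS"]) / sum(func_dental.values())
print(f"HSK + YYS share of function-word dental tokens: {hsk_yys:.0%}")
```

The recomputed shares reproduce the table, including the 65% of function-word retroflex tokens contributed by CZX and the combined 85% of function-word dental tokens contributed by HSK and YYS.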
Figure 4.13 The mean centroid frequency of content and function word fricatives (s
type) in the rounded vowel context of each Taipei male speaker.
Figure 4.14 presents the revised data after excluding speakers CZX and JXW from
the Taipei male group. A four-way ANOVA with region (Taipei/Kaohsiung), gender
(female/male), word class (content/function) and place (retroflex/dental) as
between-subject factors was carried out. The main effects of region and place were
significant [region: F(1, 1413) = 43.24, p < .001; place: F(1,1413) = 130.40, p < .001].
There were two two-way interactions, including region × gender [F(1, 1413) = 18.00, p
interaction was reported [region × gender × place: F(1, 1413) = 7.56, p < .01]. None of
the four speaker groups showed significant effects of word class.
Figure 4.14 The mean centroid frequency of content and function word fricatives (s
type) in the rounded vowel context, with speakers CZX and JXW excluded from the Taipei male group.