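National Science Council, Executive Yuan: Research Project Report
The Effects of Region and Speech Style on the Tone 1 Register of Taiwan Mandarin
Research report (abridged version)
Project type: Individual
Project number: NSC 95-2411-H-002-046-
Project period: August 1, 2006 to September 30, 2007
Institution: Graduate Institute of Linguistics, National Taiwan University
Principal investigator: 馮怡蓁
Project participants: Doctoral student (part-time assistant): 陳萱芳
Master's students (part-time assistants): 林欣怡, 黃宜萱, 莊育穎, 吳怡臻
Attachments: Report on attendance at an international conference and the paper presented
Availability: This project report is open to public access
December 27, 2007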
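National Science Council, Executive Yuan: Funded Research Project
■ Final report
□ Interim progress report
The Effects of Region and Speech Style on the Tone 1 Register of Taiwan Mandarin
Project type: ■ Individual project □ Integrated project
Project number: NSC 95-2411-H-002-046-
Project period: August 1, 2006 to September 30, 2007
Principal investigator: 馮怡蓁
Co-principal investigator: None
Project participants: 陳萱芳, 林欣怡, 黃宜萱, 莊育穎, and 吳怡臻
Report type (submitted according to the approved funding list): ■ Abridged report □ Full report
This report includes the following required attachments:
□ One report on overseas travel or training
□ One report on travel or training in Mainland China
■ One report on attendance at an international academic conference and one copy of the paper presented
□ One overseas research report for an international collaborative project
Availability: Open to public access immediately, except for industry-academia collaborative
projects, projects for upgrading industrial technology and talent cultivation, controlled
projects, and the following cases:
□ Involves patents or other intellectual property rights; open to public access after □ one year □ two years
Institution: Graduate Institute of Linguistics, National Taiwan University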
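I. Chinese Abstract
This project investigates how dialectal differences affect tonal register and declination in
Taiwan Mandarin. We recruited 12 speakers, six from the Northern dialect area (the "standard"
dialect) and six from the Central dialect area (the "nonstandard" dialect), for a reading task.
Tone 1 target syllables were embedded in carrier sentences in three sentence positions:
initial, medial, and final. The dependent measure was the F0 maximum of the target syllable.
Results showed that the Tone 1 register was higher in the Northern dialect than in the Central
dialect, and that this effect was more pronounced in males than in females. The magnitude of
declination was also affected by gender and dialect. Central male speakers had milder
declination slopes than Northern males, whereas Central female speakers tended to raise the
initial pitch, making their declination slopes steeper than those of Northern females. This
suggests that female speakers are more aware than male speakers of the pronunciation of their
own dialect and are therefore more likely to hypercorrect their nonstandard accent by using a
larger pitch range.
Keywords: Guoyu, tonal register, Tone 1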
II. English Abstract
This study investigated how dialectal variations can influence the realization of tonal
register and declination pattern in Taiwan Mandarin. Twelve speakers, six from the Northern
dialect (standard) and six from the Central dialect (nonstandard), were recruited for a reading
task. Target syllables of Tone 1 (T1) were embedded in a carrier sentence in three different
sentential positions, initial, medial, and final. The dependent measure was the F0 maximum of
the target syllables. Results showed that the Northern dialect had higher T1 register than the
Central dialect, with the effect being more prominent in males than females. The magnitude
of declination was also a function of gender and dialect. Central male speakers produced
milder declination slopes than their Northern counterparts while Central female speakers
tended to have steeper slopes instead by raising the initial starting pitch, indicating that in the
Central dialect, female speakers were more aware of their vernacular speech status compared
to male speakers and tended to hypercorrect their nonstandard accent by using a larger pitch
range.
Keywords: Guoyu, tonal range, Tone 1
III. Background
The linguistic situation in Taiwan is anything but monolithic. Although the official
language is Mandarin, nearly 80% of the population is ethnically Min and speaks a variety
of Southern Min (Chen, 1989; Cheng, 1985). Therefore, Min acts as a powerful substrate
language for Taiwan Mandarin.
Both Mandarin and Min are tone languages. Mandarin has four lexical tones, which are
high level (Tone 1), mid dipping (Tone 2), low dipping/falling (Tone 3), and high-falling
(Tone 4) (Chao, 1968; Fon & Chiang, 1999; Fon, Chiang, & Cheung, 2004), while Min has
five long tones, which are high level (Tone 1), high falling (Tone 2), low falling (Tone 3),
mid dipping (Tone 5), and mid level (Tone 7). Although the two languages have a
comparable number of tones and comparable tonal categories, the tonal range used in Min is
somewhat lower than that in Mandarin (Chen, 2005; Hsu, 2006).
Due to the Mandarin-only language policy enforced by the government between 1945
and 1990, the relative statuses of the two languages have become unequal (Huang, 1993).
Mandarin is promoted as a high language and is used extensively in public domains and
formal contexts while Min is demoted as a low language and is often limited to private
domains and informal contexts. In addition, the degree of bilingual proficiency is also
geographically imbalanced. As Taipei is the political and economic center of the country,
more people became monolingual Mandarin speakers or unequal Mandarin-Min bilingual
speakers who are dominant in Mandarin. On the other hand, in cities farther
south, more people are balanced bilingual speakers of Mandarin and Min, or even
unequal bilinguals who are dominant in Min.
Therefore, in this study, we would like to investigate whether there is any difference in
the tonal range by Mandarin speakers of different dialects. Specifically, the Northern dialect,
which is the standard variety, and the Central dialect, a nonstandard variety, were chosen.
The Northern dialect is spoken in the Taipei Metropolitan area while the Central dialect is
spoken in the Taichung Metropolitan area.
IV. Aims
There are three specific aims in this study. First of all, we would like to explore possible
dialectal differences in tonal register. If the low tonal range of Min could be carried over to
that of Mandarin, one would expect this effect to be stronger in the Central dialect than the
Northern dialect, as there are more fluent Min speakers in the former area than the latter.
Therefore, the tonal register of the Central dialect should be lower than that of the Northern
dialect.
Secondly, if tonal register is indeed lower in nonstandard varieties, then one would
suspect this difference to also influence the declination pattern, since declination exerts
differential effects on high and low tonal targets, being more prominent on the former than
the latter (Ladd, 1988). In other words, one would expect the topline declination of the
Northern variety to be steeper than that of the Central variety, since the latter has a more
restricted and lower tonal register.
Thirdly, we would like to explore whether gender would also influence tonal realization
with regard to register and declination patterns in the two dialects. Sociolinguistic studies
have demonstrated that women on average are more likely to conform to standard linguistic
forms than men (Trudgill, 2000). If so, we would expect women in the Central dialect to be
closer to those in the Northern variety than their male counterparts.
V. Literature review
In the early 1970s, acoustic measurements of the four tones in Mandarin were fairly
consistent with their names (Howie, 1976). Tone 1 (T1), a high-level tone, averaged
150 Hz; Tone 2 (T2), a high-rising tone, ended at about 150 Hz; Tone 3 (T3), a
low-dipping tone, started at around 135 Hz; and Tone 4 (T4), a high-falling tone, started at
about 150 Hz.¹ In other words, the acoustic measurements matched the descriptive names well:
T4 started high, at a pitch height equivalent to the average pitch height of T1 and the
final pitch height of T2.
However, Shih’s (1988) results suggested that the match between acoustic
measurements and descriptive names had changed.² Of the four tones in Mandarin,
the onset of T4 was the highest, with an average of 290 Hz, while T1 and T2, although termed
high-level and high-rising, respectively, did not live up to their names. The average pitch
height of T1 was only 260 Hz, and the final pitch height of T2 was only 210 Hz. At around the
same time, Tseng (1990) also conducted a series of studies on Mandarin tones. Results showed
that the beginning of T4 and the end of T2 were about 245 Hz and 250 Hz, respectively.³ On the
other hand, the average pitch of T1 was 215 Hz and the beginning pitch of T3 was 135 Hz. Fon
(1997) also studied a female Min-origin speaker from Taipei whose native language was Mandarin
and found that the beginning pitch of T4 was still the highest, around 255 Hz, while the average
of T1 was 240 Hz, the ending pitch of T2 was 220 Hz, and the beginning pitch of T3 was 220 Hz.
¹ Howie (1976) studied a male speaker.
² Shih (1988) studied a female speaker.
From the above studies, one can conclude that within ten to twenty years after Howie’s (1976)
study, the tonal range of high tones in Mandarin, such as T1 and T2, had shifted lower.
However, individual differences still exist. Therefore, we have suggested that revision should
be made for Taiwan Mandarin using Chao’s five-point tone letter system (1956; 1968).
Instead of the original 55, 35, 214, and 51, it should become 44, 323, 312, and 42, to match
the acoustic realization (Fon & Chiang, 1999; Fon, 1997).
Since acoustic measurements are very time-consuming, not many studies have included large
amounts of data to compare tonal range variations in the Mandarin tonal system.⁴ Fon and
Hsu (2007) studied five young Min-origin Mandarin speakers (ages between 20 and 25) in a
reading task to compare T2 and T3 variations in various syllable structures and sentence
positions. Results showed that aside from sentence-final accented positions in which the
ending pitch of T2 is somewhat higher than the initial of T3 (110Hz : 100Hz), tones in
isolation and in other positions have comparable pitch height at these two reference points
(around 120 Hz), indicating that the ending pitch of T2 has already lowered to the mid pitch
range, and there is a need to revise the descriptive name of “high-rising” tone. S. Hsu (2006)
collected 24 college students’ read speech, and also found that Mandarin-Min bilingual
speakers tend to have lower tonal range for T2 than Mandarin monolingual speakers. The
former group is inclined to realize T2 in the low pitch range, accounting for 21% of its total
T2, while the latter group only realized 11% of T2 in the low pitch range.
In addition, H. Hsu (2006) also recorded 48 Mandarin speakers’ read speech in order to
investigate the relationship among T4 range variation, age, and ethnic group. Results showed
that in general, older speakers (ages 45 to 54) have wider tonal ranges than younger ones
(ages 15 to 24) (40 Hz : 18 Hz). Interestingly, there are also variations among the older
speakers. Those whose parents are ethnically non-Min, or whose father is non-Min and
mother is Min tend to have a wider tonal range than those whose parents are both Min (45
Hz : 33 Hz). However, such an interaction effect was not found in the young group,
indicating that the ethnic difference in Mandarin tones is diminishing.
VI. Methods
1. Subjects
Twelve participants between ages 19 and 24 took part in this study. Half of them were
from the Taipei Metropolitan area (the Northern dialect), and half of them were from the
Taichung Metropolitan area (the Central dialect). Within each dialect group, there was an
equal gender split. All of the subjects were Mandarin-Min bilinguals that were ethnically Min,
but the Taipei speakers could not speak Min fluently.
2. Stimuli
Twenty-seven Tone 1 (T1) syllables representative of Mandarin phonotactics were
chosen as stimuli, including 6 voiceless obstruent-initials (e.g., han1 [xan] ‘charmingly
naive’), 15 sonorant-initials (e.g., la1 [la] ‘pull’), and 6 vowel-initials (e.g., wu1 [u] ‘black’).
T1 syllables were chosen because they contain only high tonal targets and thus potential
register and declination differences could be most clearly observed without being clouded by
the presence of low tones. The relatively high pitch also guarantees modal voice and thus
successful pitch extraction. Syllables were placed in three comparable carrier sentences, so
that they occurred in sentence-initial, -medial, and -final positions (Table 1). Carrier
sentences were designed so that syllables immediately before the target in medial and final
positions ended mid, which would also be the starting point for the initial position according
to the PENTA model (Xu, 2005). In total, 27 (stimuli) × 3 (positions) = 81 sentences were
recorded.
Table 1: Carrier sentences used in this study.
Position   Carrier sentence                      Gloss
Initial    ‘X’ zhe4 ge0 zi4 hen3 nan2 nian4.     ‘X’ this word is very hard to read.
Medial     zhe4 ge0 ‘X’ zi4 hen3 nan2 nian4.     This word ‘X’ is very hard to read.
Final      zhe4 ge0 zi4 shi4 nian4 ‘X’.          This syllable indeed reads ‘X’.
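The 27 × 3 design described above can be sketched in a few lines of Python. This is only an illustration: the carrier frames are transcribed from Table 1, but the target labels are hypothetical placeholders, and the plain seeded shuffle stands in for the semi-randomization procedure, whose actual constraints are not given in the report.

```python
import random

# Carrier frames transcribed from Table 1; "{X}" marks the target slot.
CARRIERS = {
    "initial": "{X} zhe4 ge0 zi4 hen3 nan2 nian4",
    "medial": "zhe4 ge0 {X} zi4 hen3 nan2 nian4",
    "final": "zhe4 ge0 zi4 shi4 nian4 {X}",
}


def build_sentences(targets, seed=0):
    """Cross every target with every position and shuffle the result."""
    items = [
        (target, position, frame.format(X=target))
        for target in targets
        for position, frame in CARRIERS.items()
    ]
    # Assumption: a plain seeded shuffle; the real semi-randomization
    # constraints are not specified in the report.
    random.Random(seed).shuffle(items)
    return items


# Hypothetical placeholder labels standing in for the 27 Tone 1 syllables.
targets = [f"syl{i:02d}" for i in range(1, 28)]
sentences = build_sentences(targets)
print(len(sentences))  # 81
```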
3. Equipment
Recordings were done using a SONY PCM-M1 Digital Audio Recorder with Maxell
R-64 DA 60 min DAT tapes and a SHURE SM10A head-mounted microphone.
4. Procedure
The experiment was conducted in a quiet room. Speakers were asked to read aloud the
semi-randomized stimuli using natural intonation at a normal rate. The whole process took
about 15 minutes. The original recordings had a sampling rate of 48 kHz and were
subsequently downsampled to 16-bit, 22050 Hz using Adobe Audition 1.5.
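As a small worked example of the rate conversion, the sketch below computes the rational resampling factor between the two rates, assuming the target rate is 22,050 Hz. The function name is hypothetical, and the sketch says nothing about the filter that the audio editor actually applied.

```python
from fractions import Fraction


def resampling_ratio(source_hz, target_hz):
    """Return (up, down) such that target_hz == source_hz * up / down."""
    ratio = Fraction(target_hz, source_hz)
    return ratio.numerator, ratio.denominator


up, down = resampling_ratio(48_000, 22_050)
print(up, down)  # 147 320
```

In other words, every 320 input samples correspond to 147 output samples.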
5. Analysis
The recordings were hand-labeled using Praat 4.6 (Boersma & Weenink, 2007). A Praat
script was written for automatic pitch extraction on the voiced portion of the syllable, which
is considered the measurable domain for tones (Chao, 1956; 1968; Wang, 1967; Xu, 1998).
For obstruent-initial syllables, the starting point of a tone was determined by the onset of the
voice bar after the obstruent, which was voiceless in this study. For the rest, the starting point
began from the onset of the syllable, as the whole syllable was voiced. The ending point was
always the offset of the voice bar. Occasional syllable-initial or -final glottalized portions
caused by voice fry were not included for pitch extraction. Extracted pitch tracks were later
hand-checked and hand-corrected for doubling and halving through pitch period calculation,
and were interpolated and smoothed using Praat functions afterwards. A second Praat script
was written to extract the F0 maximum for the target stimuli and the syllable zhe4 ‘this’, which,
being a high-falling tone, acted as a reference point for the sentence.
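The checking and extraction steps can be illustrated with a short Python sketch. It is not the Praat script used in the study: the function names, the 20% tolerance, and the pitch values are all assumptions made for illustration. The first function flags frames where F0 roughly doubles or halves relative to the previous frame, and the second returns the F0 maximum within a labeled interval.

```python
def flag_octave_jumps(f0_track, tolerance=0.2):
    """Return indices where F0 roughly doubles or halves relative to the previous frame."""
    flagged = []
    for i in range(1, len(f0_track)):
        ratio = f0_track[i] / f0_track[i - 1]
        near_double = abs(ratio - 2.0) <= tolerance * 2.0
        near_half = abs(ratio - 0.5) <= tolerance * 0.5
        if near_double or near_half:
            flagged.append(i)
    return flagged


def f0_maximum(times, f0_track, start, end):
    """Return the highest F0 value among frames with start <= time <= end."""
    inside = [f for t, f in zip(times, f0_track) if start <= t <= end]
    if not inside:
        raise ValueError("no pitch frames inside the labeled interval")
    return max(inside)


# Hypothetical pitch track (seconds, Hz) for one labeled syllable.
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
f0 = [210.0, 214.0, 431.0, 218.0, 215.0, 209.0]

suspect = flag_octave_jumps(f0)
print(suspect)  # [2, 3]: the jump up into frame 2 and the jump back down after it

f0[2] /= 2  # manual correction of the doubled frame
print(f0_maximum(times, f0, 0.0, 0.05))  # 218.0
```

Flagging rather than automatically correcting mirrors the hand-checking described above, where each suspect frame is inspected before it is changed.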
VII. Results
Subjects were asked to rate their Mandarin and Min fluency on a scale of 1 to 7. Since
Mandarin was their native and most frequently used language, Min fluency was measured by
dividing Min fluency scores by Mandarin fluency scores. As shown in Figure 1, Taipei
speakers in general had lower degrees of Min fluency than Taichung speakers, with female
speakers being even less fluent than males. On the other hand, not much gender difference
was found in Taichung speakers. Both males and females were fairly fluent in both
languages.
Figure 1: A bar graph of self-rated Min fluency relative to subjects’ Mandarin fluency.
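The relative fluency measure is a simple ratio, sketched below in Python. The function name and the example ratings are hypothetical and are not data from the study.

```python
def relative_min_fluency(min_score, mandarin_score):
    """Min self-rating divided by Mandarin self-rating, both on a 1-7 scale."""
    for score in (min_score, mandarin_score):
        if not 1 <= score <= 7:
            raise ValueError("ratings must lie on the 1-7 scale")
    return min_score / mandarin_score


# Hypothetical ratings, not data from the study.
print(relative_min_fluency(3, 6))  # 0.5
print(relative_min_fluency(7, 7))  # 1.0
```

A value of 1.0 corresponds to a speaker who rates both languages equally, and values below 1.0 indicate Mandarin dominance.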
1. F0 maximum of zhe4 and the target syllable
Figure 2 shows scatter plots of the F0 maximum of zhe4 and the target syllables for the
two dialects of Mandarin in sentence-initial, -medial, and -final positions. The cluster in the
upper right corner of each graph represents female data and the cluster in the lower left corner
represents male data. As can be seen from the figure, Taichung male speakers had lower F0
maxima than their Taipei counterparts for both zhe4 and target syllables, indicating an overall
lowering in the tonal register. For female speakers, on the other hand, the distinction was not
as clear. The overall pitch range for zhe4 was fairly similar for the two dialect groups, with
Taichung speakers being even slightly higher than Taipei ones. This was especially
prominent in sentence-medial positions. However, for target syllables, Taipei speakers were
still higher in pitch than those of Taichung.
A Gender (2) × Dialect (2) × Position (3) × Syllable (zhe4 vs. target) four-way mixed
ANOVA was performed to confirm the above observations. The between-subjects factors were
Gender and Dialect. Results showed that all of the main effects were significant [Gender:
F(1, 318) = 18.94, p < .0001, η̂² = .06; Dialect: F(1, 318) = 18.68, p < .0001, η̂² = .06;
Position: F(2, 636) = 172.60, p < .0001, η̂² = .35; Syllable: F(1, 318) = 1498.8, p < .0001,
η̂² = .83]. Two-way interactions were also significant [… p < .0001, η̂² = .03; Position ×
Dialect: F(2, 636) = 4.79, p < .01, η̂² = .02; Syllable × Gender: F(1, 318) = 348.59,
p < .0001, η̂² = .52; Syllable × Dialect: F(1, 318) = 33.72, p < .0001, η̂² = .10; Position ×
Syllable: F(1.97, 624.89) = 665.60, p < .0001, η̂² = .68]. In addition, all of the three-way
interactions involving Syllable were significant [Syllable × Gender × Dialect: F(1, 318) =
68.42, p < .0001, η̂² = .18; Syllable × Gender × Position: F(1.97, 624.89) = 128.69,
p < .0001, η̂² = .29; Syllable × Dialect × Position: F(1.97, 624.89) = 15.45, p < .0001,
η̂² = .05]. Finally, the four-way interaction was also significant [F(1.97, 624.89) = 29.90,
p < .0001, η̂² = .09].
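As a reading aid for the effect sizes, the sketch below recovers an effect size from an F ratio and its degrees of freedom. It assumes that the reported η̂² values are partial eta squared, which the report does not state explicitly; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported in the text above.
print(round(partial_eta_squared(18.68, 1, 318), 2))   # Dialect: 0.06
print(round(partial_eta_squared(172.60, 2, 636), 2))  # Position: 0.35
print(round(partial_eta_squared(348.59, 1, 318), 2))  # Syllable x Gender: 0.52
```

Under that assumption, the three values agree with the effect sizes reported for Dialect, Position, and Syllable × Gender.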
Post hoc independent t-tests regarding the Dialect effect indicated that the F0 maximum
of Taipei male speakers was significantly higher than their Taichung counterparts across both
syllable types and sentential positions (p < .0001). However, this was not the case for female
speakers. In the initial position, there was no difference between Taipei and Taichung
speakers in zhe4 or the target syllables. In the medial and final positions, target syllables of
Taipei speakers were significantly higher than those of Taichung (Medial: p < .0001; Final:
p < .05), while for zhe4, Taichung speakers were in turn higher than Taipei speakers (Medial:
p < .0001; Final: p < .05).
2. Topline declination
Figure 3 shows the topline declination pattern of male and female speakers in the two
dialect groups. For male speakers, the declination trend was steeper in the Northern dialect in
sentence-medial and -final positions. Female speakers demonstrated an exactly opposite trend,
with the Central dialect showing a steeper topline than the Northern variety, especially in
sentence-medial positions.
Figure 2: Scatter plots of F0 maxima of zhe4 and target syllables in sentence (a) -initial,
(b) -medial, and (c) -final positions.
Figure 3: Mean F0 maxima of zhe4 and target syllables of (a) male and (b) female speakers.
In order to confirm what was observed above, a Gender (2) × Dialect (2) × Position (3)
three-way mixed ANOVA was conducted on degree of declination, defined by the F0 maximum
difference between the reference zhe4 and the target syllable. Results showed that all of the
main effects were significant [Gender: F(1, 318) = 348.91, p < .0001, η̂² = .52; Dialect:
F(1, 318) = 33.59, p < .0001, η̂² = .10; Position: F(1.97, 625.11) = 665.13, p < .0001,
η̂² = .68]. All of the two-way interactions were also significant [Position × Gender:
F(1.97, 625.11) = 128.56, p < .0001, η̂² = .29; Position × Dialect: F(1.97, 625.11) = 15.52,
p < .0001, η̂² = .05; Gender × Dialect: F(1, 318) = 68.61, p < .0001, η̂² = .18]. The three-way
interaction was significant as well [F(1.97, 625.11) = 19.92, p < .0001, η̂² = .09].
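The declination measure defined above can be sketched as follows. The sign convention (reference minus target, so that a positive value is a drop) is an assumption, the function names are hypothetical, and the values are invented for illustration.

```python
from statistics import mean


def declination_drop(f0max_reference, f0max_target):
    """F0 drop in Hz from the reference syllable to the target syllable."""
    return f0max_reference - f0max_target


def mean_drop_by_position(tokens):
    """Average the drop within each sentence position.

    tokens: iterable of (position, f0max_reference, f0max_target) tuples.
    """
    drops = {}
    for position, reference, target in tokens:
        drops.setdefault(position, []).append(declination_drop(reference, target))
    return {position: mean(values) for position, values in drops.items()}


# Hypothetical values in Hz, not data from the study.
tokens = [
    ("medial", 250.0, 230.0),
    ("medial", 246.0, 228.0),
    ("final", 252.0, 215.0),
]
print(mean_drop_by_position(tokens))  # {'medial': 19.0, 'final': 37.0}
```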
Post hoc comparisons showed that the F0 drop was larger in Taipei male speakers than in those
of Taichung in medial and final positions, the difference being greater in the latter than the
former position (Medial: p < .0001; Final: p = .001). For female speakers, the difference was
also significant in medial and final positions (p < .0001), but the trend was the opposite of
male speakers. Taichung females tended to have a larger F0 drop than Taipei females, with the
difference being greater in the sentence-medial than the final position. In the initial
position, however, neither gender showed dialectal differences.
VIII. Discussion
Results in this study showed that variation did exist between the two dialects of Taiwan
Mandarin for T1. In general, the Central dialect tended to have a lower tonal register than the
Northern dialect. This is consistent with our predictions. Since Taichung speakers were more
fluent in Min, their Mandarin tonal register would more likely be influenced and become
lower in pitch. However, such an effect was more prominent in male than in female speakers.
In female speakers, only sentence-medial and -final positions showed such an effect, while in
male speakers, all three positions showed the same trend. The gender differences could not
have been due to differential levels of Min proficiency, as speakers from the Central dialect
had approximately the same level of Min fluency, regardless of gender. In other words, what
was underlying the difference between male and female speakers was more likely to be a
pure gender issue. Female speakers were more sensitive to the differences between standard
and nonstandard forms and were more likely to conform themselves to the social norm. As a
consequence, they were less inclined to show obvious regional traits.
With regard to the declination pattern, dialectal differences were also found. For both
genders, only sentence-medial and -final positions showed reliable differences. However, the
direction of the effect was exactly opposite for the two genders. Male Taichung speakers had
a milder declination topline than their Taipei counterparts, while female Taichung speakers
had a steeper declination topline instead. The pattern of male speakers was in line with our
predictions. Since tonal register was lower in the Taichung dialect, and since higher tonal
ranges were more elastic than lower ones (Ladd, 1988), the declination range would naturally
be more restricted and thus the slope of the topline would be shallower. However, in female
speakers, the Central dialect showed steeper slopes than the Northern dialect. If this reverse
pattern could also be attributed to differential gender sensitivity to the linguistic norm, then
female Taichung speakers were actually counteracting regional characteristics by
over-correction. Interestingly, this was done not by raising the overall pitch range of the
sentence, using the same mechanism employed by their Taipei counterparts, but was instead
achieved by only raising the initial starting point, that of zhe4, demonstrating a partial raising
of the tonal range.
IX. Conclusion
This study showed that dialectal differences existed in the tonal range and declination
pattern of Taiwan Mandarin, with the nonstandard dialect demonstrating a phonetically lower
high tone, and the declination slope shallower than the standard variety. It was assumed that
such a difference could be attributed to differential influences from the substrate language
Min. Fulfilling sociolinguistic predictions, female speakers showed a lesser degree of such
dialectal differences and were more likely to counteract regional characteristics by
over-corrections. In order to further confirm this trend, we plan to extend the scope of the
study to include the Southern dialect. If such dialectal differences were indeed due to
influences from Min, then one should be able to see the same pattern in the Southern dialect,
perhaps even more so, as Min is even more commonly used in the area.
X. References
CHAO, Y. R. (1956). Tone, intonation, singsong, chanting, recitative, tonal composition, and
atonal composition in Chinese. In M. Halle (Ed.), For Roman Jakobson (pp. 52-59).
The Hague: Mouton.
CHAO, Y. R. (1968). A Grammar of Spoken Chinese. Berkeley: University of California
Press.
CHEN, S. H. (2005). The effects of tones on speaking frequency and intensity ranges in
Mandarin and Min dialects. Journal of the Acoustical Society of America, 117,
3225-3230.
CHENG, R. L. (1985). A comparison of Taiwanese, Taiwan Mandarin, and Peking Mandarin.
Language, 61, 352-377.
FON, J., & CHIANG, W.-Y. (1999). What does Chao have to say about tones? -A case study
of Taiwan Mandarin. Journal of Chinese Linguistics, 27, 15-37.
FON, J., CHIANG, W.-Y., & CHEUNG, H. (2004). Production and perception of two
dipping tones (T2 and T3) in Taiwan Mandarin. Journal of Chinese Linguistics, 32,
249-280.
FON, J., & HSU, H. (2007). Positional and phonotactic effects on the realization of dipping
tones in Taiwan Mandarin. In C. Gussenhoven and T. Riad (Eds.), Phonology and
phonetics, Tones and tunes: Phonetic and behavioural studies in word and sentence
prosody (pp. 239-269). Berlin: Mouton de Gruyter.
HOWIE, J. M. (1976). Acoustical Studies of Mandarin Vowels and Tones. Cambridge:
Cambridge University Press.
HUANG, S. (1993). Yuyan, shehui yu zuqun yishi--Taiwan yuyan shehuixue de yanjiu
[Language, society, and ethnic identity--Studies in language sociology in Taiwan].
Taipei: Crane.
LADD, D. R. (1988). Declination 'reset' and the hierarchical organization of utterances. The
Journal of the Acoustical Society of America, 84, 530-544.
SHIH, C.-L. (1988). Tone and intonation in Mandarin. Working Papers of the Cornell
Phonetics Laboratory, 3, 83-109.
TRUDGILL, P. (2000). Sociolinguistics: an introduction to language and society. New York:
Penguin Books.
TSENG, C.-Y. (1990). An Acoustic Phonetic Study on Tones in Mandarin Chinese. Taipei:
Institute of History & Philology, Academia Sinica.
WANG, W. S.-Y. (1967). Phonological features of tone. International Journal of American
Linguistics, 33, 93-105.
XU, Y. (1998). Consistency of tone-syllable alignment across different syllable structures and
speaking rates. Phonetica: International Journal of Speech Science, 55, 179-203.
XU, Y. (2005). Speech melody as articulatorily implemented communicative functions.
Speech Communication, 46, 220-251.
XI. Publications related to this project
Fon, J.*, Hsu, H.-j., Huang, Y.-H., & Chen, S. (2007). The effect of onset and position in the
realization of Tone 1 in two dialects of Taiwan Mandarin. Proceedings of the 16th
International Congress of Phonetic Sciences, 1297–1300. [See appendix]
[…] pattern of Taiwan Mandarin. Proceedings of the 4th International Conference on
Speech Prosody, Campinas, Brazil.
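XII. Self-evaluation of project results
1. Degree of correspondence between the research and the original proposal
The original proposal set out to investigate the lowering of the Tone 1 register in the
Mandarin (Guoyu) spoken in Taiwan. The research carried out largely corresponds to the
original proposal.
2. Achievement of expected goals
The goals set in the original proposal included reading the relevant literature, purchasing
equipment, training research assistants to run the experiments, selecting the experimental
stimuli, running the experiments, writing the Praat scripts, extracting the relevant data,
preparing figures and tables, conducting statistical analyses, and writing up and publishing
the results. The project largely achieved these goals. However, owing to funding constraints,
the six experiments originally planned were reduced to two (Experiments 1 and 4), a compromise
made under practical considerations. Fortunately, the results obtained still largely reflect
the goals that the original proposal set out to achieve.
3. Academic or applied value of the research results
Academically, the results of this project give us a better understanding of variation in the
Tone 1 register of Taiwan Mandarin, and of the causes and distribution of register lowering.
We also hope that this work will encourage more scholars to study phenomena specific to Guoyu,
so that we can gain a more precise grasp and understanding of this evolving language of Taiwan.
In terms of applications, as mentioned earlier, the results of this project also contribute to
speech technology, which is currently under intensive development. For speech technology to
produce more natural utterances and to recognize utterances more accurately, building a corpus
of read speech in an informal register is indispensable. Many commercial speech technology
products for Northern Mandarin are modeled on Mainland Putonghua: a speech system for Northern
Mandarin is built first, and certain parameters are then adjusted to convert it into a model
better suited to Guoyu. Since one of the aims of this project is to establish a distributional
model of Tone 1 register lowering in Guoyu, the results can be used to improve current Guoyu
speech technology, making it more natural and raising its recognition rate. Because of the 2008
Olympics, Mainland China made speech signal processing one of its national development
priorities several years ago. If we do not catch up quickly, it will be very hard to compete
with it five years from now.
4. Suitability for publication in academic journals or for patent application
The project is suitable for publication in phonetics-related journals, and part of the results
has already been submitted to well-known journals. I also plan to organize and publish the
remaining results within the next two to three years and to continue related research, so that
a more complete series of findings can be presented.
5. Main findings and other points of value
There are two main findings. First, in terms of tone, Taiwan Mandarin already differs from
Mainland Putonghua and forms a system of its own, with its own characteristic tonal shapes.
The register of Tone 1 lowers with the degree of influence from Taiwanese (Min), and differs
by region and gender. Second, after decades of change, Taiwan Mandarin has not only become
standardized but has also developed regional dialectal features. The difference in register
between the Northern and Central dialects will have a great impact on the distinguishability
of the four tones and deserves further investigation. Although the Northern dialect, being the
standard variety, changes more slowly, large-scale migration from central and southern Taiwan
to the north may bring more drastic changes to the standard variety as well, a phenomenon that
merits continued observation.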
THE EFFECT OF ONSET AND POSITION IN THE REALIZATION OF
TONE 1 IN TWO DIALECTS OF TAIWAN MANDARIN
Janice Fon¹, Hui-ju Hsu², Yi-Hsuan Huang¹, & Sally Chen¹
¹ Graduate Institute of Linguistics, National Taiwan University
² Dept. Appl. Linguistics & Language Studies, Chung Yuan Christian University
{jfon, r94142011, d93142002}@ntu.edu.tw,1 [email protected]
ABSTRACT
This study investigated how onset and sentence position affect the realization of Tone 1 in two dialects of Taiwan Mandarin. Results showed that the central dialect was higher in register when syllables were produced in isolation, but lower when they were placed in a sentential context. When there was a tonal mismatch, coarticulatory effects were more robust in the northern dialect, implying that speakers of the central dialect (nonstandard) might be more self-conscious about the standard-vernacular distinction than those of the northern dialect (standard), and that overcorrection tended to occur. The effect of onset type was also significant but fairly localized: obstruent-initial syllables had higher initial pitch than sonorant ones. The declination effect was also significant, with a higher rate in the central variety. In addition, sentential stress tended to raise the sentence-final H targets in both varieties. However, the PENTA model was not fully supported.
Keywords: high tone, Taipei Mandarin, Taichung
Mandarin, dialectal variation, tonal realization
1. INTRODUCTION
Taiwan Mandarin (TM) is the official language of Taiwan and is genetically related to Mainland Mandarin (MM), the official language of Mainland China. However, due to almost 60 years of political separation between the two places, the two Mandarins have developed independently so that dialectal variations are obvious to speakers of either variety [6].
Political division is not the only cause for the divergence of the two dialects, however. Ethnic distributions are also different. About 73-80% of the population in Taiwan is Southern Min and speaks a variety of Mainland Southern Min, which is therefore a powerful substrate language for TM [5, 6].
Both Mandarin varieties have four tones, traditionally termed Tone 1 (T1), Tone 2 (T2), Tone 3 (T3), and Tone 4 (T4), which are realized as high level, mid dipping, mid-low dipping, and high falling, respectively, with T2 having an allophonic variant of mid rising in MM, and T3 a mid-low fall in both varieties [7, 16]. Although the phonological categories of the four tones are the same between the two dialects, the phonetic realizations are somewhat different. Specifically, tonal registers of TM T2 and T3 are much lower and narrower than those of the MM variety [8]. This discrepancy is presumably due to the influence from Min, which seems to prefer a lower frequency range [4, 11]. This study thus planned to see if such a lowering effect is also affecting T1, which is a high tone.
2. AIMS OF THE STUDY
There are four specific aims in this study. First of all, we would like to explore possible dialectal differences in T1 realization. If the degree of Min influence is negatively correlated with tonal register [8, 11], one would expect the tonal targets of TM varieties that are more influenced by Min (i.e., the nonstandard varieties) to be lower than those that are not as influenced (i.e., the standard variety).
Secondly, we would like to see if sentential T1 demonstrates a similar interaction with stress as the other tones. Fon & Hsu [8] showed that when T2 and T3 are placed in sentence-final positions, H targets are realized higher and L targets lower than what would be expected from pure declination. We suspected that the exaggeration in the realization of stress might be due to a sentence-final stress rule [2, 13]. Therefore, we would like to see if such a trend could also be observed in T1. If so, then sentence-final T1s should be realized higher than what would be predicted by the neutral topline.
The third aim is to investigate possible effects of syllable structure on T1 realization. According to Hombert, Ohala, and colleagues [9, 10, 12], voiceless obstruents impose a slight pitch-raising effect on the F0 values. However, this was not found in the realization of T2 and T3 [8]. One possible reason might be the constraints imposed by contour tones. Therefore, we would like to see if level T1 is also impervious to such effects.
Finally, according to the PENTA model [17], the default tonal register in utterance-initial positions should be mid unless otherwise specified (p. 240). Our previous study [8] found no affirmative evidence for this claim for TM T2. Thus, we would like to see if such a pattern could be observed in T1. The model would predict T1 to be always realized as a mid-to-high rise in isolation and in sentence-initial positions, and as a low-to-high rise in other sentence-internal mismatch positions, as both would be considered as tonal mismatch cases in the PENTA model.
3. METHODS
3.1. Participants
Six subjects between ages 19 and 24 participated in the study. Half were from Taipei (the northern dialect), and half were from Taichung (the central dialect). All of them were ethnically Min, but the Taipei speakers could not speak Min fluently. As this study is still in progress, more subjects will be included when the project is complete.
3.2. Stimuli
27 T1 syllables representative of Mandarin phonotactics were chosen as stimuli, including 6 voiceless obstruent-initials (e.g., [xan] ‘charmingly naive’), 15 sonorant-initials (e.g., [la] ‘pull’), and 6 vowel-initials (e.g., [u] ‘black’). Syllables were also placed in three comparable carrier sentences, so that they occurred in sentence-initial, -medial, and -final positions. Carrier sentences were designed so that syllables immediately before the target in medial and final positions ended mid-low and comparable tonal target clashes would occur in all three positions. In total, 27 (stimuli) × 3 (positions) = 81 sentences were recorded.
3.3. Equipment
Recordings were done using a SONY PCM-M1 Digital Audio Recorder with Maxell R-64 DA 60 min DAT tapes and a SHURE SM10A head-mounted microphone.
3.4. Procedure
Speakers were seated in a quiet room and asked to read aloud the semi-randomized stimuli using natural intonation at a normal rate. The whole process took about 15 minutes. The original recordings had a sampling rate of 48 kHz and were subsequently downsampled to 16-bit, 22050 Hz using Cool Edit Pro 2.00.
3.5. Analyses
The recordings were hand-labeled using Praat 4.4 [1]. A Praat script was written for automatic pitch extraction on the voiced portion of each syllable, which is considered the measurable domain for tones [2, 3, 14, 15]. For obstruent-initial syllables, the starting point of a tone was determined by the onset of the voice bar after the obstruent, which was voiceless in this study. For the rest, the starting point was the onset of the syllable, as the whole syllable was voiced. The ending point was always the offset of the voice bar. Occasional syllable-initial or -final glottalized portions caused by vocal fry were excluded from pitch extraction. Extracted pitch tracks were hand-checked and hand-corrected for doubling and halving through pitch period calculation, and were then interpolated and smoothed using Praat functions. A second Praat script was written to extract pitch reference points at ten equally spaced time points.
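The time-normalization step can be sketched as follows. This is not the Praat script used in the study; it is a minimal Python illustration that assumes linear interpolation and assumes the ten points include both the onset and the offset of the voiced portion, neither of which is stated in the text. The function name `sample_equidistant` is hypothetical.

```python
def sample_equidistant(times, f0, n_points=10):
    """Linearly interpolate an F0 track at n_points equally spaced times.

    times must be strictly increasing; the first and last samples are taken
    as the onset and offset of the voiced portion (an assumption).
    """
    if len(times) != len(f0) or len(times) < 2:
        raise ValueError("need two or more matching time/F0 samples")
    start, end = times[0], times[-1]
    step = (end - start) / (n_points - 1)
    sampled = []
    segment = 0  # index of the left edge of the current interpolation segment
    for k in range(n_points):
        t = min(start + k * step, end)  # clamp to guard against float rounding
        # Advance to the segment that contains t.
        while segment < len(times) - 2 and times[segment + 1] < t:
            segment += 1
        t0, t1 = times[segment], times[segment + 1]
        weight = (t - t0) / (t1 - t0)
        sampled.append(f0[segment] + weight * (f0[segment + 1] - f0[segment]))
    return sampled


# Made-up track for illustration only: a slight rise over 300 ms.
track_times = [0.0, 0.1, 0.2, 0.3]
track_f0 = [200.0, 210.0, 220.0, 230.0]
points = sample_equidistant(track_times, track_f0)
print(len(points), points[0], points[-1])
```

Sampling at proportional rather than absolute times is what makes syllables of different durations comparable point by point in the analyses of the Extraction factor.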
4. RESULTS
4.1. T1 in isolation
The average F0 for Taipei speakers was 215.05 Hz while that for Taichung speakers was 225.10 Hz. A Dialect (2) × Onset (3) × Extraction (10) three-way mixed ANOVA was performed to test the effect of dialect and onset. Results showed that all of the main effects were significant [Dialect: F(1, 156) = 22.89, p < .0001, η^2 = .13; Onset: F(2, 156) = 3.81, p < .05, η^2 = .05; Extraction: F(2.76, 430.44) = 143.49, p < .0001, η^2 = .48]. Two of the two-way interaction effects involving Extraction were also significant [Dialect × Extraction: F(2.76, 430.44) = 5.08, p < .01, η^2 = .03; Onset × Extraction: F(5.52, 430.44) = 8.21, p < .0001, η^2 = .10]. The three-way interaction was not significant (Figures 1 & 2).
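The effect sizes in this paragraph can be cross-checked against the F ratios and degrees of freedom. The sketch below assumes that the reported η^2 values are partial eta squared, which the text does not state explicitly; under that assumption, partial η^2 = (F × df_effect) / (F × df_effect + df_error). The function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# (effect, F, df_effect, df_error, reported eta^2), copied from the paragraph above.
ISOLATION_EFFECTS = [
    ("Dialect", 22.89, 1, 156, 0.13),
    ("Onset", 3.81, 2, 156, 0.05),
    ("Extraction", 143.49, 2.76, 430.44, 0.48),
    ("Dialect x Extraction", 5.08, 2.76, 430.44, 0.03),
    ("Onset x Extraction", 8.21, 5.52, 430.44, 0.10),
]

for name, f_value, df_effect, df_error, reported in ISOLATION_EFFECTS:
    recovered = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{name}: recovered {recovered:.3f}, reported {reported:.2f}")
```

If the recovered values agree with the reported ones to two decimals, that supports reading the effect sizes as partial rather than classical eta squared.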
Figure 1: Time-normalized F0 trajectories of isolated T1 in two dialects averaged across onset types.
Post hoc independent t-tests on the Dialect × Extraction interaction showed that the two dialects differed significantly at every extraction point except the final one (p < .01 for Point 1, and p < .001 for the others). In addition, post hoc pairwise comparisons showed that for northern TM, Points 1, 9, and 10 were the highest in pitch, Points 2 and 8 were the second highest, and the remaining points were the lowest (p < .0001). For central TM, Points 1, 9, and 10 were still the highest in pitch, Points 2, 3, and 8 were the next highest, and the remaining points were the lowest (p < .0001).
Figure 2: Time-normalized F0 trajectories of T1 in isolation by onset type.
Post hoc one-way ANOVAs on the Onset × Extraction interaction showed that onset types differed significantly only at Points 1 and 7 [Point 1: F(2, 78) = 5.99, p < .01, η^2 = .13; Point 7: F(2, 78) = 3.47, p < .05, η^2 = .08]. For Point 1, post hoc Tukey-b tests showed that obstruent-initial syllables were higher than vowel- and sonorant-initial ones (p < .05). For Point 7, post hoc pairwise comparisons showed that vowel-initial syllables were the highest while sonorant-initial ones were the lowest (p < .05).
4.2. T1 in context
The average F0 values for Taipei speakers were 245.65 Hz, 230.20 Hz, and 200.71 Hz in sentence-initial, -medial, and -final positions, respectively, while those for Taichung speakers were 244.26 Hz, 224.24 Hz, and 198.93 Hz. A Dialect (2) × Position (3) × Extraction (10) three-way mixed ANOVA was performed to test the effects of dialect and sentential position. Since Onset did not have a robust effect on isolated T1, it was excluded from the following analyses. Results showed that two of the main effects were significant [Position: F(2, 320) = 730.78, p < .0001, η^2 = .82; Extraction: F(1.88, 300.22) = 74.60, p < .0001, η^2 = .32]. Two of the two-way interactions involving Extraction were also significant [Dialect × Extraction: F(1.88, 300.22) = 23.67, p < .0001, η^2 = .13; Position × Extraction: F(3.36, 537.44) = 73.66, p < .0001, η^2 = .32]. So was the three-way interaction [F(3.36, 537.44) = 5.80, p < .001, η^2 = .03] (Figure 3).
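To make the size of the positional drop easier to compare across dialects, the mean F0 values above can be expressed in semitones. The sketch below is only an illustration: the paper itself reports Hz, the conversion 12 × log2(f1 / f2) is the standard one, and the names `semitones` and `MEAN_F0_HZ` are hypothetical.

```python
import math


def semitones(from_hz, to_hz):
    """Return the pitch drop from from_hz to to_hz in semitones (positive = lower)."""
    return 12 * math.log2(from_hz / to_hz)


# Mean F0 (Hz) in sentence-initial, -medial, and -final positions, from the text.
MEAN_F0_HZ = {
    "Taipei": (245.65, 230.20, 200.71),
    "Taichung": (244.26, 224.24, 198.93),
}

for dialect, (initial, medial, final) in MEAN_F0_HZ.items():
    first_drop = semitones(initial, medial)
    second_drop = semitones(medial, final)
    print(f"{dialect}: initial-to-medial {first_drop:.2f} st, "
          f"medial-to-final {second_drop:.2f} st")
```

On these means, the Taichung drop is larger between the initial and medial positions, whereas the Taipei drop is larger between the medial and final positions.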
Figure 3: Time-normalized F0 trajectories of T1 in context. Solid lines represent Taipei speakers, and dashed lines represent Taichung speakers.
Regarding the declination effect, post hoc pairwise comparisons showed that for northern TM, all sentence-initial extraction points were higher than sentence-medial ones, which were in turn higher than sentence-final ones. However, the difference between the latter two positions was much larger than that between the former two, especially in the final portion of the tone (p < .001 between initial and medial Point 9, p < .01 between initial and medial Point 10, and p < .0001 for others). For central TM, the overall trend was the same: sentence-initial extraction points were the highest, and sentence-final ones were the lowest. However, the difference between the former two positions was much larger than that between the latter two in the final portion of the tone (p < .05 between medial and final Point 10, and p < .0001 for others).
As for dialectal differences, post hoc pairwise comparisons showed that for sentence-medial positions, Taipei T1 was significantly higher than Taichung T1 from Point 6 to the end of the tone (p < .01 for Point 6, p < .001 for Point 7, and p < .0001 for others). For sentence-final positions, Taipei T1 was lower than Taichung T1 at Point 1 (p < .05), but was significantly higher from Points 5 to 8 (p < .05 for Points 5 & 8, and p < .01 for Points 6 & 7). No difference was found in the initial position.
With regard to tonal contours, Taipei and Taichung initial T1s and Taichung medial T1s were fairly level. Taipei medial and final T1s were rising, while Taichung final T1s were dipping.
5. DISCUSSION
Results in this study showed that variations did exist between the two varieties of TM. In terms of pitch register, the direction went as predicted in sentence-medial and -final positions. Taichung T1 was indeed lower, and the rate of declination faster. However, syllables in isolation showed an opposite trend, in which Taipei T1 was lower. Since reading isolated syllables is more unnatural and thus more formal than reading sentences, we hypothesized that the Taichung speakers, speaking a non-standard dialect, might be unconsciously over-correcting themselves in a more formal register, but were unable to do so in a more relaxed one.
The effect of sentential stress in raising sentence-final H tonal targets was also supported, as shown by the larger rise towards the end of the syllable. Sentential stress thus affects not only contour tones but also level ones.
Different onset types did have an effect on the realization of tones. Obstruent-initial syllables had slightly higher pitch than sonorant-initial ones. However, the effect was fairly small and localized.
Finally, isolated and sentence-initial T1s were not realized as a rise, which contradicted Xu’s [17] claim of a default mid tonal register. The only contours that conformed to the PENTA model were the medial and final Taipei T1s. However, Taichung tones did not show this effect.
6. CONCLUSION
This study shows that dialectal differences can affect realization of Mandarin T1. Phonotactic composition, while significant, imposes only minor effects. More studies will be needed in order to understand the actual mechanism underlying tonal realization.
7. REFERENCES
[1] Boersma, P., Weenink, D. 2006. Praat: doing phonetics by computer version 4.4.18. http://www.praat.org/.
[2] Chao, Y.R. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
[3] Chao, Y.R. 1956. Tone, intonation, singsong, chanting, recitative, tonal composition, and atonal composition in Chinese. In: Halle, M. (ed), For Roman Jakobson, The Hague: Mouton, 52-59.
[4] Chen, S.H. 2005. The effects of tones on speaking frequency and intensity ranges in Mandarin and Min dialects. Journal of the Acoustical Society of America 117(5), 3225-3230.
[5] Chen, Y.-D. 1989. Taiwan de Kejiaren [Hakka on Taiwan]. Taipei: Taiyuan Press.
[6] Cheng, R.L. 1985. A comparison of Taiwanese, Taiwan Mandarin, and Peking Mandarin. Language 61(2), 352-377.
[7] Fon, J., Chiang, W.-Y. 1999. What does Chao have to say about tones? -A case study of Taiwan Mandarin. Journal of Chinese Linguistics 27(1), 15-37.
[8] Fon, J., Hsu, H. 2007. Positional and phonotactic effects on the realization of dipping tones in Taiwan Mandarin. In: Gussenhoven, C. and Riad, T. (eds), Phonology and phonetics, Tones and tunes: Phonetic and behavioural studies in word and sentence prosody. Vol. 2, 239-269.
[9] Hombert, J.-M. 1978. Consonant types, vowel quality, and tone. In: Fromkin, V. A. (ed), Tone: A Linguistic Survey, New York: Academic Press, 77-111.
[10] Hombert, J.-M., Ohala, J.J., Ewan, W.G. 1979. Phonetic explanation for the development of tones. Language 55(1), 37-58.
[11] Hsu, S.-y. 2006. The tonal variation of Mandarin Tone 2 in Taiwan: A phonetic study on Taiwanese-Taiwan Mandarin bilinguals and Taiwan Mandarin monolinguals. National Chengchi University.
[12] Ohala, J.J. 1973. The physiology of tone. In: Hyman, L. M. (ed), Consonant Types and Tone: Southern California Occasional Papers in Linguistics. Vol. 1, Los Angeles: University of Southern California, 2-14.
[13] Peng, S.-h., Chan, M.K.M., Tseng, C.-y., Huang, T., Lee, O.J., Beckman, E.M. 2005. Towards a pan-Mandarin system for prosodic transcription. In: Jun, S.-A. (ed), Prosodic Typology: The phonology of intonation and phrasing, New York: Oxford University Press, 230-270.
[14] Wang, W.S.-Y. 1967. Phonological features of tone. International Journal of American Linguistics 33(2), 93-105.
[15] Xu, Y. 1998. Consistency of tone-syllable alignment across different syllable structures and speaking rates. Phonetica: International Journal of Speech Science 55(4), 179-203.
[16] Xu, Y. 1997. Contextual tonal variations in Mandarin. Journal of Phonetics 25(1), 61-83.
[17] Xu, Y. 2005. Speech melody as articulatorily implemented communicative functions. Speech
Report on Attendance at an International Academic Conference
Project number: 95-2411-H-002-046-
Project title: Effects of Region and Speech Style on the Tone 1 Register of Taiwan Mandarin
Name, affiliation, and title of traveler: 馮怡蓁
Conference dates and location: August 6-10, 2007 (ROC year 96), Saarbrücken, Germany
Conference name: 16th International Congress of Phonetic Sciences
Papers presented:
(1) The effect of onset and position in the realization of Tone 1 in two dialects of Taiwan Mandarin
(2) The effect of acquisition order and word relatedness on codeswitching costs in balanced bilingual speakers
(3) The effect of incredulity and particle on the intonation of yes/no questions in Taiwan Mandarin
(4) The effects of phonetic distance, learning context and learner proficiency on L2 perception of English liquids
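I. Conference Participation
The conference lasted five days, from August 6 to August 10. The speech-related topics covered this year included discourse and dialogue, spoken prosody, the link between speech and nonverbal signals, articulatory phonetics, auditory phonetics, physiological and pathological phonetics, speech and multilingual signal processing, speech communication, language and dialect identification, and the application and evaluation of speech technology. The program was rich and diverse, and many senior phoneticians, such as Ann Cutler, John Local, and Chih-Lin Shih, were in attendance, offering a great deal of inspiration and encouragement to younger researchers.
II. Reflections on the Conference
The International Congress of Phonetic Sciences has long encouraged scholars to integrate speech science, psychology, computer technology, and related fields, and to study prosody-related topics with innovative and rigorous research methods. At this conference, I had many opportunities to meet and exchange ideas with other participants, and benefited greatly. The four papers I presented also received pertinent suggestions from many well-known scholars in the field, which offer valuable directions for future research.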
THE EFFECT OF ONSET AND POSITION IN THE REALIZATION OF TONE 1 IN TWO DIALECTS OF TAIWAN MANDARIN
Janice Fon¹, Hui-ju Hsu², Yi-Hsuan Huang¹, & Sally Chen¹
¹Graduate Institute of Linguistics, National Taiwan University
²Dept. Appl. Linguistics & Language Studies, Chung Yuan Christian University
{jfon, r94142011, d93142002}@ntu.edu.tw¹, [email protected]
ABSTRACT
This study investigates how onset and sentence position affect the realization of Tone 1 in two dialects of Taiwan Mandarin. Results showed that the central dialect was higher in register in isolation, but lower in a sentential context. When there was a tonal mismatch, coarticulatory effects were more robust in the northern dialect, implying that speakers of the central dialect (nonstandard) might be more self-conscious about the standard-vernacular distinction than those of the northern dialect (standard), and that overcorrection tended to occur. The effect of onset type was also significant but fairly localized: obstruent-initial syllables had higher initial pitch than sonorant-initial ones. The declination effect was also significant, with a higher rate in the central variety. In addition, sentential stress tended to raise the sentence-final H targets in both varieties. However, the PENTA model was not fully supported.
Keywords: high tone, Taipei Mandarin, Taichung Mandarin, dialectal variation, tonal realization
1. INTRODUCTION
Taiwan Mandarin (TM) is the official language of Taiwan and is genetically related to Mainland Mandarin (MM), the official language of Mainland China. However, due to almost 60 years of political separation between the two places, the two Mandarins have developed independently so that dialectal variations are obvious to speakers of either variety [6].
Political division is not the only cause of the divergence between the two dialects, however. Ethnic distributions also differ. About 73-80% of the population in Taiwan is Southern Min and speaks a variant of Mainland Southern Min, which is therefore a powerful substrate language for TM [5, 6].
Both Mandarin varieties have four tones, traditionally termed Tone 1 (T1), Tone 2 (T2), Tone 3 (T3), and Tone 4 (T4), which are realized as high level, mid dipping, mid-low dipping, and high falling, respectively, with T2 having an allophonic variant of mid rising in MM, and T3 a mid-low fall in both varieties [7, 16]. Although the phonological categories of the four tones are the same in the two dialects, their phonetic realizations are somewhat different. Specifically, the tonal registers of TM T2 and T3 are much lower and narrower than those of the MM variety [8]. This discrepancy is presumably due to the influence of Min, which seems to prefer a lower frequency range [4, 11]. This study therefore examines whether such a lowering effect also affects T1, which is a high tone.
2. AIMS OF THE STUDY
There are four specific aims in this study. First of all, we would like to explore possible dialectal differences in T1 realization. If the degree of Min influence is negatively correlated with tonal register [8, 11], one would expect the tonal targets of TM varieties that are more influenced by Min (i.e., the nonstandard varieties) to be lower than those that are not as influenced (i.e., the standard variety).
Secondly, we would like to see whether sentential T1 interacts with stress in the same way as the other tones. Fon & Hsu [8] showed that when T2 and T3 are placed in sentence-final positions, H targets are realized higher and L targets lower than what would be expected from pure declination. We suspected that this exaggeration might be due to a sentence-final stress rule [2, 13]. Therefore, we would like to see whether such a trend can also be observed in T1. If so, then sentence-final T1s should be realized higher than what would be predicted by the neutral topline.
The third aim is to investigate possible effects of syllable structure on T1 realization. According to Hombert, Ohala, and colleagues [9, 10, 12], voiceless obstruents impose a slight pitch-raising effect on the F0 values. However, this was not found in the realization of T2 and T3 [8]. One possible reason might be the constraints imposed by contour tones. Therefore, we would like to see if level T1 is also impervious to such effects.
Finally, according to the PENTA model [17], the default tonal register in utterance-initial positions should be mid unless otherwise specified (p. 240). Our previous study [8] found no affirmative evidence for this claim in TM T2. Thus, we would like to see whether such a pattern can be observed in T1. The model would predict T1 to be always realized as a mid-to-high rise in isolation and in sentence-initial positions, and as a low-to-high rise in sentence-internal mismatch positions, as both would be considered tonal mismatch cases in the PENTA model.
3. METHODS
3.1. Participants
Six subjects between the ages of 19 and 24 participated in the study. Half were from Taipei (the northern dialect), and half were from Taichung (the central dialect). All of them were ethnically Min, but the Taipei speakers could not speak Min fluently. As this study is still in progress, more subjects will be added before the project is complete.
3.2. Stimuli
27 T1 syllables representative of Mandarin phonotactics were chosen as stimuli, including 6 voiceless obstruent-initials (e.g., [xan] ‘charmingly naive’), 15 sonorant-initials (e.g., [la] ‘pull’), and 6 vowel-initials (e.g., [u] ‘black’). Syllables were also placed in three comparable carrier sentences, so that they occurred in sentence-initial, -medial, and -final positions. Carrier sentences were designed so that syllables immediately before the target in medial and final positions ended mid-low and comparable tonal target clashes would occur in all three positions. In total, 27 (stimuli) × 3 (positions) = 81 sentences were recorded.
3.3. Equipment
Recordings were done using a SONY PCM-M1 Digital Audio Recorder with Maxell R-64 DA 60 min DAT tapes and a SHURE SM10A head-mounted microphone.
3.4. Procedure
Speakers were seated in a quiet room and asked to read the semi-randomized stimuli aloud using natural intonation at a normal rate. The whole process took about 15 minutes. The original recordings, which had a sampling rate of 48 kHz, were subsequently downsampled to 16-bit, 22.05 kHz using Cool Edit Pro 2.00.
3.5. Analyses
The recordings were hand-labeled using Praat 4.4 [1]. A Praat script was written for automatic pitch extraction on the voiced portion of each syllable, which is considered the measurable domain for tones [2, 3, 14, 15]. For obstruent-initial syllables, the starting point of a tone was determined by the onset of the voice bar after the obstruent, which was voiceless in this study. For the rest, the starting point was the onset of the syllable, as the whole syllable was voiced. The ending point was always the offset of the voice bar. Occasional syllable-initial or -final glottalized portions caused by vocal fry were excluded from pitch extraction. Extracted pitch tracks were hand-checked and hand-corrected for doubling and halving through pitch period calculation, and were then interpolated and smoothed using Praat functions. A second Praat script was written to extract pitch reference points at ten equally spaced time points.
4. RESULTS
4.1. T1 in isolation
The average F0 for Taipei speakers was 215.05 Hz while that for Taichung speakers was 225.10 Hz. A Dialect (2) × Onset (3) × Extraction (10) three-way mixed ANOVA was performed to test the effect of dialect and onset. Results showed that all of the main effects were significant [Dialect: F(1, 156) = 22.89, p < .0001, η^2 = .13; Onset: F(2, 156) = 3.81, p < .05, η^2 = .05; Extraction: F(2.76, 430.44) = 143.49, p < .0001, η^2 = .48]. Two of the two-way interaction effects involving Extraction were also significant [Dialect × Extraction: F(2.76, 430.44) = 5.08, p < .01, η^2 = .03; Onset × Extraction: F(5.52, 430.44) = 8.21, p < .0001, η^2 = .10]. The three-way interaction was not significant (Figures 1 & 2).
Figure 1: Time-normalized F0 trajectories of isolated T1 in two dialects averaged across onset types.
Post hoc independent t-tests on the Dialect × Extraction interaction showed that the two dialects differed significantly at every extraction point except the final one (p < .01 for Point 1, and p < .001 for the others). In addition, post hoc pairwise comparisons showed that for northern TM, Points 1, 9, and 10 were the highest in pitch, Points 2 and 8 were the second highest, and the remaining points were the lowest (p < .0001). For central TM, Points 1, 9, and 10 were still the highest in pitch, Points 2, 3, and 8 were the next highest, and the remaining points were the lowest (p < .0001).
Figure 2: Time-normalized F0 trajectories of T1 in isolation by onset type.
Post hoc one-way ANOVAs on the Onset × Extraction interaction showed that onset types differed significantly only at Points 1 and 7 [Point 1: F(2, 78) = 5.99, p < .01, η^2 = .13; Point 7: F(2, 78) = 3.47, p < .05, η^2 = .08]. For Point 1, post hoc Tukey-b tests showed that obstruent-initial syllables were higher than vowel- and sonorant-initial ones (p < .05). For Point 7, post hoc pairwise comparisons showed that vowel-initial syllables were the highest while sonorant-initial ones were the lowest (p < .05).
4.2. T1 in context
The average F0 values for Taipei speakers were 245.65 Hz, 230.20 Hz, and 200.71 Hz in sentence-initial, -medial, and -final positions, respectively, while those for Taichung speakers were 244.26 Hz, 224.24 Hz, and 198.93 Hz. A Dialect (2) × Position (3) × Extraction (10) three-way mixed ANOVA was performed to test the effects of dialect and sentential position. Since Onset did not have a robust effect on isolated T1, it was excluded from the following analyses. Results showed that two of the main effects were significant [Position: F(2, 320) = 730.78, p < .0001, η^2 = .82; Extraction: F(1.88, 300.22) = 74.60, p < .0001, η^2 = .32]. Two of the two-way interactions involving Extraction were also significant [Dialect × Extraction: F(1.88, 300.22) = 23.67, p < .0001, η^2 = .13; Position × Extraction: F(3.36, 537.44) = 73.66, p < .0001, η^2 = .32]. So was the three-way interaction [F(3.36, 537.44) = 5.80, p < .001, η^2 = .03] (Figure 3).
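The fractional degrees of freedom in these F tests suggest that a sphericity correction was applied to effects involving Extraction. The text does not name the correction, so treating it as an epsilon-type adjustment (such as Greenhouse-Geisser) is an assumption; under it, the correction factor can be recovered as the corrected df divided by the nominal df. The function names below are hypothetical.

```python
def nominal_df(*levels):
    """Uncorrected effect df for a factor or interaction: product of (levels - 1)."""
    df = 1
    for n_levels in levels:
        df *= n_levels - 1
    return df


def epsilon(corrected_df, *levels):
    """Implied sphericity correction factor, assuming df_corrected = epsilon * df_nominal."""
    return corrected_df / nominal_df(*levels)


# Corrected effect df taken from the paragraph above.
eps_extraction = epsilon(1.88, 10)                # Extraction has 10 levels
eps_position_by_extraction = epsilon(3.36, 3, 10)  # Position (3) x Extraction (10)

print(f"Extraction: epsilon = {eps_extraction:.3f}")
print(f"Position x Extraction: epsilon = {eps_position_by_extraction:.3f}")
```

Values well below 1 would indicate that adjacent extraction points are strongly correlated, which is expected for ten samples taken along a single smooth pitch contour.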
Figure 3: Time-normalized F0 trajectories of T1 in context. Solid lines represent Taipei speakers, and dashed lines represent Taichung speakers.
Regarding the declination effect, post hoc pairwise comparisons showed that for northern TM, all sentence-initial extraction points were higher than sentence-medial ones, which were in turn higher than sentence-final ones. However, the difference between the latter two positions was much larger than that between the former two, especially in the final portion of the tone (p < .001 between initial and medial Point 9, p < .01 between initial and medial Point 10, and p < .0001 for others). For central TM, the overall trend was the same: sentence-initial extraction points were the highest, and sentence-final ones were the lowest. However, the difference between the former two positions was much larger than that between the latter two in the final portion of the tone (p < .05 between medial and final Point 10, and p < .0001 for others).
As for dialectal differences, post hoc pairwise