
Mandarin Speakers’ Perception of English Stress and Pure Tone: An Investigation on the F0, Duration and Amplitude


Academic year: 2021


Department of English, National Taiwan Normal University
Master’s Thesis

Mandarin Speakers’ Perception of English Stress and Pure Tone: An Investigation on the F0, Duration and Amplitude
(中文使用者對英語字重音和純音的感知:探討基本頻率、持續時間以及音量三方面的重要性)

Advisor: Dr. Li-Hsin Ning (甯俐馨)
Student: Chien-Min Kuo (郭健民)

July, 2018

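CHINESE ABSTRACT

This study examines native Mandarin speakers’ perception of English lexical stress and the relation between stress perception and pure tone perception. The participants were two groups of native Mandarin speakers with different levels of English proficiency and one group of native English speakers. Every participant took a stress identification test and a pure tone discrimination test. The stress identification test used a disyllabic nonsense word with a noun form and a verb form. The noun was stressed on the first syllable and the verb on the second. The stressed syllables of both forms were resynthesized so that the stressed syllable the participants heard differed slightly from trial to trial. The resynthesis adjusted fundamental frequency, duration and amplitude. On each trial the participants reported where they heard the stress, using a five-point Likert scale to indicate how certain they were. In the pure tone discrimination test, the participants heard two pure tones on each trial. The tones varied in fundamental frequency, duration and amplitude in the same way as the noun and verb forms of the nonsense word. The participants judged whether the two tones were the same, using a three-point Likert scale to indicate how certain they were.

The study found no clear difference in stress perception between the native English speakers and the native Mandarin speakers, and no clear difference between the two groups of Mandarin speakers. All three groups used fundamental frequency to identify stress in the noun and duration to identify stress in the verb. The acoustic cue used to identify stress was in each case the most salient one: fundamental frequency in the noun and duration in the verb. The study also found that the participants used fundamental frequency and duration to judge whether two pure tones were the same. Performance in pure tone perception resembled performance in stress perception in some respects. In identifying stress in the noun and in discriminating the pure tones modeled on the noun, fundamental frequency was the more important cue. In identifying stress in the verb and in discriminating the pure tones modeled on the verb, duration was the more important cue. These similarities suggest that pure tone perception and stress perception may be related. In addition, the native English speakers appeared less certain when identifying stress, a finding made possible by the use of a Likert scale. Finally, after hearing the combinations of variation in fundamental frequency, duration and amplitude during the stress identification test, the participants appeared to become more certain in their stress judgments.

Keywords: English lexical stress, pure tone, perception, fundamental frequency, duration, amplitude, second language acquisition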

ABSTRACT

This study examined Mandarin speakers’ perception of English lexical stress and the relation between stress perception and pure tone perception. One group of native English speakers and two groups of Mandarin speakers with different English proficiency levels were asked to take a stress perception test and a pure tone perception test. A disyllabic nonsense word was pronounced as a noun form with stress on the first syllable and as a verb form with stress on the second syllable. The two forms were synthesized into tokens varying in F0, duration and amplitude in the stressed syllable. In the stress perception test, the participants were required to locate the stress of the token they heard on each trial. The participants had to respond on a five-point Likert scale based on their certainty about the location of the stress. In the pure tone perception test, one pattern consisting of two pure tones was played to the participant on each trial. The pure tone patterns were synthesized to vary in F0, duration and amplitude in a way similar to the noun and verb forms of the nonsense word. The participants had to respond on a three-point Likert scale based on their certainty about the difference between the two tones.

It was found that the Mandarin speakers’ use of acoustic cues for stress did not differ from that of the English speakers. The two groups of Mandarin speakers with different English proficiency levels also did not differ in their use of acoustic cues for stress. The participants used F0 as a cue for stress in the noun form and duration as a cue in the verb form. It appeared that among the three acoustic cues (i.e., F0, duration and amplitude), the cue used for stress perception was the most salient one. F0 was the most salient cue in the noun form, while duration was the most salient one in the verb form. In the pure tone perception test, the three groups used both F0 and duration as cues for both sets of pure tones (i.e., one set synthesized based on the noun form and the other based on the verb form). Some similarities were found between pure tone perception and stress perception. In the perception of the noun and that of the pure tones synthesized based on the noun, F0 was a more important acoustic cue than duration. In the perception of the verb and that of the pure tones synthesized based on the verb, duration was a more important acoustic cue than F0. The similarities between pure tone perception and stress perception indicate that pure tone perception might be related to stress perception. In addition to the major findings above, this study also found that English speakers tended to be less certain about their stress perception in an experimental scenario, which was revealed by the Likert scale test adopted in this study. Finally, it was found that exposure to stress patterns varying in F0, duration and amplitude might facilitate stress perception.

Keywords: English lexical stress, pure tone, perception, F0, duration, amplitude, second language acquisition

ACKNOWLEDGEMENTS

Writing this thesis was not easy. However, were it not for the help and support of many people, the process could have been far more difficult. I would like to thank my advisor Dr. Li-Hsin Ning for everything she brought to my graduate study, including inspiring me to study phonetics, introducing me to the world of statistics and computer programming, offering me a lab assistant job, encouraging me to go to conferences, and reading my thesis carefully and providing advice that greatly and efficiently improved it. The things I learned from Dr. Ning are helpful not only for this thesis but also for my graduate study in general.

My thanks also go to my committee members Dr. Yung-Hsiang Shawn Chang and Dr. Chenhao Chiu. They read through my thesis carefully and walked me through the issues I had in my study. With their help, I was able to see problems that I was not aware of, and I also learned ideas that I would not have come up with by myself.

The process of finding participants for my experiment was laborious at first. However, with the assistance of Dr. Alvin Cheng-Hsien Chen, Dr. Li-Hsin Ning, Dr. Lindsey Nai-Hsien Chen and Dr. Jing-lan Joy Wu, I was able to speed up the process. I am grateful that they generously let me look for experiment participants in their classes.

Every course I took in my graduate study was crucial for developing my capacity in research. Therefore, I would like to show my appreciation for Dr. Chun-yin Doris Chen, Dr. Hui-shan Lin, Dr. Hsiao-hung Iris Wu, Dr. Jen-i Li, Dr. Li-Hsin Ning, Dr. Miao-Ling Hsieh, Dr. Lindsey Nai-Hsien Chen, Dr. Jing-lan Joy Wu and Dr. Shiao-hui Chan. Their thought-provoking courses sharpened my mind.

The last year of my graduate study would have been a lot tougher without my fellow students’ support. My thanks go to Stephanie Yeh, Mark Tu and Eileen Lin. I had a wonderful time hanging out with you guys. My thanks also go to Eric K. Ku for his input in my study.

Finally, I would like to thank my parents and sister for their unconditional love and support. I do not always talk about my feelings, but they always know when I am stressed out, and they always try to be as supportive and understanding as possible.

TABLE OF CONTENTS

CHINESE ABSTRACT
ENGLISH ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
Chapter One: Introduction
  1.1 Background and Motivation
  1.2 Research Questions
  1.3 Organization of the Study
Chapter Two: Literature Review
  2.1 English Lexical Stress
    2.1.1 Fundamental Frequency
    2.1.2 Duration
    2.1.3 Amplitude
    2.1.4 Viewing English Lexical Stress as a Complex of Parameters
    2.1.5 Summary
  2.2 Mandarin Lexical Tones
    2.2.1 Fundamental Frequency
    2.2.2 Duration
    2.2.3 Amplitude
    2.2.4 The Dominance of F0 in Mandarin Tones
  2.3 Comparison between English Lexical Stress and Mandarin Lexical Tone
  2.4 Previous Research on Mandarin Speakers’ Acquisition of English Lexical Stress
    2.4.1 Production Studies
      2.4.1.1 Lai (2008)
      2.4.1.2 Zhang et al. (2008)
      2.4.1.3 Tseng et al. (2013)
      2.4.1.4 Summary
    2.4.2 Perception Studies
      2.4.2.1 Lai (2008)
      2.4.2.2 Wang (2008)
      2.4.2.3 Zhang & Francis (2010)
      2.4.2.4 Ou (2010)
      2.4.2.5 Chrabaszcz et al. (2014)
      2.4.2.6 Summary
    2.4.3 Gaps for Further Research
  2.5 The Relation between Musical Perception and L2 Perception
    2.5.1 The Influence of Musical Perception on Linguistic Perception
    2.5.2 The Influence of Linguistic Perception on Musical Perception
    2.5.3 Summary
Chapter Three: Research Design
  3.1 Participants
  3.2 Materials
    3.2.1 The Resynthesized Disyllabic Nonsense Words
    3.2.2 The Synthesized Pure Tones
  3.3 Procedures
  3.4 Data Analysis
Chapter Four: Results
  4.1 Stress perception test
    4.1.1 The noun form perception
    4.1.2 The verb form perception
  4.2 Pure tone perception test
    4.2.1 Set 1
      4.2.1.1 F0 x DURATION
      4.2.1.2 DURATION x GROUP
      4.2.1.3 F0 x GROUP
      4.2.1.4 F0 x DURATION x GROUP
    4.2.2 Set 2
      4.2.2.1 F0 x DURATION
      4.2.2.2 DURATION x GROUP
      4.2.2.3 F0 x GROUP
      4.2.2.4 F0 x DURATION x GROUP
Chapter Five: Discussion
  5.1 Stress perception test
  5.2 Pure tone perception test
Chapter Six: Conclusion
Bibliography

LIST OF TABLES

Table 1: The acoustic measurements of the nonsense word prodawn
Table 2: The acoustic settings for resynthesizing the noun form PROdawn
Table 3: The acoustic settings for resynthesizing the verb form proDAWN
Table 4: The acoustic settings for resynthesizing the pure tones in Set 1
Table 5: The acoustic settings for resynthesizing the pure tones in Set 2
Table 6: Examining the simple main effect in F0 x DURATION for Set 1
Table 7: Examining the simple main effect in DURATION x GROUP for Set 1
Table 8: Examining the simple main effect in F0 x GROUP for Set 1
Table 9: Examining the simple main effect in F0 x DURATION x GROUP for Set 1
Table 10: Examining the simple main effect in F0 x DURATION for Set 2
Table 11: Examining the simple main effect in DURATION x GROUP for Set 2
Table 12: Examining the simple main effect in F0 x GROUP for Set 2
Table 13: Examining the simple main effect in F0 x DURATION x GROUP for Set 2
Table 14: The saliency of each parameter

LIST OF FIGURES

Figure 1: Illustration of the synthesized noun forms
Figure 2: Illustration of Set 1’s synthesized pure tones
Figure 3: A demonstration of each trial in the stress perception test
Figure 4: The mean RV (noun-likeness) for each noun stimulus in each group
Figure 5: The mean RV (verb-likeness) for each verb stimulus in each group
Figure 6: The mean RV for each stimulus in Set 1 in each group
Figure 7: The mean RV for each stimulus in Set 2 in each group

Chapter One: Introduction

1.1 Background and Motivation

This study investigates Mandarin speakers’ acquisition of English lexical stress. Mandarin speakers tend to have difficulty learning English stress [1]. Several studies have found that Mandarin speakers’ perception or production of English stress differed from that of English speakers in terms of acoustic parameters, such as pitch (Lai, 2008; Wang, 2008; Zhang et al., 2008; Tseng et al., 2013), duration (Wang, 2008) and amplitude (Lai, 2008; Wang, 2008; Zhang & Francis, 2010).

[1] Since this study focuses on English lexical stress, the terms stress and English stress throughout this thesis refer to English lexical stress.

Each of the previous studies on Mandarin speakers’ perception of English stress (Lai, 2008; Wang, 2008; Zhang & Francis, 2010) left at least one of the following important areas unexplored: using a Likert scale test to examine stress perception, investigating between-proficiency-group differences, and discussing stress perception in relation to a factor other than L1 influence, such as the ability to perceive musical patterns.

First, the previous studies adopted forced-choice tests to examine the participants’ stress perception (Lai, 2008; Wang, 2008; Zhang & Francis, 2010). The participants had to judge a syllable as either stressed or unstressed, even when they had difficulty recognizing the stress, so the experiments might not authentically reveal the participants’ perception. If a participant has no idea what stress is and is simply guessing on every trial, the statistics will show performance at chance level. A forced-choice test cannot reflect how confident the participant is in the judgment. Therefore, a Likert scale is needed for the participants to rate how much a syllable sounds like a stressed one, in order to capture the participants’ fine-grained perception.

Second, most of the previous studies included only one group of Mandarin speakers and discussed their performance as a group without considering between-group differences in English proficiency (Wang, 2008; Zhang & Francis, 2010). A discussion of the differences in L2 proficiency might reflect whether stress perception is learnable. Therefore, it is necessary to investigate between-proficiency-group differences.

Third, when discussing the difference in stress perception between Mandarin and English speakers, the previous studies (Wang, 2008; Zhang & Francis, 2010) attributed the difference to L1 influence. More specifically, the tonal system in Mandarin is different from the stress system in English, and Mandarin speakers’ perception of English stress differs from that of English speakers because they transfer Mandarin tone perception into English stress perception. However, L1 influence might sometimes fail to explain Mandarin speakers’ performance. Chrabaszcz et al. (2014) predicted that Mandarin speakers’ weighting of acoustic parameters in stress perception would be different from English speakers’, due to the L1 influence from Mandarin tone perception. However, the results showed that both groups of speakers had the same weighting. Therefore, it is necessary to consider factors other than L1 influence, such as experience in musical note perception.

In second language acquisition studies of other languages, the perception of speech sounds has been discussed in relation to musical experience. Boll-Avetisyan et al. (2016) found that French speakers with more musical experience had a more German-like perception of German stress. There seems to be a relation between musical ability and stress perception. Based on this language-music relation, it seems that listeners’ stress perception would be related to their musical perception. Therefore, this study will investigate the listeners’ stress perception as well as their musical perception, in order to see whether the two are similar. This study uses pure tone patterns to approach the listeners’ musical perception [2].

[2] Section 3.2.2 will further elaborate on the choice of pure tone patterns.

Though those previous studies have demonstrated and discussed Mandarin speakers’ problems in stress perception with regard to acoustic parameters, there is still plenty of room for further research due to the limitations in their test type, participant grouping and discussion of the factor related to stress perception.
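The argument for Likert ratings over forced-choice responses can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the ratings are invented, the scale labels (1 = surely first syllable, 3 = not sure, 5 = surely second syllable) are assumed rather than taken from the thesis, and the function names are hypothetical. It shows two imaginary listeners who look identical once their answers are collapsed into binary choices but who differ in how certain they were.

```python
from statistics import mean

MIDPOINT = 3  # assumed "not sure" point of a five-point scale


def collapse_to_forced_choice(rating):
    """Map a 1-5 rating to the binary answer a forced-choice test would record."""
    if rating == MIDPOINT:
        raise ValueError("a forced-choice test has no 'not sure' option")
    return "first" if rating < MIDPOINT else "second"


def accuracy(ratings, target):
    """Proportion of trials whose collapsed answer matches the target syllable."""
    choices = [collapse_to_forced_choice(r) for r in ratings]
    return sum(choice == target for choice in choices) / len(choices)


def mean_certainty(ratings):
    """Average distance of the ratings from the 'not sure' midpoint."""
    return mean(abs(r - MIDPOINT) for r in ratings)


# Invented ratings for six trials in which the stress is on the first syllable.
confident_listener = [1, 1, 1, 5, 1, 1]
hesitant_listener = [2, 2, 2, 4, 2, 2]

for name, ratings in [("confident", confident_listener), ("hesitant", hesitant_listener)]:
    print(name, accuracy(ratings, "first"), mean_certainty(ratings))
```

Both listeners reach the same binary accuracy, so a forced-choice analysis could not tell them apart, whereas the distance from the scale midpoint separates them.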

To compensate for the three limitations listed above, the current study will adopt Likert scale rating in the test, examine between-proficiency-group differences, and investigate the participants’ perception of pure tone patterns for the discussion of the factor related to stress perception.

1.2 Research Questions

This study will look into Mandarin speakers’ stress perception in the following three aspects:

1. How is Mandarin speakers’ perception of English stress different from that of English speakers in terms of pitch, duration and amplitude?
2. How does Mandarin speakers’ perception of English stress correspond to their English proficiency?
3. How does Mandarin speakers’ perception of English stress correlate with their ability to perceive musical patterns?

1.3 Organization of the Study

Chapter 2 will review the literature on the stress system of English, the tonal system of Mandarin, the acquisition of English stress by Mandarin speakers, and the relation between the perception of linguistic patterns and the perception of musical patterns. Chapter 3 will present the methodologies of the stress perception test and the pure tone perception test. Chapter 4 will present the results of the perception tests based on the three issues in question: the difference between Mandarin and English speakers, the relation between stress perception and L2 experience, and the relation between stress perception and musical perception. Chapter 5 will discuss the test results and Chapter 6 will present the conclusion.

Chapter Two: Literature Review

A review of English lexical stress based on acoustic parameters will be provided in Section 2.1, and a review of Mandarin lexical tones with regard to acoustic parameters will be presented in Section 2.2. Section 2.3 will compare English lexical stress and Mandarin lexical tones. Section 2.4 will review the previous research on Mandarin speakers’ acquisition of English lexical stress. Finally, the relation between the perception of linguistic patterns and the perception of musical patterns will be reviewed in Section 2.5.

2.1 English Lexical Stress

This section will examine the differences between stressed and unstressed syllables in terms of three acoustic parameters: fundamental frequency (Section 2.1.1), duration (Section 2.1.2), and amplitude (Section 2.1.3). The idea of viewing stress as a complex of acoustic parameters will also be presented (Section 2.1.4).

2.1.1 Fundamental Frequency

Previous research has shown that English stressed syllables differ from unstressed ones in terms of fundamental frequency (abbreviated as “F0” hereafter) in both production and perception.
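The production studies reviewed in Section 2.1 reduce each parameter to a per-word comparison between the two syllables. As a minimal sketch, assuming invented measurement values, an assumed decibel unit for amplitude, and hypothetical names (`Syllable`, `cue_measures`, `cues_favoring_first_syllable`) that do not come from the thesis, the Python code below computes an F0 difference, a duration ratio and a peak amplitude difference. Each measure compares the second syllable with the first, which is the direction implied by the Beckman (1986) results reported in this section, so trochaic and iambic words fall on opposite sides of zero (or of 1 for the ratio).

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    peak_f0_hz: float         # peak fundamental frequency
    duration_ms: float        # syllable duration
    peak_amplitude_db: float  # peak amplitude (dB is an assumed unit)


def cue_measures(first: Syllable, second: Syllable) -> dict:
    """Compare the second syllable with the first on each parameter."""
    return {
        "f0_difference": second.peak_f0_hz - first.peak_f0_hz,
        "duration_ratio": second.duration_ms / first.duration_ms,
        "amplitude_difference": second.peak_amplitude_db - first.peak_amplitude_db,
    }


def cues_favoring_first_syllable(measures: dict) -> list:
    """List the parameters on which the first syllable is the more prominent one."""
    favoring = []
    if measures["f0_difference"] < 0:
        favoring.append("F0")
    if measures["duration_ratio"] < 1:
        favoring.append("duration")
    if measures["amplitude_difference"] < 0:
        favoring.append("amplitude")
    return favoring


# Invented values for a trochaic and an iambic word (not data from any study).
trochee = cue_measures(Syllable(220.0, 250.0, 70.0), Syllable(180.0, 200.0, 64.0))
iamb = cue_measures(Syllable(180.0, 150.0, 62.0), Syllable(210.0, 300.0, 68.0))

print(cues_favoring_first_syllable(trochee))  # all three cues point to syllable one
print(cues_favoring_first_syllable(iamb))     # no cue points to syllable one
```

A word whose measures disagree, for example a longer but quieter first syllable, would return only some of the three labels, which is the kind of case taken up in Section 2.1.4.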

(20) Stressed syllables in English are produced with higher F0s than the unstressed counterparts. Lieberman (1960) investigated English speakers’ production of words that contrasted in the position of stress. Comparing the stressed and unstressed syllable within a word, he found that among 90% of the words, the stressed syllable had a higher peak F0 than the unstressed syllable. Comparing the stressed and unstressed syllable across words with the same spelling but contrasting stress positions (e.g., comparing the first syllable of PERfect3 with the first syllable of perFECT ), Lieberman found that 72% of the compared cases had the higher peak F0 in the stressed syllable than in the unstressed. On the other hand, Beckman (1986) examined English speakers’ production of words with contrasting stress positions and calculated the difference in peak F0 between the two syllables in a word. The F0 differences in trochaic words were negative values, which resulted from the greater F0 values for the first syllable than for the second. However, the F0 differences in iambic words were positive values because the first syllable had smaller F0 values than the second. The relatively higher F0 in a stressed syllable enables listeners to identify the position of stress in a word. Fry (1958) conducted a test on English speakers’ stress perception in synthetic disyllabic English words with various level F0s in the target. 3. The stressed syllables are represented by uppercase letters in this thesis. 7.

(21) syllables. A step-up F0 change by at least 5 Hz in a test word (e.g., 97 Hz in the first syllable and 102 Hz in the second syllable) was able to lead to an iambic perception. On the other hand, a step-down F0 change by at least 5 Hz could induce a trochaic perception. Morton & Jassem (1965) examined English speakers’ stress perception of disyllabic nonsense words. It was found that the words with a higher level or downward-sloping F0 in the first syllable gave rise to trochaic perception, whereas the words that had an upward-sloping or higher level F0 in the second syllable elicited iambic perception. To sum up, a stressed syllable is higher in F0 than its unstressed counterpart in both production and perception.. 2.1.2 Duration The durations of stressed syllables are produced to be longer than unstressed syllables. Lieberman (1960) measured the durations of each syllable produced by the English speakers in his study. When the stressed and unstressed syllables from a word were compared, 66% of the words had a longer duration in the stressed syllable than the unstressed one. When the stressed syllable in one word was compared with the unstressed syllable in the other word that had the same spelling but a different stress position, 70% of the compared cases had a longer duration in the stressed syllable. 8.

(22) than the unstressed. Beckman (1986) investigated the syllable durations of disyllabic words produced by English speakers and calculated the ratio of the second syllable’s duration to the first syllable’s duration. The ratios in the iambic words were found to be greater than the trochaic words, which was due to the longer durations of stressed syllables. The comparatively longer duration of a stressed syllable serves as a cue for recognizing where the stress is placed. Fry (1958) looked into the relation between duration ratio and the ratio of noun judgment. The duration ratio was calculated by the first vowel’s duration divided by the second vowel’s duration, and the ratio of noun judgment referred to the percentage of listeners recognizing the stress in the first syllable of a disyllabic word. It was found that the higher the duration ratio was (i.e., the longer the first vowel was), the larger the percentage of noun judgment became. Morton and Jassem (1965) examined how the variation of syllable duration correlated with the recognition of stress position. When the first syllable in a disyllabic word became longer, the listeners were more likely to perceive the first syllable as the stressed one. When the second syllable got longer, the listeners tended to spot the stress in the second syllable. In conclusion, stressed syllables had longer durations than the unstressed counterparts, which is not only realized in production but also identifiable in. 9.

(23) perception.. 2.1.3 Amplitude Stressed syllables are produced with higher amplitudes than the unstressed counterparts. Lieberman (1960) tracked the peak amplitudes of the syllables recorded from his English-speaking participants. When inspecting the syllables from the same word, among 87% of the words, the stressed syllables were found to have higher amplitudes than the unstressed ones. When a stressed syllable was compared with its unstressed version in another word, it was found that 90% of the compared cases had a higher amplitude in the stressed syllable. Beckman (1986) noted down the peak amplitudes of every English syllable in her study. The difference of peak amplitude between the two syllables in the disyllabic words was calculated by subtracting the first syllable’s peak amplitude from that of the second syllable. It turned out that when the stress was on the first syllable, the peak amplitude difference was negative, which suggested that the first syllable had a higher amplitude. When the second syllable was stressed, the peak amplitude difference was positive, revealing that the second syllable had a higher amplitude. The relatively higher amplitude in a stressed syllable is a cue for listeners to tell a stressed syllable from an unstressed syllable in perception. Fry (1958) found in his. 10.

(24) perception test of disyllabic words that with the increase of the first-to-second-vowel amplitude ratio, the percentage of noun judgment rose. In other words, the higher the first vowel’s amplitude was, the more likely it was to perceive the first syllable as stressed. Morton and Jassem (1965) had a similar observation in their disyllabic words as well. The listeners were prone to mark the second syllable as stressed when the amplitude of the first syllable was lowered. When the amplitude of the second syllable went down, the majority of the listeners preferred to mark the stress in the first syllable. Generally speaking, the amplitude of a stressed syllable is higher than an unstressed syllable. This generalization was seen in the measurements of syllable production, and it was also observed from the recognition of stress in perception.. 2.1.4 Viewing English Lexical Stress as a Complex of Parameters Sections 2.1.1 to 2.1.3 have demonstrated how F0, duration and amplitude each contribute to the formation of stress individually. However, the three acoustic parameters have also been found to interact with each other, so this section is going to present how the parameters influence each other or how they collectively realize English lexical stress. As Fry (1958) put in his introduction to stress, the acoustic parameters of stress. 11.

(25) depend on each other and build up a complex, which was evidenced in the results of his experiments and also the ones in other studies. Earlier in the sections on duration and amplitude, it has mentioned that Fry (1958) found that the longer the first syllable was, the higher the percentage of noun judgment was; in addition, the percentage of noun judgment also increased with the rising of the first syllable’s amplitude. Besides these two findings, Fry (1958) also reported that when the first syllable became longer in duration and higher in amplitude at the same time, the increase of noun judgment was amplified. However, when duration and amplitude went in opposite directions (e.g., when a syllable’s duration was lengthened but its amplitude was lowered), the percentage of noun perception dropped. These observations indicated that duration and amplitude were able to strengthen or weaken each other. Lieberman (1960) examined the relation between F0 and peak amplitude in production. He discovered that when the F0 of the stressed syllable was lower than the unstressed counterpart, which is not a typical F0 realization for stress, the amplitude of the stressed syllable was definitely higher than the unstressed one. Following the same logic, when the amplitude of the stressed syllable was lower than the unstressed, which is not how stress is typically realized in amplitude, the stressed syllable absolutely had a higher F0 than the unstressed. In other words, there is a trade-off between F0 and peak amplitude in realizing English lexical stress. In. 12.

(26) addition to peak amplitude, Beckman (1986) also calculated the total amplitude of each syllable measured in her study. The total amplitude could be seen as the contribution from two parameters, amplitude and duration, because it was the amplitudes accumulated throughout the entire duration of a syllable. The difference of total amplitude was obtained by calculating the difference between the second syllable’s total amplitude and the first syllable’s total amplitude. It was found that iambic words had positive total amplitude differences, whereas trochaic words had negative values. The pattern of amplitude and duration working together was similar to the pattern of them working alone.. 2.1.5 Summary At this point, it has been manifested that F0, duration and amplitude all are the parameters contributing to the formation of English stress. A stressed syllable is higher in F0, longer in duration and higher in amplitude than an unstressed syllable, and these parameters are related to each other, so stress should be seen as a complex of parameters.. 2.2 Mandarin Lexical Tones Lexical tone is a feature that each Mandarin syllable possesses. There are four. 13.

lexical tones⁴ in Mandarin. Chao (1948) described the 1st Tone (abbreviated as T1 hereafter) as a high-level tone, the 2nd Tone (T2) as a high-rising tone, the 3rd Tone (T3) as a low-dipping tone, and the 4th Tone (T4) as a high-falling tone. In order to make the illustrations of the four tones more concrete and accurate, this section will present how the four tones are realized with respect to fundamental frequency (Section 2.2.1), duration (Section 2.2.2), and amplitude (Section 2.2.3) in acoustic studies.

2.2.1 Fundamental Frequency

The four basic Mandarin tones have been observed to differ from each other in F0 in production. Moore & Jongman (1997) provided a plot of a female speaker’s F0 when she pronounced the Mandarin syllable ma⁵ in each of the four tones as a demonstration of F0 changes. The track of F0 stayed high and flat around 245 Hz throughout the syllable for T1. For T2, the F0 started at a medium height of 220 Hz, dropped a little to 210 Hz, and then went all the way up to a high point at 260 Hz. T3’s F0 began at a low point of 200 Hz and gradually descended to 180 Hz, which was followed by a rise back to 200 Hz. The starting point of T4 was

⁴ The discussion of the neutral tone is not included here, so there are four (but not five) tones in total. The F0 pattern of the neutral tone varies according to the preceding tone (Jongman et al., 2006). Since the neutral tone does not have a constant F0 feature as the four tones do, it is excluded from the discussion here.
⁵ The Mandarin syllables are spelled in Pinyin in italic letters throughout this thesis. The tone of each syllable is indicated by the number on the right: the numbers 1 to 4 correspond to T1, T2, T3 and T4, respectively.

the highest point at 270 Hz and it dropped drastically to the lowest point at 180 Hz. The F0 tracks of the tones are consistent with Chao’s (1948) descriptions, with the exception of T2, which in fact has a short period of minor falling F0 before the major rise.

The track of F0 can influence the perception of Mandarin tones. Yang (2010) carried out an experiment on the perception of the resynthesized syllable tao with varying starting points and ending points of F0, followed by the syllable qian2 (錢) “money” or shui4 (稅) “tax”. The syllable tao, according to Yang, can mean “pay” if pronounced in T1 (掏), “avoid” in T2 (逃), “ask for” in T3 (討), and “drag” in T4 (套). It was found that a syllable having a starting F0 value close to the ending value would be perceived as T1. If a syllable had an obviously lower starting point than the ending point, it would be perceived as T2 or T3. When the starting point was clearly higher than the ending point, the syllable would be perceived as T4. This result did not specify the factor(s) that could differentiate T2 from T3.

Shen and Lin (1991) investigated the differences between T2 and T3 in perception. They synthesized falling-rising tones that varied in two respects: the magnitude of the fall and the timing of the turning point (i.e., the ending point of the falling part). The magnitude was defined by the F0 difference between the starting point and the turning point, and the timing of the turning point was defined by the percentage of the length of the falling part over

the length of the whole syllable; for example, a turning point at 25 ms in a 250-ms-long syllable would be defined as “the 10% turning point.” It was found that when a tone had a 30 Hz falling magnitude and a turning point that occurred before the 40% timing, it would mostly be identified as T2. The tones with a 30 Hz falling magnitude and a turning point later than the 40% timing would mostly be identified as T3. In the tones with a 15 Hz falling magnitude, turning points prior to around the 60-70% timing would lead to a T2 perception, while turning points after the 60-70% timing would be perceived as T3. Different T2/T3 thresholds based on the timing of the turning point were found for different falling magnitudes, which indicated that the turning point timing and the falling magnitude could help differentiate T2 from T3.

2.2.2 Duration

As demonstrated by Moore & Jongman’s (1997) measurements, the four tones differed in their duration. T2 and T3 were similar in length (almost 300 ms), and they were the longest among the four tones. T1 (about 250 ms) was slightly shorter than T2 and T3. T4 (about 175 ms) was the shortest. Moore & Jongman’s (1997) measurements did not show an apparent difference between T2 and T3 in duration. In Shen (1990), however, it was found that T3’s duration is actually longer than T2’s.

Liu & Samuel (2004) examined the relation between the four tones’ duration and

identification. They used whispered Mandarin tones in their perception test in order to remove the cue from F0. It was found that the participants were able to identify T1, T3 and T4 based on the tones’ durations. According to Liu & Samuel, when whispering, the effect of F0 was not realized; hence, exaggerated duration contrasts were utilized by the listeners to differentiate tones. In conclusion, duration does differentiate the four tones in perception; however, the differentiation based on duration is possible when the durations are exaggerated in production in order to compensate for the lack of F0 contrasts.

2.2.3 Amplitude

Whalen & Xu (1992) provided their measurements of the amplitude contour of a male speaker producing the Mandarin syllable yi in the four tones. The contours all began with a rise and ended with a drop, with a period of fluctuation in between. T1 had the highest peak amplitude. The peak amplitude of T4 was the second highest. T3’s peak amplitude was the lowest, and T2’s was the second lowest.

In Whalen & Xu’s (1992) perception test, they used a metronome to pace a male speaker to produce the four tones at different speeds, which gave rise to the production of the four tones in their typical and atypical durations. For each tone, the token with a duration typical of the tone was selected as the experiment material. In

addition, the tokens with durations atypical of the tone but typical of other tones were also selected. In other words, among the tokens selected, each tone was produced in four different durations: the durations typical of T1, T2, T3 and T4. The F0 cues of all the tokens were removed. Therefore, only the duration and amplitude cues were preserved. When examining the perception of each tone with its typical duration, the chances of correctly identifying T2, T3 and T4 were higher than 50%, while the chance for T1 was lower than 50%. When examining the perception of each tone with all four durations, the chances of “properly” identifying a tone based on its duration (e.g., identifying T2 as T3 when the T2 had a duration that was typical of T3) were all lower than 50%. If the listeners’ identification of tones had been totally dominated by duration, they should have perceived a T2 which had a duration typical of T3 as T3. However, the low percentage of this kind of identification showed that this was not very likely. Therefore, duration was not the dominant cue. In this experiment, the only two types of cues available were duration and amplitude. It appeared that, when the F0 cue was unavailable and the duration cue was misleading, the amplitude cue could override the duration cue and enable the listeners to identify tones correctly.

2.2.4 The Dominance of F0 in Mandarin Tones

The previous sections have shown that Mandarin tones differ from each other in

terms of F0, duration and amplitude in production. In addition, each of the three acoustic parameters contributes to the recognition of the tones. However, F0 is the dominant cue among the three for identifying Mandarin tones.

Looking at Mandarin tones from a descriptive approach, Chao (1948) illustrated the four tones as “high-level”, “high-rising”, “low-dipping” and “high-falling” based on audible pitch, which is the equivalent of F0 in acoustics. Also, as pointed out by Liu & Samuel (2004), F0 cues are used to teach children or adults when they are learning Mandarin. If F0 were not the dominant cue for Mandarin tones, it would not be adopted to describe or teach Mandarin tones.

In studies investigating Mandarin tones from the acoustic perspective, F0 has been considered the primary cue (Whalen & Xu, 1992; Liu & Samuel, 2004; Jongman et al., 2006). Though duration and intensity have been found to contribute to the recognition of Mandarin tones, they only served as secondary cues. For example, as mentioned earlier, Liu & Samuel (2004) found that when F0 cues were not available, duration could help distinguish T1, T3, and T4. Also, Whalen & Xu (1992) found that listeners were able to recognize T2, T3 and T4 based on amplitude contour when F0 cues were removed. In these two studies, duration and amplitude only worked as a “backup parameter” when the parameter F0 was absent. Besides, they were only able to differentiate three but not four tones. The “backup” quality and the weakness in

differentiating four tones completely showed that duration and amplitude cannot be the effective dominant cue for differentiating Mandarin tones. To sum up, Mandarin tones differ from each other in F0, duration and amplitude in production; however, when it comes to perception, F0 is the dominant cue.

2.3 Comparison between English Lexical Stress and Mandarin Lexical Tone

The previous sections have reviewed English lexical stress and Mandarin lexical tone with regard to their acoustic parameters, and this section will examine the differences between the two. English stress is a suprasegmental feature, and so is Mandarin tone. However, these two features are different in many ways. First, English stress is the outcome of the comparison between the stressed and the unstressed syllable in a word. As presented in the previous sections, a stressed syllable is higher in F0, longer in duration and higher in amplitude than the unstressed counterpart. English stress cannot be defined by the properties of a syllable by itself. On the other hand, Mandarin tones can be demonstrated by a syllable on its own. Therefore, in the investigation of English stress, the words to be inspected are the ones with at least two syllables, while investigating Mandarin tones requires only monosyllabic words. Second, a syllable in English is either stressed or unstressed,

meaning that English syllables fall under two major categories, whereas in Mandarin, there are four tones that a syllable can possess. Third, although both Mandarin tones and English stress have been found to correlate with F0, duration, and intensity, English stress is a complex of the three parameters, while Mandarin tones rely mainly on F0.

Given the three differences between English stress and Mandarin tones (realization by the comparison between syllables vs. realization by the syllable itself, two major categories vs. four, and a complex of three parameters vs. reliance mainly on one), it is expected that Mandarin speakers would have a hard time learning English stress. The following section will review previous studies on Mandarin speakers acquiring English stress, with the aim of revealing the problems they have in production and perception.

2.4 Previous Research on Mandarin Speakers’ Acquisition of English Lexical Stress

This section will present previous studies on the acquisition of English lexical stress by Mandarin speakers. In Section 2.4.1, production studies will be reviewed. In Section 2.4.2, perception studies will be reviewed. Section 2.4.3 will discuss the gaps for further research. Although the current study works on the perception of stress, the

studies about the production of stress will also be reviewed because it is highly likely that the production of stress is correlated with the perception of stress.

2.4.1 Production Studies

The following studies investigated both Mandarin and English speakers’ production of English stress. Acoustic measurements were used to compare their realizations of stress.

2.4.1.1 Lai (2008)

Lai (2008) had two groups of Mandarin speakers and a group of English speakers pronounce disyllabic English word pairs as well as monosyllabic-disyllabic word pairs. The two groups of Mandarin speakers were beginning and advanced learners of English. The disyllabic word pairs were noun-verb pairs that contrasted in stress location, e.g., CONtract vs. conTRACT. The monosyllabic-disyllabic word pairs consisted of a monosyllabic stressed word and a disyllabic word which contained a syllable that was the unstressed version of the monosyllabic word, e.g., KEY vs. MONkey. The participants were asked to pronounce a sentence in which the targeted word occurred and then repeat the targeted word after the sentence. The repeated targeted word at the end of each sentence was measured.
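As a rough illustration of how a stress contrast in a disyllabic word can be quantified from per-syllable measurements, the sketch below computes first-to-second-syllable ratios for F0, amplitude and duration. It is only a minimal sketch and not Lai’s (2008) actual procedure: the `Syllable` class, the function name `first_to_second_ratios` and all numeric values are hypothetical, and amplitude is assumed to be a linear (non-decibel) value so that a ratio is meaningful.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """Acoustic measurements of one syllable (all values are illustrative)."""
    mean_f0_hz: float
    max_f0_hz: float
    amplitude: float      # assumed linear amplitude, not dB
    duration_ms: float


def first_to_second_ratios(first: Syllable, second: Syllable) -> dict:
    """Return first-to-second-syllable ratios for each acoustic parameter.

    A ratio above 1 means the first syllable has the larger value
    on that parameter; a ratio below 1 means the second syllable does.
    """
    return {
        "mean_f0": first.mean_f0_hz / second.mean_f0_hz,
        "max_f0": first.max_f0_hz / second.max_f0_hz,
        "amplitude": first.amplitude / second.amplitude,
        "duration": first.duration_ms / second.duration_ms,
    }


# Hypothetical measurements for a noun with first-syllable stress.
noun_first = Syllable(mean_f0_hz=220.0, max_f0_hz=240.0, amplitude=0.7, duration_ms=300.0)
noun_second = Syllable(mean_f0_hz=180.0, max_f0_hz=200.0, amplitude=0.5, duration_ms=250.0)

ratios = first_to_second_ratios(noun_first, noun_second)
for parameter, value in ratios.items():
    print(f"{parameter}: {value:.2f}")
```

Under these assumptions, every ratio exceeds 1 for the hypothetical noun, which matches the general expectation that a stressed first syllable is higher in F0, higher in amplitude and longer than the unstressed second syllable.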

First-to-second syllable ratios based on mean F0, max F0, amplitude and duration were calculated for the disyllabic word pairs. It was found that the Mandarin speakers used all of the acoustic parameters (mean F0, max F0, amplitude and duration) to show the contrasts between stressed and unstressed syllables in both nouns and verbs. When the English speakers were producing the nouns, the four acoustic parameters were utilized. However, when they were producing the verbs, duration was the only parameter they used to show stress contrast.

The monosyllabic-disyllabic word pairs were used to examine F0 contour. Generally speaking, the F0s of the monosyllabic words’ stressed syllables were found to be higher than those of the unstressed counterparts in the disyllabic words. When pronouncing the monosyllabic words, the Mandarin speakers produced F0 contours that resembled T4 in Mandarin, while the English speakers’ F0 contours were relatively flat. When producing the stressed syllables of the disyllabic words, e.g., the first syllable in monkey, the Mandarin speakers’ F0 contours were similar to the high-level tone in Mandarin, while the English speakers’ F0 contour patterns were diverse.

Based on the Mandarin speakers’ performance in the study, Lai pointed out that Mandarin speakers’ use of F0 is more inflexible (e.g., always using F0 as a cue for stress even when English speakers would not, as in the case of verbs, and adopting Mandarin-tone-like F0 patterns when English speakers’ F0 patterns were flat

and diverse). Lai attributed their inflexibility to the influence of Mandarin, since each Mandarin tone has a fixed pattern of F0 contour.

2.4.1.2 Zhang et al. (2008)

Zhang et al. (2008) included one group of Mandarin speakers and one group of English speakers for a production test on disyllabic noun-verb word pairs (e.g., CONtract vs. conTRACT). They showed a context sentence to the participants as a clue to which word in the pair they should pronounce each time. The F0, duration and amplitude of every syllable they produced were measured.

Two obvious differences in F0 between the Mandarin and English speakers were found. First, the Mandarin speakers’ F0s in the stressed syllables were much higher than those produced by the English speakers. Second, in the production by Mandarin speakers, the locations of F0 peaks in stressed syllables were significantly later compared to unstressed syllables, which was not a pattern found in the English speakers.

The two differences were discussed by Zhang et al. as follows. First, Mandarin speakers’ relatively higher F0s came from the rather high register of T1 in Mandarin. In other words, the Mandarin speakers transferred the habits of Mandarin tone production into English stress production. Second, the later occurrence of F0

peak was also attributed to the influence of Mandarin. In Mandarin, the longer a syllable is, the later the location of the syllable’s F0 peak is (Xu, 1999). The stressed English syllables were produced with longer duration by Mandarin speakers, and they brought the habit of Mandarin into English stressed syllables; therefore, the stressed syllables were not only longer in duration but also reached their F0 peaks later.

2.4.1.3 Tseng et al. (2013)

Tseng et al. (2013) looked into a group of Mandarin speakers’ and a group of English speakers’ production of English words ranging from one syllable to four syllables. The syllables were measured in terms of F0, duration and amplitude. The contrasts between the stressed and unstressed syllables in the three types of measurements were also calculated.

Tseng et al. found that Mandarin speakers’ contrasts in F0, duration and amplitude were all smaller than English speakers’. Among the three types of contrasts, the difference between English speakers and Mandarin speakers in F0 was the most noticeable: the English speakers’ F0 contrast was double that of the Mandarin speakers. There was underdifferentiation of F0 in the Mandarin speakers’ stress production.

Tseng et al. did not explicitly attribute the F0 underdifferentiation to L1 influence. However, they considered the underdifferentiation an important feature of

“Taiwan-accented” English. The term “Taiwan-accented” might imply that the participants’ production differed from English speakers’ due to the native language they spoke in Taiwan.

2.4.1.4 Summary

In general, among the three acoustic parameters (i.e., F0, duration and amplitude), the most outstanding differences between Mandarin and English speakers were in F0. Judging from the authors’ interpretations of these differences, first language influence might be the cause.

2.4.2 Perception Studies

The studies below investigated how English stress was perceived by English and Mandarin speakers. Resynthesized versions of English words or nonsense words were used in their perception tests. Mandarin speakers’ perception and English speakers’ perception were examined with regard to three acoustic parameters: F0, duration and amplitude.

2.4.2.1 Lai (2008)

Lai (2008) carried out a perception test of stress on one group of English speakers and two groups of Mandarin speakers (beginning and advanced learners of

English). The pronunciation of the nonsense word dada was recorded from a native speaker of English. This word was resynthesized into tokens that varied in the first-to-second-syllable ratio of max F0 and duration. The participants were asked to take a forced-choice test in which they had to identify whether the stress was on the first or the second syllable.

The three groups in the study were found to behave differently in terms of F0 and duration. Both F0 and duration could be a cue for the English speakers to identify stress, but F0 was the major cue they used. Among the Mandarin speakers, the beginning learners of English perceived stress based on duration but not F0. The advanced learners’ perception was correlated with both duration and F0 cues, and F0 was the main cue. This study showed the effects of proficiency (i.e., the difference in performance between the beginning and advanced English learners), which is an issue that other studies did not deal with.

2.4.2.2 Wang (2008)

Wang (2008) looked into a group of English speakers’ and a group of Mandarin speakers’ perception of the nonsense words tetsep, ruzdit and latmab.⁶ The

⁶ These nonsense words were originally spelled in IPA by the author as: tɛt.sɛp, nɪz.dɪt and læt.mæb. They were spelled with possible corresponding letters in the English alphabet here for the consistency of presenting nonsense words in this paper.

pronunciations of the three words were recorded from a native speaker of English. The recording was manipulated to have varying F0s, durations and amplitudes. The participants were asked to take a forced-choice test in which they had to decide if the stress was on the first or second syllable.

It was found that F0, duration and amplitude were all crucial in English speakers’ recognition of stress. On the other hand, Mandarin speakers relied solely on F0 to perceive stress. In addition, Mandarin speakers were far more sensitive to F0 changes than English speakers.

The author explained the Mandarin speakers’ performance in terms of L1 influence. The Mandarin speakers perceived English stress in the way they would perceive Mandarin tones: when perceiving Mandarin tones, they would rely entirely on F0, and they transferred this reliance into English to perceive stress.

2.4.2.3 Zhang & Francis (2010)

Zhang & Francis (2010) included one English-speaker group and one Mandarin-speaker group in their perception study. The noun-verb word pair DEsert vs. deSERT was recorded from an English speaker, and the words were resynthesized into tokens that differed in F0, duration and amplitude as the stimuli in the study. The participants took a forced-choice test where they had to decide whether the

token they heard was a noun (with the stress on the first syllable) or a verb (with the stress on the second syllable).

It was found that the effect of F0 was stronger on the Mandarin speakers than on the English speakers. The authors attributed the Mandarin speakers’ heavier reliance on F0 to the influence of Mandarin tones because F0 is the crucial cue for differentiating the four tones in Mandarin.

2.4.2.4 Ou (2010)

Ou (2010) worked on Mandarin speakers’ perception of stress patterns elicited in an affirmative statement and in a yes/no question. The participants were asked to take a forced-choice test in which they had to decide whether the disyllabic nonsense word they heard was a noun or a verb. The nonsense word stimuli (FERcept vs. ferCEPT) were taken from the end of an affirmative statement, which had a falling intonation pattern. In this pattern, the stressed syllable in the target word was signaled by a higher F0. The nonsense word stimuli (FERcept vs. ferCEPT) were also taken from the end of a yes/no question, in which a trochaic word would have a high rising F0 pattern in the second syllable, while an iambic word would have a low rising F0 pattern in the second syllable. In the iambic and trochaic word forms taken from the yes/no question, the F0 of the second syllable was generally higher than that of the first;

therefore, the feature that differentiated the two was the second syllable’s starting point of the rising F0 (high rising vs. low rising).

It was found that, when hearing the trochaic word taken from the yes/no question, some Mandarin speakers were misled by the relatively higher F0 in the second syllable and recognized the word as iambic. L1 influence was proposed as a reason for the Mandarin speakers’ heavy reliance on the F0 cue in the test. Although the current study is not going to work on the effect of context (e.g., affirmative statement vs. yes/no question) on stress perception, Ou’s (2010) finding can be used as a reference for the general tendency that Mandarin speakers’ stress perception is prone to L1 influence.

2.4.2.5 Chrabaszcz et al. (2014)

Chrabaszcz et al. (2014) investigated a group of Mandarin speakers’ and a group of English speakers’ stress perception of the nonsense word maba. An English speaker pronounced the word, and it was resynthesized into tokens varying in F0, duration, and amplitude. A forced-choice test was carried out to examine whether the participants recognized the stress on the first or second syllable in each token.

Based on the participants’ responses, each language group’s weighting of the acoustic parameters was calculated. The authors were expecting a difference between

the Mandarin speakers’ and the English speakers’ weighting. However, it turned out that the two groups had the same weighting: pitch was more important than amplitude, which was more important than duration. The authors concluded that despite the difference between the two languages (i.e., tonal system vs. stress system), it is still possible for Mandarin speakers to achieve native-like perception of English stress.

2.4.2.6 Summary

To conclude from the perception studies above, the most evident difference between the English and Mandarin speakers’ stress perception is in F0, and most of the studies associated this phenomenon with the influence of Mandarin tones.

2.4.3 Gaps for Further Research

This section will present the research gaps found in the previous studies on Mandarin speakers’ performance on English stress. Since the current study focuses on the perception of English stress, the following discussion will be mainly based on the perception studies.

The first issue is the choice of test type. Every perception study reviewed above (Lai, 2008; Wang, 2008; Zhang & Francis, 2010; Chrabaszcz et al., 2014) used a forced-choice test to elicit the participants’ responses: they were forced to recognize

each stimulus as either an iambic or a trochaic pattern. Among the stimuli that were manipulated to vary in their acoustic parameters, there might be some stimuli that did not sound clearly iambic or trochaic to the participants. Therefore, when the participants were required to choose between the two options for every stimulus, there might have been cases where they randomly picked one because they were unable to identify the stress. To avoid this issue, in the current study, a Likert scale will be adopted to test the participants’ perception. A five-point Likert scale was used in Beckman (1986) for examining native speakers’ stress perception, with the five points “clearly trochaic,” “unclear but closer to trochaic,” “cannot tell,” “unclear but closer to iambic” and “clearly iambic.” This Likert scale will be adopted in the current study to examine Mandarin speakers’ perception of stress.

The second issue with the previous perception studies is participant grouping. Lai (2008) included two groups of Mandarin speakers: a group of beginning learners of English and a group of advanced learners, and the two groups did perform differently in stress perception. However, this between-proficiency-group difference among Mandarin speakers was ignored in other perception studies, which discussed Mandarin speakers’ performance as a whole group. The difference in performance between the two groups with two different English proficiency levels in Lai (2008) suggests that proficiency might be a crucial factor in stress perception,

but there is a lack of similar discussions in other studies. This gap brings up the necessity for further research on Mandarin speakers’ stress perception in relation to their English proficiency, which will be investigated in the current study.

The third issue is the explanation of Mandarin speakers’ performance. Most of the previous studies on production and perception (Lai, 2008; Wang, 2008; Zhang et al., 2008; Zhang & Francis, 2010) attributed Mandarin speakers’ performance to L1 influence. However, it appears that the discussion of factors other than L1 is needed. Chrabaszcz et al. (2014) found that their Mandarin speakers had the same weighting of acoustic cues in stress perception as English speakers, which contradicted their expectation that the two groups’ weighting would be different due to L1 influence. Since the explanation of Mandarin speakers’ stress perception in terms of L1 influence does not always work, there is a need to explore other factors that can explain Mandarin speakers’ stress perception. The current study chooses to discuss stress perception in relation to the ability to perceive musical patterns. Musical perception has been found to be related to linguistic perception. Section 2.5 below will review studies dealing with this relation.

2.5 The Relation between Musical Perception and L2 Perception

This section will discuss how musical perception might be related to L2

perception. The relation will be considered in two opposite directions: the influence of musical perception on linguistic perception vs. the influence of linguistic perception on musical perception. The former examines the linguistic perception of participants with different musical backgrounds, while the latter looks into the musical perception of participants with different linguistic backgrounds.

2.5.1 The Influence of Musical Perception on Linguistic Perception

Previous studies have found that musical perception might have an effect on L2 perception. Lee & Hung (2008) studied English speakers’ perception of Mandarin tones. The English speakers included musicians and non-musicians. The syllable sa was pronounced in the four Mandarin tones by 32 Mandarin speakers. Each syllable was modified into three versions: intact syllable, center-silenced syllable and onset-only syllable. The English speakers were taught about the four tones in Mandarin and then they were asked to identify the tone of each syllable they heard. It was found that the musician English speakers had higher accuracy in tone identification for the intact syllables and center-silenced syllables than the non-musician English speakers. This finding implies that musical training might strengthen perception in an L2.

Boll-Avetisyan et al. (2016) examined how French speakers’ stress perception

was related to their musical backgrounds. The study’s participants were French speakers who learned German as an L2. The participants’ musical experiences, such as the instruments they played and how long they had been playing them, were documented. In the test of their stress perception, they were asked to identify whether the synthetic CV syllable sequences they heard formed a trochaic or an iambic pattern. The performance of the French speakers who learned German as an L2 was compared to that of the French and German monolinguals in Bhatara et al. (2013), who were tested with the same stimuli. As reported by Boll-Avetisyan et al., musical experience could affect the perception of stress patterns. It was a positive predictor of German-like stress perception in the French speakers. This study showed that increased experience in music could enhance L2 learning in stress perception.

Marques et al. (2007) looked into French speakers’ perception of pitch changes in Portuguese sentences. The French speakers included musicians and non-musicians, and they did not understand Portuguese. They were required to listen to Portuguese sentences that had normal pitch, manipulated slightly high pitch or manipulated greatly high pitch on the last two syllables. The participants had to answer whether each sentence sounded normal or not. Their reaction times and ERP responses to the sentences were also measured. The sentences with manipulated high pitch were

expected to sound abnormal to the participants. The results showed that the musicians’ accuracy rate in identifying the sentences with abnormal pitch was higher than the non-musicians’. When making their judgments, the musicians’ reaction time was shorter than the non-musicians’. The higher positive ERP amplitudes induced by the sentences with greatly higher pitch occurred earlier in musicians than in non-musicians. As mentioned by the authors, the later occurrence of the ERP response was related to relatively greater difficulty for the participants. In other words, it was more difficult for the non-musicians than for the musicians to judge the sentences, so the non-musicians’ ERP responses occurred later. This study indicated that musical backgrounds could lead to higher sensitivity to pitch changes in language.

The relation between musical perception and L2 perception has been demonstrated for segments, stress patterns and general pitch patterns. The relation was viewed in the direction of “music to language,” i.e., examining the linguistic performance of participants with different musical backgrounds. The following section will review the music-language relation in the opposite direction: “language to music.”

2.5.2 The Influence of Linguistic Perception on Musical Perception

This section will present studies that worked on musical perception in participants with different language backgrounds. Previous research has discovered

that linguistic ability could have an effect on musical perception, as the following studies illustrate.

Deutsch et al. (2009) studied absolute pitch (the ability to recognize a musical note when no other note is provided as a reference) in relation to fluency in a tonal language. The participants were music students and orchestra members in California. The participants of East Asian ethnicity differed in their fluency in a tonal language, which was classified into three levels: very fluent, fairly fluent and not fluent. The participants were asked to name each musical tone they heard; the tones ranged from C3 (131 Hz) to B5 (988 Hz). The very-fluent group named the tones more accurately than the fairly-fluent group, which in turn was more accurate than the not-fluent group. This result suggests that proficiency in a tonal language might be correlated with the perception of musical tones.

In Wong et al. (2012), Cantonese speakers from Hong Kong and English speakers from Canada were recruited to take a musical test. They had to identify whether there was an incongruity in the musical pattern they heard, e.g., a pattern that was offbeat or out of key. Generally speaking, the Cantonese speakers surpassed the English speakers in detecting the musical incongruities. The most striking difference between the two groups was in identifying out-of-key patterns. The Cantonese

speakers’ superior ability in spotting musical pitch incongruities was attributed to their native language. Their pitch perception was enhanced because their native language, Cantonese, is a tonal language. This study associated the perception of musical patterns with the speakers’ native language.

Roncaglia-Denissen et al. (2013) investigated rhythmic perception in participants who were learners of a second language. The study recruited Turkish speakers who had learned German as an L2 and German speakers who had learned English as an L2. As mentioned by the authors, the rhythmic properties of German were considered closer to those of English than to those of Turkish. Both English and German are stress-timed languages, and the trochee is the preferred metrical pattern in the two languages. Turkish, however, is a syllable-timed language, and its preferred metrical pattern is the iamb. With participants that differed in the similarity between their two languages, the study was able to investigate whether learning a language with a different rhythmic system would enhance musical rhythmic perception. The participants were asked to judge whether the two rhythmic phrases they heard were the same or not. The Turkish speakers’ performance in rhythmic perception was better than the German speakers’. The authors attributed the Turkish speakers’ better performance to the dissimilarity of metrical properties between Turkish and German. Learning a second language with a different rhythmic system (German)

facilitated the Turkish speakers’ ability to perceive rhythms, which resulted in their better performance in musical rhythmic perception. On the other hand, learning a second language with a similar rhythmic system (English) did not improve the German speakers’ ability to perceive rhythms. Therefore, they were less sensitive to differences in musical rhythms. This study demonstrated the effect of second language acquisition on musical perception.

Roncaglia-Denissen et al. (2016) examined rhythmic and melodic perception in monolingual participants as well as participants who had learned a second language. The participants included Mandarin-speaking learners of English, Turkish-speaking learners of English, Dutch-speaking learners of English and Turkish monolinguals. The Turkish monolinguals had not been exposed to a second language, but they had been exposed to the compound meter of Turkish music, which was considered a kind of musical complexity. The Turkish monolinguals were compared with the non-monolinguals to examine which kind of exposure (L2 or musical complexity) could facilitate musical perception. In the melodic aptitude test, the participants were required to identify whether the melodic phrases they heard were the same or not. In the rhythmic aptitude test, they had to identify the rhythmic phrases played to them as the same or different. The results showed that the non-monolingual groups outperformed the Turkish monolingual group in both the

melodic aptitude test and the rhythmic aptitude test. The results suggested that L2 exposure, rather than exposure to musical complexity, enhanced melodic and rhythmic perception. This study demonstrated that acquiring a second language could lead to better musical perception.

2.5.3 Summary

The relation between musical perception and L2 perception has been shown in the direction of “music to language” in Section 2.5.1 and in the direction of “language to music” in Section 2.5.2. Increased musical experience could improve L2 perception, and acquiring a second language could enhance musical perception. Though only one study presented above (Boll-Avetisyan et al., 2016) specifically concerns the strengthening effect of musical ability on stress perception, the other studies together showed that musical ability and linguistic ability could facilitate each other in general. Therefore, in the current study, Mandarin speakers’ perception of English stress will be investigated in relation to their musical perception, in order to see whether the music-language relation could explain Mandarin speakers’ performance.

Chapter Three: Research Design

This chapter presents the research design of the current study. A perception test of English lexical stress and a perception test of pure tones are presented in terms of their participants (in Section 3.1), materials (in Section 3.2), procedures (in Section 3.3) and data analysis (in Section 3.4).

3.1 Participants

For both the stress perception test and the pure tone perception test, 20 native speakers of Mandarin and 7 native speakers of English were recruited from National Taiwan Normal University (NTNU). The English speakers came from the United States. The Mandarin speakers were from the English Department at NTNU. The Mandarin speakers were further divided into two proficiency groups for analysis. The grouping was based on their TOEIC scores from the past 2.5 years.[7] Seven of them whose TOEIC scores were below 880 were assigned to the intermediate-proficiency group (Group MI). Another seven of the Mandarin speakers who scored above the threshold were assigned to the advanced-proficiency group (Group MA). Six of the Mandarin speakers were excluded from the analysis because they had no TOEIC score or had taken the test more than 2.5 years earlier.

[7] The majority of the Mandarin speakers took the TOEIC test. Only a few of them took New TOEIC, IELTS, TOEFL or GEPT. For those who took a non-TOEIC test, a conversion table provided on the website of the Center of Language Education at National Kaohsiung First University of Science and Technology was used to convert their scores to TOEIC. URL: http://cle.nkfust.edu.tw/ezfiles/123/1123/img/1825/783334492.pdf

Each participant had to pass an audiometric test at 25 dB

at 250, 500, 750, 1000, 2000 and 3000 Hz. Since the tests in this study are perception tests based on hearing, a prescreening audiometric test was necessary to ensure that the participants’ performance would not be affected by hearing loss.

3.2 Materials

The stimuli in the stress perception test were resynthesized versions of a nonsense word, prodawn. Synthesized pure tones were used in the pure tone perception test. The details of synthesis/resynthesis are presented below.

3.2.1 The Resynthesized Disyllabic Nonsense Words

Following Lai (2008), Wang (2008) and Chrabaszcz et al. (2014), a nonsense word rather than a real word was chosen because listeners’ perception of real words might be influenced by how familiar they are with the words and where they expect the stress to fall. A nonsense word, which the listeners have never heard, avoids the issues of familiarity and expectancy. The disyllabic nonsense word prodawn was chosen for the stress perception test. A native speaker of English from the United States was asked to pronounce the word in the noun form with stress on the first syllable and in the verb form with stress on the second syllable. The native speaker’s pronunciation was recorded with Praat at NTNU’s Phonetics Lab. The F0,
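As a rough illustration of what a synthesized pure tone defined by F0, duration and amplitude amounts to (Section 3.2), the following Python sketch generates a sine wave from those three parameters. It is not the procedure used in this study. The function name `pure_tone`, the sampling rate of 44,100 Hz and the example values (200 Hz, 0.30 s, peak amplitude 0.5) are hypothetical and chosen only for demonstration.

```python
import math


def pure_tone(f0_hz, duration_s, amplitude, sample_rate=44100):
    """Return the samples of a sine wave with the given F0 (Hz),
    duration (s) and peak amplitude (linear scale, 0 to 1).

    Hypothetical helper for illustration; the sampling rate is an assumption.
    """
    n_samples = int(round(duration_s * sample_rate))
    return [
        amplitude * math.sin(2.0 * math.pi * f0_hz * n / sample_rate)
        for n in range(n_samples)
    ]


# Hypothetical baseline tone and three variants, each differing from the
# baseline in exactly one acoustic dimension.
baseline = pure_tone(200.0, 0.30, 0.5)
higher_f0 = pure_tone(250.0, 0.30, 0.5)   # F0 raised
longer = pure_tone(200.0, 0.45, 0.5)      # duration lengthened
louder = pure_tone(200.0, 0.30, 0.8)      # amplitude increased
```

Changing one argument at a time, as in the variants above, yields tones that differ from the baseline in a single acoustic dimension, which is the kind of contrast a pure tone discrimination task relies on.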
