Semantic Processing in Lyrics Perception: An ERP Study


Master's Thesis
Department of English, National Taiwan Normal University

Semantic Processing in Lyrics Perception: An ERP Study

Advisor: Dr. Shiao-hui Chan
Student: Pei-Ju Chien

January 2013

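Abstract (Chinese version)

This study uses event-related potentials (ERPs) to investigate the semantic processing of song lyrics and the influence of song familiarity on that processing. Previous research on language and music indicates that listeners analyze the meaning of lyrics when they listen to songs, and therefore holds that this semantic analysis resembles the comprehension of spoken language. However, because of differing experimental aims and possible confounds introduced by the experimental manipulations, the semantic processing of lyrics calls for further discussion. The present experiment adopted a cross-modal semantic priming design: after listening to an excerpt from a familiar or unfamiliar Chinese pop song, subjects were visually presented with a target word that was semantically related or unrelated to the sentence-final prime of the excerpt, and were asked to judge the concreteness of the target noun. The results showed that targets semantically unrelated to the prime elicited a larger N400 than related targets, indicating that subjects did analyze the meaning of the lyrics while listening to the songs. In addition, when the prime paired with the target came from familiar lyrics, the N400 effect was centered on the midline and skewed toward the left hemisphere; when the prime came from unfamiliar lyrics, the N400 effect was centered on the midline and skewed toward the right hemisphere. This hemispheric difference suggests that subjects may adopt different semantic strategies to understand familiar and unfamiliar lyrics. Because subjects have a higher word expectancy for familiar lyrics, they tend to use a predictive, left-hemisphere-dominant strategy when hearing familiar songs. By contrast, subjects have a lower word expectancy for unfamiliar lyrics, and the lexical tones of Chinese lyrics become harder to identify in song under the influence of the music; hence subjects tend to use an integrative, right-hemisphere-dominant strategy when hearing unfamiliar songs.

Keywords: event-related potentials, song lyrics, semantic processing, song familiarity, N400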

Abstract

Song lyrics are a prevalent and important medium for communicating meaning in daily life, alongside written and spoken language. However, whether song lyrics are processed at a semantic level in song perception remains relatively unexplored. Earlier studies of the relationship between lyrics and tune processing have reported that lyrics are processed in the same way as spoken language, which involves semantic processing, but those results might have been confounded by methodological issues and could simply be task-induced effects. The goal of the present study is therefore to further explore lyrics processing with the ERP technique, testing whether people process lyrics semantically in song perception and whether song familiarity has an impact on such processing. A semantic priming paradigm was employed in the experiment. The subjects listened to an excerpt of a Chinese pop song and then saw a semantically related or unrelated target word presented visually on the computer screen. Their task was to decide whether the target word was concrete or abstract. The results showed that an N400 effect was elicited in the comparison of semantically related vs. unrelated targets, suggesting that the subjects were processing the meaning of the lyrics in song perception. Furthermore, the hemispheric distribution of the N400 effect differed between the familiar and the unfamiliar lyrics conditions: for familiar lyrics the effect was significant over the midline and the left hemisphere, while for unfamiliar lyrics it was significant over the midline and the right hemisphere. This hemispheric discrepancy indicates that the subjects might have used two different strategies, prediction and integration, depending on their familiarity with the songs. For the familiar lyrics, the subjects processed the lyrics predictively, holding a relatively high word expectancy. For the unfamiliar lyrics, the subjects might have used an integrative strategy because of their relatively low word expectancy and, more importantly, the obscured tonal features of lyrics in a song, which might have made it more difficult to access the meaning of words. In summary, by examining the subjects' online processing in lyrics perception, the present study suggests that people process lyrics semantically in song perception and that they employ different strategies for songs of different familiarity.

Keywords: Event-related Potentials (ERPs), Song lyrics, Semantic Processing, Familiarity, N400

Acknowledgements

It is an exciting moment for me to formally thank the people who accompanied me in writing this thesis. First of all, I would like to express my sincerest gratitude to my advisor, Prof. Shiao-hui Chan. It was Prof. Chan's neurolinguistics class that opened my eyes to the beauty of neuroscience and language. I thank Prof. Chan for always being so inspiring and patient in guiding me to conduct research and to analyze the data from a deeper perspective. Most of all, I thank Prof. Chan for her constant encouragement and heartwarming smile, which let me regain confidence after every meeting. In both academic work and life, I deeply appreciate having learned so much from Prof. Chan.

I would also like to thank my committee members, Prof. Chia-ying Lee and Prof. Yow-yu Lin, whose insightful comments helped make this thesis more complete. My appreciation also goes to the professors who taught me in the linguistics classes at NTNU (in alphabetical order): Prof. Chun-yin Doris Chen, Prof. Charles Chien-jer Lin, Prof. Hsiao-hung Iris Wu, Prof. Hsueh-o Lin, Prof. Hsi-yao Su, Prof. Hui-shan Lin, Prof. Jing-lan Joy Wu, Prof. Jen-i Li, Prof. Kwock-ping Tse, Prof. Miao-lin Hsieh, Prof. Shu-kai Hsieh and Prof. Yung-o Biq. Their professional instruction broadened my horizons across the intriguing fields of linguistics. Furthermore, I would like to thank Prof. He-ping Feng for being so encouraging and heartwarming in our every talk.

The efforts of all the subjects in my experiment are also gratefully acknowledged. It was their generous help that made this thesis possible. To keep them anonymous, I do not name them here.

Moreover, I would like to thank my classmates Abbie, Ann, Bebe, Bonnie, Gina, Katherine, Lina, Monica, Sam and Vicky for their constant encouragement. I would also like to thank Knify for her company and great sense of humor all the way from college to graduate school, and Yung-ting for her cheerful messages when we were both writing our theses. Special thanks go to my dear girls Pin-chi, Rocksoul, Yu-jung, Yu-ling and Yu-ting for their lasting understanding when I was extremely anxious and stressed.

My sincere thanks go to all the members of the Neurolinguistics Lab at NTNU, Elvis, Gracie, Jeff, Julia, Ken, Matt and Vivi, for their great help in carrying out the ERP experiment and for their valuable advice. Importantly, I am grateful for their considerate company when I was struggling with the thesis. I truly appreciate the chance to learn and work with such lovely partners in this lab.

Finally and most importantly, I owe my heartfelt gratitude to my beloved family. Millions of thanks to my mom and dad for always being supportive and confident in me, and to my sister and brother for always backing me up. Their love is no doubt the greatest source of the courage that drives me to keep going. This thesis is dedicated to my dearest family.

Table of Contents

Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter One. Introduction
  1.1 Motivation
  1.2 Research Questions
  1.3 Significance of the Study
Chapter Two. Literature Review
  2.1 Event-related Potentials (ERPs) and N400
  2.2 N400 in Visual Experiments
  2.3 N400 in Auditory Experiments
  2.4 N400 in Cross-modal Experiments
  2.5 N400 and Lyrics Processing
  2.6 N400 and Familiarity
  2.7 N400 and Language Processing Strategies
  2.8 Summary of Chapter Two
Chapter Three. Methodology
  3.1 Subjects
  3.2 Materials
  3.3 Procedure
  3.4 Behavioral and EEG recording
  3.5 Data analysis
Chapter Four. Results
  4.1 Behavioral Data
  4.2 ERP Data
Chapter Five. Discussion
Chapter Six. Conclusion
  6.1 Summary of the Current Study
  6.2 Implication and Future Direction
References
Appendix I. The summary of the character number, lyrics length, character to lyrics length ratio and cloze probability of the 120 song excerpts used in the present study. (Only the unfamiliar song excerpts were measured for cloze probability.)
Appendix II. The summary of the 120 primes, the 240 targets (related vs. unrelated) used in the experiment and the word frequency of all the targets.
Appendix III. Experimental Instruction

List of Tables

Table 1. The numbers of fast and slow song excerpts in the two experimental conditions.
Table 2. Example materials in the experiment.
Table 3. The numbers of the experimental materials in each experimental condition.
Table 4. The summary of the statistical results for the pilot tests: song lyrics materials.
Table 5. The summary of the statistical results for the pilot tests: targets.
Table 6. The summary on the four versions of trials with lyrics stimuli counterbalanced across the related and unrelated conditions.
Table 7. Behavioral data
Table 8. The summary of the follow-up tests on the RT after a two-way interaction of relatedness and concreteness.
Table 9. The summary of the follow-up tests on the error rate after a two-way interaction of relatedness and concreteness.
Table 10. The summary of the follow-up tests on the N400 mean amplitudes after a three-way interaction of relatedness, familiarity and hemisphere.

List of Figures

Figure 1. Procedure of stimuli presentation.
Figure 2. Grand averaged ERP waveforms at 9 representative channels.
Figure 3. ERP waveforms in the familiar lyrics processing at 9 representative channels.
Figure 4. ERP waveforms in the unfamiliar lyrics processing at 9 representative channels.

Chapter One. Introduction

1.1 Motivation

Language processing, whether of spoken, written or signed words, plays an important role in people's lives. For communication to succeed, semantic processing is undoubtedly essential to grasping the meanings being conveyed. When people initiate semantic processing has long been an intriguing question, and several studies have investigated semantic processing in language comprehension with the technique of event-related potentials (ERPs) (e.g. Kutas and Hillyard, 1980a, 1980b; Holcomb and Neville, 1991). Owing to its excellent temporal resolution, the ERP technique has become a useful tool for examining online processing (Luck, 2005; Osterhout, McLaughlin, & Bersick, 1997). Previous studies of both written and spoken language have found that an ERP component named N400 is related to linguistic semantic processing. However, studies of sung words (i.e. song lyrics) are relatively few, which leaves room for more investigation.

Song lyrics are a common vehicle of linguistic meaning in daily life; nevertheless, whether people semantically process lyrics in song perception, and to what extent lyrics processing is carried out, is relatively unexplored. Although several studies have investigated lyrics processing (Besson, Faïta, Peretz, Bonnel, & Requin, 1998; Bonnel, Faita, Peretz, & Besson, 2001; Gordon, Schön, Magne, Astésano, & Besson, 2010), their discussion focused mainly on the relationship between lyrics and tune processing in song perception (i.e. whether the two kinds of processing are independent of each other or integrated), and thus the question of whether lyrics are semantically processed in song perception, as in regular language, remained untouched. The present study therefore examines lyrics processing more specifically, using the ERP technique to see whether song lyrics are processed at a semantic level.

1.2 Research Questions

The research questions for the present study are twofold:

1. Do people process lyrics at a semantic level when listening to songs?
2. If they do, does song familiarity influence lyrics processing?

The first question asks whether people engage in semantic processing when listening to song lyrics, as they do when processing spoken language. The second question asks whether lyrics processing differs when people perceive familiar and unfamiliar songs.

1.3 Significance of the Study

To date, most studies have discussed the relationship between lyrics and music processing, but lyrics processing in song perception, especially with regard to semantic processing, has not been well addressed, for the following reasons. First, to test whether subjects process lyrics, investigators in past studies usually manipulated congruency (e.g. whether the target words were semantically anomalous given the contextual information), and therefore usually asked subjects to pay attention to the meaning of the lyrics. This might be a methodological confound, because the observed "semantic processing" might simply be a task-induced effect. Second, the evidence for semantic processing in lyrics perception is incomplete when an experiment tests only whether the target is (in)congruous with the context, without considering other experimental paradigms. Finally, the manner of lyrics presentation in previous studies is usually unnatural; for instance, researchers had the lyrics sung without instrumental accompaniment, which is not how people usually listen to songs.

The current study takes all of the above issues into account. To address both the methodological issue (i.e. that the observed semantic processing may be a task-induced effect) and the semantic processing issue, a semantic priming paradigm was used instead of the traditional semantic congruency paradigm: the semantic relatedness between the sentence-final word (i.e. the word at the final position of the selected line(s) of the song lyrics) and the target word was manipulated to test the semantic processing of the sentence-final word. In addition, instead of the frequently used congruency decision task, a word concreteness task was used, so that subjects were not explicitly instructed to attend to the semantic aspect of the lyrics. Finally, to address the "unnaturalness" of song presentation, the materials in the current study were excerpts of songs, which is much closer to how people generally perceive lyrics when listening to songs. With these experimental choices, this study hopes to contribute further observations to the existing research on song processing and to shed some light on semantic processing in lyrics perception.

Lastly, many people listen to music while reading or studying. Intriguingly, some claim that they are involuntarily distracted by the lyrics even when they try hard to focus on the work at hand. By examining semantic processing in lyrics perception online, this study can also provide insight into the interference that music causes in reading efficacy when the two tasks are performed concurrently.

Chapter Two. Literature Review

In chapter two, studies on the N400 component are reviewed. Section 2.1 gives an overview of event-related potentials (ERPs) and N400. Sections 2.2 and 2.3 summarize studies on N400 with visual stimuli and auditory stimuli, respectively. Section 2.4 discusses cross-modal experiments on N400. Section 2.5 reviews previous studies on N400 and lyrics processing, followed by section 2.6, which introduces previous findings on N400 and familiarity. Section 2.7 describes earlier experiments on the N400 effect and language processing strategies. Finally, section 2.8 summarizes the chapter.

2.1 Event-related Potentials (ERPs) and N400

In the early 20th century, scientists found that the electrical activity of the human brain could be measured by placing electrodes on the scalp (Adrian & Matthews, 1934; Berger, 1929; Luck, 2005). When the electrodes are connected to an amplifier, the variation in voltage over time can be recorded; this record is called the electroencephalogram, or EEG (Berger, 1929; Coles & Rugg, 1995; Luck, 2005). Because the EEG reflects many concurrent sources of brain activity, it is difficult to relate the raw signal to specific psychological processes. Hence, the small voltage changes associated with a given event are extracted by averaging the EEG across repetitions of that event, which yields the event-related potentials (ERPs). ERPs are defined as the post-synaptic activity of neurons summed in the EEG and time-locked to the stimuli within an epoch, which allows the neural response to specific events (e.g. sensory, cognitive or motor events) to be interpreted (Coles & Rugg, 1995). ERP results are described and discussed in terms of components, which are characterized by polarity, amplitude, peak latency and scalp distribution (Osterhout et al., 1997). As reported by Osterhout et al. (1997), ERPs record electrical activity millisecond by millisecond. ERP results can therefore uncover covert aspects of online processing (e.g. cognitive or linguistic processing) that might not be observable in behavioral results (Kutas & Hillyard, 1989). In addition, ERPs can be recorded without requiring subjects to make any conscious response.

Among the language-related ERP components, a few have been widely studied in the literature, such as the mismatch negativity (MMN), N400 and P600. These components are related to different kinds of processing¹ (Duncan et al., 2009; Luck, 2005). Since semantic processing is the main focus of the present study, the N400 component is introduced in detail below.

N400 is a negative-going wave, typically beginning at around 200-250 ms after stimulus onset and peaking at around 400 ms. It is usually largest over central and parietal electrode sites, and its amplitude is slightly larger over the right hemisphere than over the left (Duncan et al., 2009; Kutas & Federmeier, 2011; Luck, 2005). The N400 component was first observed in Kutas and Hillyard's (1980b) study. According to their results, N400 is sensitive to semantic anomaly, which causes semantic reprocessing upon encountering unexpected lexical information. Accordingly, the amplitude of N400 is attributed to the difficulty of contextual integration: the more difficult the processing, the larger the amplitude. Later studies further showed that word expectancy² influences the N400 effect even more strongly, and that semantic anomaly is not a necessary condition (Kutas & Hillyard, 1984a, 1984b). N400 can be elicited within word pairs (e.g. the semantic priming effect in the prime-target paradigm) (Bentin, McCarthy, & Wood, 1985; Koriat, 1981; Kutas & Hillyard, 1989), sentences, and discourses (Kutas & Federmeier, 2000, 2011). N400 has also been found in experiments using different modalities, including visual, auditory and cross-modal materials (Domalski, Smith, & Halgren, 1991). Whether elicited within a modality or across modalities, N400 shows similar timing and function. This observation lends additional support to the view that semantic knowledge is accessible from different forms of input. More details about the N400 effect reported in previous studies are summarized below.

¹ Mismatch negativity (MMN) is elicited by any discriminable change in auditory stimuli (Näätänen & Alho, 1997; Shtyrov, Kujala, Palva, Ilmoniemi, & Näätänen, 2000), and P600 is related to syntactic violations (Osterhout & Holcomb, 1992; Osterhout, Holcomb, & Swinney, 1994). N400 is related to semantic processing and is discussed in the following paragraph.

² In studies on N400, the word expectancy factor refers to word cloze probability: the proportion of people who would choose a certain word to complete a sentence.
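The averaging step described in this section can be illustrated with a small simulation. The sketch below is not the analysis pipeline of the present study: the sampling interval, noise level, trial count and the shape of the simulated negativity are all assumptions chosen for illustration. It shows that averaging many epochs time-locked to the same event suppresses the background EEG and leaves the event-related waveform, from which a mean amplitude in a chosen time window can then be computed.

```python
import math
import random

SAMPLE_MS = 10    # assumed sampling interval (100 Hz), for illustration only
N_SAMPLES = 100   # one epoch covers 0-1000 ms after stimulus onset
N_TRIALS = 200    # assumed number of time-locked epochs
NOISE_SD = 20.0   # assumed background EEG noise, in microvolts


def evoked_signal():
    """Hypothetical N400-like wave: a negative deflection peaking at 400 ms."""
    return [-5.0 * math.exp(-((t * SAMPLE_MS - 400) ** 2) / (2 * 80.0 ** 2))
            for t in range(N_SAMPLES)]


def simulate_epoch(signal, rng):
    """One epoch: the same evoked signal plus independent random noise."""
    return [v + rng.gauss(0.0, NOISE_SD) for v in signal]


def average_epochs(epochs):
    """Point-by-point mean across epochs of equal length."""
    return [sum(column) / len(epochs) for column in zip(*epochs)]


def mean_amplitude(waveform, start_ms, stop_ms):
    """Mean voltage in the window [start_ms, stop_ms)."""
    window = waveform[start_ms // SAMPLE_MS: stop_ms // SAMPLE_MS]
    return sum(window) / len(window)


def rms_error(waveform, reference):
    """Root-mean-square deviation of a waveform from a reference."""
    squared = sum((a - b) ** 2 for a, b in zip(waveform, reference))
    return math.sqrt(squared / len(reference))


rng = random.Random(0)
signal = evoked_signal()
epochs = [simulate_epoch(signal, rng) for _ in range(N_TRIALS)]
erp = average_epochs(epochs)

print("single-trial error:", round(rms_error(epochs[0], signal), 2))
print("averaged ERP error:", round(rms_error(erp, signal), 2))
print("mean amplitude 300-500 ms:", round(mean_amplitude(erp, 300, 500), 2))
```

In this simulation a single epoch is dominated by noise, whereas the average over all epochs lies much closer to the underlying signal. Because the noise is independent across trials, its contribution shrinks roughly with the square root of the number of epochs averaged.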

(20) 2.2 N400 in Visual Experiments Kutas and Hillyard (1980b) first reported the N400 effect in their study on semantic anomalies. In their experiment, they manipulated the sentence-final words to be either semantically inappropriate or physically deviant (i.e. larger size of word). For the semantically inappropriate manipulation, they further constructed two conditions based on the degree of semantic violation: moderate and strong semantic violation. The results showed that both moderate and strong semantically inappropriate target words elicited larger N400, while N400 was largest in strong semantic violation condition. This N400 effect was found to begin at about 250 ms and to peak at about 400 ms with a centro-parietal distribution. Different from N400, they found P300 elicited by the physically deviant word. Thus, they suggested that the N400 effect is the indicator of reprocessing on semantic anomaly. In a later study, Kutas and Hillyard (1980a) used both the congruency of word meaning and word size as independent variables to make comparisons on the ERP results in either condition or both. The results were consistent with their earlier observation, showing that the N400 effect would be elicited by semantic violation, while the P300 effect would be elicited by physical deviancy. Either deviation was not to alter the other deviation. Also, the pattern of N400 found in their study was similar to that in their earlier experiment by its onset, peak latency and the centro-parietal distribution. Moreover, when the target words were semantically incongruous and in larger size, both N400 and P300 were elicited. Therefore, the N400 effect 8.

(21) was considered different from the P300 effect that it is sensitive to the processing of semantic anomalies. More studies were following to further examine the N400 effect and semantic processing. Kutas and Hillyard (1984a, 1984b) designed their experiments by controlling the degree of contextual constraints and cloze probability of eliciting words. Contextual constraints refer to the degree of anticipation for an upcoming word as developed by the provided contextual information, while cloze probability refers to the percentage of a certain word people use to complete a sentence. It is noteworthy that the two variables are not mutually independent from each other since higher contextual constraints would lead to higher cloze probability. For example, in the sentence “He mailed the letter without a stamp.” (Marta Kutas & Hillyard, 1984a), it is clear that a specific word is highly expected for the underlined position based on the contextual constraints, and that most people would choose “stamp” to finish this sentence. The results illustrated that the N400 effect, largest over the parietal and the right posterior electrode sites of the scalp, was more sensitive to cloze probability than the degree of contextual constraints as the amplitude of N400 systematically declined when the cloze probability increased. This showed that subjects’ word expectancy significantly affected the way in their sentence processing, while semantic anomaly was not the necessary condition for the elicitation of N400. In order to see the influence of context constraints on word recognition, they also controlled the experimental stimuli in terms of the 9.

(22) degree of semantic relatedness to the best completion of the sentences. The results showed that the amplitude of N400 was reduced when the eliciting words were unexpected but semantically related to the best completion regardless of semantic anomalies if any. To briefly summarize their findings, it is reported that stronger contextual constraints set up stronger word expectancy (i.e. cloze probability), and this priming effect is not only for the best completion but also for the words semantically associated with the best completion (Marta Kutas & Hillyard, 1984a, 1984b).. 2.3 N400 in Auditory Experiments The studies reviewed in section 2.2 all used visual materials. There are also studies employing auditory stimuli to investigate the N400 effect and to see if there is difference in the ERP results between written and spoken language processing. For example, Holcomb and Neville (1991) conducted a study on the N400 effect by using natural speech as stimuli. In their experiments, there were three conditions: best completion, semantically related anomalies and semantically unrelated anomalies. They used 135 English sentences, and controlled the anomalous target words so that they did not share the same initial phoneme as the best completions. Subjects were required to decide if the spoken sentence stimuli made sense. The results were in agreement with previous visual studies showing the N400 effect in both related and unrelated anomalies, with a relatively larger effect in unrelated condition. The N400 effect was found to be largest over occipital, Wernicke's and posterior sites, with 10.

(23) the left hemisphere more negative than the right. It was especially noted that the ERPs difference between the best completion and the unrelated anomaly appeared quite early at around 50-100 ms, and they considered it as the early onset of N400 effect. This earlier onset of N400 effect found in auditory stimuli clearly pointed out the discrepancy in semantic processing between visual modality and auditory modality. Further, to see if there was any difference between naturally connected speech and speech with words spoken in isolation, they extended the experiment by adding a 750-ms inter-stimulus interval (ISI) as independent variable. However, there was no early onset of N400 effect in this extended experiment as in natural speech stimuli. On this absence of early onset of N400, Holcomb and Neville (1991) suggested two possibilities. For one, they thought that there was an interaction between the nonsemantic between-word cues (i.e. prosody and coarticulation) and contextual information in natural speech stimuli but not in speech spliced with intervals. This interaction enabled the information about the final words to be provided rapidly to the subjects and resulted in an earlier N400 effect accordingly. For the other, the rate of stimuli presentation was concerned that natural speech stimuli was presented at a relatively high rate. However, regarding this factor, Kutas’ (1987) visual study showed a contradictory finding in that an earlier ERP effect appeared in the experimental condition with slow stimuli presentation. Thus, Holcomb and Neville (1991) suggested that further studies were necessary to tease these factors apart. Later in Friederici et al’s (1993) study, they investigated the effects of semantic, 11.

(24) syntactic and morphosyntactic violations in natural speech processing with 160 German sentences. For semantic violation, they manipulated the experimental sentences to contain selectional restriction violation (i.e. the mismatch between the preceding noun and the sentence-final verb). A probe-verification paradigm was employed and subjects were asked to judge whether the probe words was part of the preceding sentences they just heard. As predicted, the results revealed a classical N400 pattern evoked by the semantic violation, which was broadly distributed over both hemispheres. Connolly and Phillips (1994) designed an experiment to test the relationship between Phonological Mismatch Negativity (PMN) and word expectancy developed by contextual information. For PMN, it is measured to occur at around 175-225 ms and to peak at around 25-75 ms later. In their experiment, they used 160 English sentences as auditory stimuli and constructed four conditions which have 40 sentences for each: Phoneme Mismatch-Semantic Mismatch, Phoneme Mismatch-Semantic Match, Phoneme Match-Semantic Mismatch, and Phoneme Match-Semantic Match. For semantic match, the sentence-final target word is the one with highest cloze probability for the sentence. As for the manipulation of phoneme match, the sentence-final target word shares the same initial phoneme as the word with highest cloze probability, and vice versa. The results conformed to their prediction: the elicitation of both PMN and N400 in Phoneme Mismatch-Semantic Mismatch condition, the elicitation of PMN but no N400 in Phoneme Mismatch-Semantic Match condition, N400 but 12.

no PMN in the Phoneme Match-Semantic Mismatch condition, and neither component in the Phoneme Match-Semantic Match condition. Topographically, the N400 effect was more frontally distributed and symmetric across the hemispheres, and the PMN was more evenly distributed. This frontal distribution of the N400 effect in Connolly and Phillips' (1994) experiment clearly differed from the posterior distribution found in previous visual (Kutas & Hillyard, 1980b, 1984b) and other auditory experiments (Holcomb & Neville, 1991). Connolly and Phillips (1994) argued that the discrepancy probably arose because no overt behavioral response was required of the subjects in their auditory experiment, which would make their results inconsistent with previous findings regardless of modality differences. Nevertheless, they noted that more research was needed to confirm this assumption. Connolly and Phillips (1994) also pointed out that in the Phoneme Match-Semantic Mismatch condition, the N400 effect had a delayed peak latency because the eliciting word and the expected word shared the same initial phoneme. In other words, unlike in the Phoneme Mismatch-Semantic Mismatch condition, the expected and unexpected words could not be told apart at the initial phoneme in the phoneme match condition. They thus concluded that the acoustic-phonological processing indexed by the PMN occurs at the lexical selection stage, where context begins to exert its influence. Unlike Holcomb and Neville (1991), who suggested an early onset of the N400, Connolly and Phillips (1994) argued for an early negativity, the PMN, elicited before the N400 in auditory language processing.

In a more recent study, Hagoort and Brown (2000) examined natural speech processing with 120 Dutch sentences. In their experiments, target word anomalies were placed at either sentence-final or sentence-medial positions. They found an N400 effect in both conditions, similar to that elicited by visual materials. The N400 effect was larger over posterior sites and slightly larger over the left hemisphere than over the right, which was similar to Holcomb and Neville's (1991) findings. Regarding the difference between this pattern and that of previous visual studies, Hagoort and Brown (2000) indicated two possibilities. First, there might be non-overlapping neural generators for the two input modalities, resulting in such a discrepancy. Second, they noted that the distribution of the N400 in fact varied across previous visual studies; it is therefore also possible that the left-hemisphere preponderance of the auditory N400 does not deviate from the pattern of the visual N400. Aside from the N400 effect, Hagoort and Brown (2000) also observed an early negativity (N250) in both the sentence-final and sentence-medial conditions, which had not been shown in previous visual experiments. They interpreted this early negativity as an index of lexical selection, reflecting a mismatch between the expected word forms and the actual cohort activation of the target words. They ascribed the lack of an early negativity for visual stimuli to the fact that the lexical information of the target words is available upon presentation in

visual materials. Thus, the early negativity would not normally be seen with visual stimuli. With two distinct negativities in their experiments, their study differed from previous research that showed either no early negativity (Friederici et al., 1993) or an earlier onset of the N400 effect in speech processing (Holcomb & Neville, 1991). Despite the different interpretations of the early negativity, Hagoort and Brown's (2000) finding is comparable to the PMN discussed by Connolly and Phillips (1994). Although the two negativities did not fall in the same latency range, both were considered functionally independent of the N400 effect (i.e. not merely an early onset of the N400 effect) and were placed at the stage of lexical selection. In addition, the early negativities were found only with auditory, not visual, materials. Finally, the finding of two negativities in both Connolly and Phillips' (1994) and Hagoort and Brown's (2000) studies indicates a discrepancy between the processing of auditory and visual stimuli, with the latter showing a monophasic negativity.

2.4 N400 in Cross-modal Experiments

Besides within-modality experiments, cross-modal experiments have also been designed to explore the N400 effect. Holcomb and Anderson (1993) constructed cross-modal semantic priming experiments to study the interaction of word processing between the visual and auditory modalities. There were two experiments differing in the order of

stimulus modality in their study, one with visual primes paired with auditory targets, and the other with auditory primes paired with visual targets. They manipulated the stimulus-onset asynchrony (SOA) so that the targets appeared at a 0-ms, 200-ms or 800-ms SOA. English-speaking subjects were recruited for each experiment and performed a lexical decision task, judging whether the target was a real word. The first experiment (visual primes, auditory targets) showed a large semantic priming effect across the three SOA conditions in both behavioral and ERP results: reaction times were much shorter for related than for unrelated target words, and a larger N400 was observed for unrelated targets. The N400 peaked at around 400-500 ms, was largest at more anterior sites and was followed by a late positivity (P3) at more posterior sites. This finding was, however, not replicated in the second experiment (auditory primes, visual targets), in which the behavioral semantic priming effect was still significant across the three SOA conditions, but the N400 effect was found only in the 200-ms and 800-ms SOA conditions. This N400 peaked at around 350-400 ms, slightly earlier than in the first experiment. It was widely distributed and was also followed by a late positivity (P3) at posterior sites. Overall, the two experiments showed that the N400 effect was significantly larger at posterior sites and was bilaterally symmetrical, and that the N400 effect in the first experiment was greater than in the second. With regard to the different results in the 0-ms

SOA conditions in the two experiments, two views were considered as possible accounts. The first was the “conversion view”, which holds that more time is needed to convert auditory words into a visual code, resulting in the delayed onset of the auditory/visual priming effect. The second was the “common semantic system hypothesis”, which offered two explanations. First, the time needed to access information differs between auditory and visual words: visual words require relatively less time since their information is available upon presentation. The absence of a priming effect in the 0-ms SOA condition of the auditory/visual experiment may therefore be because the subjects had processed too little information from the auditory words before they processed the visual targets. Second, attentional mechanisms might play a role, since some studies have reported an attentional bias for visual but not for auditory stimuli. When the two modalities compete in processing, auditory stimuli are thought to receive less attention. On this argument, the subjects might have paid less attention to the auditory primes in the 0-ms SOA condition, so that no significant priming effect emerged. Weighing the two hypotheses, Holcomb and Anderson (1993) indicated that the common semantic system hypothesis had more supporting evidence in the cross-modal paradigm. In other words, the semantic system is amodal, and the cross-modal semantic priming paradigm feeds it input from two modality-specific recognition systems.

Another study of cross-modal priming was conducted by Scharinger and Felder (2011). Unlike general priming paradigms, they used word fragments as primes, so that subjects heard only the initial syllables rather than the full words. In this way, the researchers aimed to investigate whether form-related and meaning-related priming differ. Three kinds of prime fragments were designed with regard to their relatedness to the targets: semantically related, phonologically related, and both semantically and phonologically unrelated (incongruent). Three hundred German disyllabic words were selected as experimental stimuli. The primes were presented aurally and the targets visually with a 500-ms ISI, and the subjects were asked to decide whether the target words were real German nouns. The ERP results showed an early negativity (N180) in both the phonologically related and the semantically related conditions but not in the incongruent one. This negativity was taken as an index of pre-activation of the targets due to the semantic context effect. In addition, they found a P350 effect in the phonologically related condition. According to Scharinger and Felder's (2011) review of previous studies, this positive component is distinctive of fragment priming experiments and reflects lexical access and lexical selection. For the N400, surprisingly, the results were rather inconsistent with previous studies in terms of the priming effect. The N400 effect elicited in the semantically related condition was larger than that in either the phonologically related or the incongruent condition. On the other hand, the N400 effect was larger in the incongruent condition than in the phonologically related condition,

which conformed to a typical N400 pattern. Scharinger and Felder (2011) offered several explanations for this N400 pattern. First, the phonologically related primes might have led the subjects to develop a phonological expectancy toward the targets, producing a mismatch in the semantically related and incongruent conditions that elicited a larger N400. Second, a repetition effect on the N400 component might have been at work, since the experimental design had the subjects encounter the semantically related pairs twice. Last, they considered it possible that only fully presented stimuli can elicit the N400, in which case the absence of the traditional N400 pattern in their study could be attributed to the fragment priming paradigm. However, they also suggested that further studies were needed to confirm this claim. In summary, with a fragment priming design and the elicitation of the N180, P350 and N400 components, Scharinger and Felder's (2011) study suggested that form-based and meaning-based processing probably involve separate representations.

Although the central issues addressed by the two cross-modal studies discussed above are not identical, both concern semantic processing and involve discussion of the N400 effect. Thus, the N400 can be elicited not only in within-modal but also in cross-modal experiments.

2.5 N400 and Lyrics Processing

Besson et al. (1998) studied the relationship between song lyrics and tune processing with the ERP technique to see whether the two kinds of processing were separate or

integrated. They used 200 brief excerpts from the best-known French operatic songs and manipulated the word at the lyric-final position to be semantically anomalous, harmonically anomalous (i.e. sung out of key) or both. Four experimental conditions were thus constructed: semantically congruous and sung in key, semantically incongruous and sung in key, semantically congruous and sung out of key, and semantically incongruous and sung out of key. The subjects, all of whom were musicians, were asked to attend to the stimuli and to detect semantic and harmonic incongruities. The ERP recordings showed a widespread N400 effect elicited by semantic anomalies and a parietally distributed P300 effect elicited by harmonic anomalies. In the condition that was both semantically incongruous and sung out of key, both N400 and P300 were observed, though with smaller amplitudes than in the conditions with only one type of anomaly. This finding had two implications. First, lyrics processing elicits an N400 in the same manner as speech processing, suggesting that the meaning of lyrics is not affected by the musical structure imposed on them. Second, since different components were elicited, lyrics and tune processing were considered to be independent of each other.

In an extended behavioral study, Bonnel et al. (2001) investigated the same issue with a different methodology. Using the same corpus of French operatic song materials, they designed two tasks, a single task and a dual task, each involving four conditions: correct version, semantic anomaly, tune anomaly and both types of anomaly. In the single task,

subjects were divided into two groups and required to detect anomalies in either the language or the music dimension. In the dual task, all subjects had to attend to both the language and the music dimension and judge whether an anomaly of each type was present. With this methodology, the researchers could see how subjects' attention was affected and distributed in the single task and the dual task. Lyrics and tunes were considered to be processed independently, since no deficit in subjects' performance was observed in the dual task. In other words, subjects were able to divide their attention between lyrics and tunes and to perceive the two dimensions separately. Moreover, a comparison of singers, instrumentalists and non-musicians showed that the findings held regardless of musical expertise. Thus, although ERPs were not measured in this study, the behavioral results supported Besson et al.'s (1998) finding of independent lyrics processing.

Another study examining the processing of tunes and lyrics was carried out by Gordon and her colleagues (Gordon et al., 2010). Taking a different position, they argued that sung words (i.e. lyrics) are processed interactively with melodies in songs. Employing a prime-target word pair paradigm, they designed four experimental conditions based on a same-different task with trisyllabic French words: same word-same melody, same word-different melody, different word-same melody and different word-different melody. The subjects, who were non-musicians, were instructed to decide whether the pairs of

words or melodies were the same by performing either the linguistic or the musical task. As expected, a larger N400 was found when the prime and target words were different. To their surprise, a similar N400 effect was also observed when the melodies of the prime-target pair differed; it had a similar onset latency but a smaller amplitude than the classical N400. In addition, the N400 found in the musical task was followed by a late positivity. Topographically, the N400 effect was larger over centro-parietal sites in the linguistic task and over parietal sites in the musical task, with a slight right-hemisphere predominance in both tasks. Gordon et al. (2010) attributed the N400 effect found in the musical task to the automatic processing of sung word meaning regardless of the direction of attention. Given the N400 effect in the musical task, they asserted that lyrics and melody processing interact. This finding, though in contrast to previous studies (Besson et al., 1998; Bonnel et al., 2001), can still be taken as one more piece of evidence for the similarity between the processing of spoken language and of lyrics.

To sum up briefly, all three studies above addressed lyrics processing. Although they reached different conclusions about the relationship between lyrics processing and tune processing, their results all suggested that lyrics are processed like speech, eliciting an N400 when a semantic anomaly or a semantically unexpected word appears. However, the paradigms used in these studies might be a

confounding factor in addressing lyrics processing, because they all directed subjects' attention to one specific task (e.g. a linguistic or musical task), which might have led subjects to attend intentionally to the meaning of the lyrics while performing the task. The observed semantic processing of lyrics might therefore be task-specific, and whether people process lyrics semantically in ordinary song perception is still unknown. Moreover, the lyrics stimuli in these studies were sung a cappella (i.e. without instrumental accompaniment). Though this manner of presentation lets listeners recognize the auditory stimuli as songs, it is not as “authentic” as the way people generally perceive lyrics when listening to songs. To address these two possible confounds (i.e. paradigm and stimulus presentation), the current study adopted a different experimental design, employing a semantic priming paradigm and presenting lyrics naturally as excerpts of songs, as will be discussed in Chapter 3.

2.6 N400 and Familiarity

To respond to the second research question of the present study, "Would song familiarity be influential in lyrics processing?", this section reviews findings on the N400 and familiarity in the literature. Two studies have examined the processing of familiar materials. Marton and Szirtes (1988) used 156 familiar Hungarian proverbs and manipulated word congruency at the proverb-final position. They examined the participants' processing of

the proverbs by studying saccade-related potentials (SRPs) [3], and they found a classical N400 pattern elicited by the incorrect proverb-final words. Moreover, they observed an early positivity (P220) in both the correct and incorrect conditions, suggesting that this early positivity serves as an index of lexical access to the target word. In a similar vein, Liu et al. (2011) used 240 familiar Chinese poems of the Tang Dynasty in their ERP experiment. The incorrect target words were manipulated in two conditions: homophonic and synonymous. The results showed that a larger N400 was elicited in the homophonic condition due to semantic violation, distributed over centro-parietal sites with no hemispheric difference. This N400 component was not elicited by synonymous words, since there was no semantic violation in that condition. Aside from the expected N400 effect, they also found a late positivity in both incorrect conditions, with a larger amplitude in the synonymous condition. Liu et al. (2011) interpreted this late positive shift as an index of reanalysis when there was a strong conflict between expected and unexpected stimuli.

[3] Studies on saccade-related brain potentials (SRPs) examine the neural responses elicited by stimuli that subjects must make a saccade to perceive (Marton, Szirtes, & Breuer, 1985). The components under investigation are observed from the saccade onset and can be described in terms of their polarity and latency. Some experiments have shown that similar factors influence the late components in SRPs and ERPs, such as the frequency of stimuli (Marton, Szirtes, & Breuer, 1984) or the complexity of word categorization (Marton, Szirtes, Donauer, & Breuer, 1985).

To sum up, both Marton and Szirtes' (1988) and Liu et al.'s (2011) studies, which employed familiar materials, showed a classical N400 effect as in previous visual experiments. Hence, it was suggested that when the subjects were familiar with the materials

to a great extent, they would still conduct semantic processing on the encountered stimuli.

2.7 N400 and Language Processing Strategies

After reviewing the N400 literature on lyrics processing and familiar materials, this section further looks into how the N400 is related to language processing strategies. In the literature, two language processing strategies, prediction and integration, have been investigated in ERP studies with regard to their mechanisms and their relationship with the two hemispheres (Federmeier, 2007; Federmeier & Kutas, 1999; Federmeier, Wlotko, De Ochoa-Dewald, & Kutas, 2007). On the “prediction” account, the left hemisphere preactivates the semantic features of the target that best fits the context. On the “integration” account, by contrast, the right hemisphere compares the semantic features of the target directly with the other semantic features in the context. Federmeier and Kutas (1999) investigated hemispheric differences and the two processing strategies by presenting the stimuli in one of the visual half-fields (i.e. the left or the right visual field) or in both. Presenting stimuli to a single half-field made it possible to see how the two hemispheres might work differently in language processing. In their experiment, which used pairs of sentences as stimuli, three types of targets were constructed in the second sentence: expected, within-category unexpected and between-category unexpected. The ERP results showed that both types of unexpected targets

elicited the N400 effect when the stimuli were presented to both visual fields (i.e. at the center of the computer screen), with the between-category unexpected targets eliciting a larger effect. However, when visual field was manipulated, the patterns of the N400 effect in the two hemispheres were not identical. With stimuli presented to the subjects' right visual field (i.e. processed by the left hemisphere), both types of unexpected targets elicited larger N400 amplitudes than the expected targets, and the between-category unexpected targets elicited a larger effect than the within-category unexpected targets. By contrast, with stimuli presented to the left visual field (i.e. processed by the right hemisphere), both types of unexpected targets elicited a larger N400 effect than the expected targets, but the two types did not differ from each other. On the basis of these hemisphere-specific N400 patterns, Federmeier and Kutas (1999) concluded that the left hemisphere is for prediction and the right hemisphere is for integration in language processing. In a later auditory experiment using the same materials, Federmeier et al. (2002) found an N400 pattern similar to that of the earlier visual experiment, replicating the results regarding the dominant hemisphere for each of the two language processing strategies.

To sum up, earlier studies have suggested two types of strategies in language processing: prediction and integration. As illustrated in

some experiments investigating the relationship between the hemispheres and the two language processing strategies, the left hemisphere is related to the predictive strategy, which operates in a more top-down fashion. The right hemisphere, by contrast, is related to the integrative strategy, showing a bottom-up tendency in language processing.

2.8 Summary of Chapter Two

This chapter introduces the background of ERPs and the N400 and discusses studies on the N400 and linguistic semantic processing with visual, auditory and cross-modal stimuli, together with reviews of the N400 in lyrics processing and in the processing of familiar materials. To begin with, section 2.1 presents general information about the ERP technique, the trend of using this technique to examine online processing, and the characteristics of the N400 component. Section 2.2 summarizes the results of visual experiments, showing that the N400 effect is influenced by several semantic factors, such as semantic anomaly, contextual constraint, cloze probability and semantic relatedness. Section 2.3 indicates that the results of auditory experiments are consistent with those of visual experiments, except that the N400 effect in spoken language processing appears earlier and was larger over the left hemisphere in some studies. In section 2.4, the studies show that the N400 is also elicited in cross-modal experiments, with peak latency and scalp distribution similar to those in within-modal experiments, although a few differences are found when the order of stimulus modality (i.e. auditory-visual pair vs.

visual-auditory pair) differs. In section 2.5, the review of studies on the relationship between tune and lyrics processing reveals that lyrics are processed in the same way as spoken language, as shown by the N400 effect elicited by semantic violations or unexpected words in lyrics. Some methodological limitations of these studies are discussed as well. Section 2.6 addresses the processing of familiar materials by reviewing studies that used familiar experimental stimuli. The typical N400 pattern observed in these studies indicates that semantic processing is still carried out even when the subjects are highly familiar with the materials. Finally, section 2.7 reports the N400 effect in terms of the two language processing strategies, prediction and integration, suggesting that the left hemisphere is responsible for predictive processing, while the right hemisphere is involved in integrative processing.

Chapter Three. Methodology

Chapter Three introduces the experimental design of the current study. Section 3.1 describes the subjects participating in the experiment. Sections 3.2 and 3.3 summarize the experimental materials and the procedure respectively. The settings for behavioral and EEG recording are reported in section 3.4. Finally, section 3.5 describes the procedures of data analysis.

3.1 Subjects

Twenty-three native speakers of Chinese (20 to 32 years old, mean age 23, 14 females) were recruited for the experiment. None of the subjects had musical training, but all reported a habit of listening to Chinese pop music. They were all right-handed according to a simplified version of the Edinburgh Handedness Inventory (Oldfield, 1971). None of the subjects had any known hearing, reading or neurological problems, nor had they taken any medication before the experiment. Informed consent was obtained prior to the formal experiment. All subjects were paid upon finishing the experiment.

3.2 Materials

One hundred and twenty excerpts of Chinese pop song lyrics were used in this

experiment and were divided into two conditions, familiar and unfamiliar songs. Familiar songs were collected from the yearly charts compiled by KKBOX [4], the biggest and most used online music database in Taiwan. The ranking of the charts was based on how many times the songs were listened to by the online members; the charts were therefore assumed to represent the songs most familiar to the general public. Unfamiliar songs were chosen from non-mainstream Chinese albums released in Taiwan, which were presumed to be relatively unknown to most listeners.

[4] People who join KKBOX as members can listen to the songs collected in this database. In each yearly chart, the songs most listened to by KKBOX members in that year are ranked from 1 to 100, except that only 10 songs were ranked in 2007 and 20 songs in 2008. KKBOX has different charts categorized by language (e.g. Chinese, English, Taiwanese, Korean and Japanese songs) or by music genre (e.g. jazz, rock 'n' roll, electronic and classical music). In this study, the five yearly charts from 2007 to 2011 were used for song selection.

For the lyrics stimuli, the first one or two lines of the chorus of a familiar song that conveyed a complete meaning and ended with a noun were extracted, since these lines are usually the ones most familiar to listeners. Lyrics for the unfamiliar song condition were extracted in the same manner. The chosen lines were limited to 25 characters to control the length of the lyrics stimuli while still providing enough contextual information for processing. Take the song “Not That Easy (沒那麼簡單)” by the female singer Xiao-Hu Huang (黃小琥) as an example. The first two lines of its chorus, which fall within 25 characters and make complete sense, are:

相愛沒有那麼容易，每個人有他的脾氣。
(It is not that easy to love each other because everyone has his/her characteristics.)

Based on the aforementioned rationale for lyrics selection, 127 excerpts of familiar songs and 129 excerpts of unfamiliar songs were first chosen. A pilot test on song familiarity was conducted to verify the degree of familiarity in both conditions. The 256 song excerpts were distributed across four questionnaires [5] for subjects to rate. Each questionnaire was rated by 30 Chinese-speaking subjects who reported a habit of listening to Chinese pop music. The subjects indicated their familiarity with each song excerpt on a 5-point scale (5: most familiar, 1: least familiar). Statistical analysis showed a significant difference in familiarity between the chosen familiar and unfamiliar songs (t(126)=13.62, p<.001).

[5] There were 64 song excerpts in three questionnaires and 63 song excerpts in one questionnaire. All the songs had been randomized before they were presented to the subjects.

In addition to the degree of familiarity, cloze probability was also taken into consideration. The cloze probability of the sentence-final words in familiar songs was assumed to be high, i.e. people would commonly use the same words to finish the lyrics since they were familiar with the songs. The cloze probability of the sentence-final words in unfamiliar songs was therefore measured to see whether it could be matched to the high

cloze probability in familiar songs. For this purpose, a pilot questionnaire on the cloze probability of unfamiliar lyrics was designed with 194 excerpts of unfamiliar lyrics [6]. To reduce the subjects' load in answering the questionnaire, the 194 unfamiliar lyrics were randomized and divided equally into two questionnaires with the sentence-final position left blank. For each questionnaire, 20 native speakers of Chinese were recruited to fill in the sentence-final position to complete the unfamiliar lyrics. The results showed a wide range of cloze probabilities: only 3% of the lyric completions had a cloze probability above 70%, and almost half of the completions had zero cloze probability. Since it proved difficult to match cloze probability across the familiar and unfamiliar song conditions, only those unfamiliar lyrics whose completions had a cloze probability of at least 5% (i.e. at least one subject filled in the blank with the expected word) were retained as experimental materials. The overall cloze probability for the unfamiliar lyrics was 31%.

[6] Among the 194 unfamiliar lyrics, some came from the same song but from different chorus parts of that song. This way, a song would not be discarded because of prime repetition on the basis of its first chorus part alone. It also increased the variety of primes.

In addition to song familiarity and cloze probability, the number of fast and slow songs in both song conditions was controlled by calculating the ratio of the number of characters to the length of the song excerpt. In the end, a total of 120 song excerpts were selected: 60 familiar and 60 unfamiliar songs, with each familiarity category containing 35 slow (about 1 character per second) and 25 fast (about 2 characters per second) songs (see

(45) Table 1). No sentence-final words were repeated in the selected lyrics materials. No significant difference was found between the familiar and unfamiliar conditions in character number (t(59)=1.83, p=.072) or in character to lyrics length ratio (t(59)=-1.51, p=.135). (See Appendix I for the details of the 120 lyrics stimuli.)

Table 1. The numbers of fast and slow song excerpts in the two experimental conditions.

Character to lyrics length ratio              Familiar songs    Unfamiliar songs
1 character per second (i.e. slow songs)      35 excerpts       35 excerpts
2 characters per second (i.e. fast songs)     25 excerpts       25 excerpts

A semantic priming paradigm was adopted in this experiment, with the final nouns of the lyrics stimuli serving as the prime words (e.g. “脾氣” in “相愛沒有那麼容易，每個人有他的脾氣。”). For the target words, two conditions were constructed in relation to the sentence-final noun: 1) semantically related and 2) semantically unrelated. For the semantically related targets, a pilot test was carried out in which subjects were asked to write down the first two words they could think of that were most related to each prime word provided. The 120 prime words were divided into two questionnaires of 60 words each, and 20 Chinese

(46) native speakers were recruited for each questionnaire. Based on the answers given by the subjects, 120 words related to the primes were selected. Another 120 words that no subject had given as a response were constructed by the experimenter as unrelated targets. To ensure that word frequency was not a confound, the word frequencies of the 120 semantically related and 120 semantically unrelated target words were checked against the Corpus of Chinese Word Frequency (http://elearning.ling.sinica.edu.tw/CWordfreq.html) constructed by the Institute of Linguistics, Academia Sinica. Statistical analysis showed no significant difference in word frequency with regard to the factors of familiarity and relatedness (no main effect of familiarity, F(1,59)=.20, p=.657; no main effect of relatedness, F(1,59)=.006, p=.937; no familiarity x relatedness interaction, F(1,59)=1.50, p=.225). See Table 2 for example materials for the prime-target paradigm, Table 3 for the exact numbers of materials in each experimental condition, and Appendix II for the 240 target words paired with the primes and their word frequencies.

(47) Table 2. Example materials in the experiment.

Condition (Familiarity)    Lyrics                              Prime    Related target    Unrelated target
Familiar                   相愛沒有那麼容易每個人有他的脾氣    脾氣     怒氣              鋼筆
Unfamiliar                 也許後年也許永遠你依然是我的寶貝    寶貝     小孩              經費

Table 3. The numbers of the experimental materials in each experimental condition.

Condition     Prime number      Target       Target number    Trial number
Familiar      60 song lyrics    Related      30               60
                                Unrelated    30
Unfamiliar    60 song lyrics    Related      30               60
                                Unrelated    30
Total                                                         120

To ensure the difference in relatedness between the two target word conditions, another pilot test on word-pair relatedness rating was conducted (e.g. 脾氣-怒氣, 脾氣-鋼筆). The 240 prime-target word pairs were randomized and divided into two questionnaires so that the same prime did not appear twice in one questionnaire. Twenty native speakers of Chinese were recruited for each questionnaire and were asked to rate the relatedness of the

(48) word pairs on a 5-point scale (5: strongly related, 1: weakly related). A two-way ANOVA showed a main effect of relatedness between the prime-related target and prime-unrelated target pairs, F(1,59)=2323.05, p<.001. There was no main effect of familiarity, F(1,59)=2.39, p=.127, and no familiarity x relatedness interaction, F(1,59)=.62, p=.432.

In sum, the experimental materials were carefully controlled for song familiarity (familiar vs. unfamiliar), cloze probability in the unfamiliar lyrics (overall cloze probability: 31%), character number of the lyrics (i.e. length of the presented lyrics), character to lyrics length ratio (i.e. speed of the song), relatedness of the prime-target pairs, and word frequency of the targets. The materials were thus unlikely to be biased by the potential confounds listed above (e.g. song familiarity, cloze probability, and speed of the song). See Tables 4 and 5 for a summary of the statistical results for all the pilot tests.
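Two of the controls summarized above reduce to simple ratios, which the following minimal Python sketch illustrates: cloze probability as the proportion of respondents who supplied the expected sentence-final word, and song speed as characters per second. Only the 5% inclusion criterion and the 20 respondents per questionnaire come from the text; the function names, the example responses, and the handling of punctuation are hypothetical and are not taken from the thesis materials.

```python
def cloze_probability(responses, expected):
    """Proportion of respondents whose completion equals the expected word."""
    if not responses:
        raise ValueError("at least one response is required")
    hits = sum(1 for response in responses if response == expected)
    return hits / len(responses)


def passes_cloze_criterion(responses, expected, threshold=0.05):
    """Inclusion rule described in the text: cloze probability of at least 5%."""
    return cloze_probability(responses, expected) >= threshold


def characters_per_second(lyrics, duration_sec):
    """Character to lyrics length ratio, used here as a proxy for song speed.

    Assumption: whitespace and the punctuation marks "，" and "。" are not
    counted as characters.
    """
    if duration_sec <= 0:
        raise ValueError("duration must be positive")
    characters = [c for c in lyrics if not c.isspace() and c not in "，。"]
    return len(characters) / duration_sec


if __name__ == "__main__":
    # Hypothetical questionnaire: 1 of 20 respondents gives the expected word.
    responses = ["寶貝"] + ["朋友"] * 19
    print(cloze_probability(responses, "寶貝"))
    print(passes_cloze_criterion(responses, "寶貝"))
```

With 20 respondents per questionnaire, a single matching completion already yields a cloze probability of 1/20, which is why the 5% criterion is equivalent to "at least one subject filled in the blank with the expected word".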

(49) Table 4. The summary of the statistical results for the pilot tests: song lyrics materials.

Factor                              Comparison                 Mean (S.D.)       t(59)    p
Song familiarity                    Familiar vs. Unfamiliar    2.14 (2.46)       6.73     .000*
Character number of the lyrics      Familiar vs. Unfamiliar    1.167 (4.937)     1.83     .072
Character to lyrics length ratio    Familiar vs. Unfamiliar    -.100 (.511)      -1.51    .135

Note. *p<.001. S.D. is indicated in parentheses.

Table 5. The summary of the statistical results for the pilot tests: targets.

Dependent variable     Factor                       F(1,59)    p
Word frequency         Familiarity                  .20        .657
                       Relatedness                  .006       .937
                       Familiarity x Relatedness    1.50       .225
Relatedness ratings    Familiarity                  2.39       .127
                       Relatedness                  2323.05    .000*
                       Familiarity x Relatedness    .62        .432

Note. *p<.001.

Four versions of the experimental trials were constructed so that the 120 excerpts of

(50) lyrics stimuli were counterbalanced across the semantically related and semantically unrelated target conditions. This way, each subject heard each excerpt only once, with the prime word paired with either the semantically related or the unrelated target. Also, each prime had an equal chance of being paired with the semantically related or the unrelated target (see Table 6). All 120 excerpts of lyrics stimuli were edited with Audacity 2.0, an audio editor and recorder, each lasting from 5 to 18 seconds, and the target words were presented with the E-Prime 2.0 software from Psychology Software Tools, Inc.

Table 6. The summary of the four versions of trials with lyrics stimuli counterbalanced across the related and unrelated conditions.

Condition (Familiarity)    Excerpt    Version A    Version B    Version C    Version D
Familiar                   #1-30      Related      Unrelated    Related      Unrelated
Familiar                   #31-60     Unrelated    Related      Unrelated    Related
Unfamiliar                 #1-30      Related      Unrelated    Unrelated    Related
Unfamiliar                 #31-60     Unrelated    Related      Related      Unrelated

3.3 Procedure

The experiment was conducted in a sound-attenuated chamber to ensure clear perception of the auditory stimuli. The subjects were instructed to sit in front of a computer screen used for

(51) presenting stimuli, and the distance between the subjects' eyes and the computer screen was about 90 cm. In each trial, the subjects first heard a short excerpt of lyrics played through a speaker, followed by a visually presented target word on the computer screen. While listening to the auditory stimulus, the subjects were instructed to look at an asterisk ("*") on the screen to reduce eye movements. After the auditory stimulus, a target word appeared on the screen. The subjects were asked to perform a word concreteness task on the target word; i.e., they decided whether the presented word was a concrete or an abstract noun by pushing a button (1: concrete, 2: abstract) on a response box. Response hands were counterbalanced so that half of the subjects used their right hand and half used their left hand to press the response button. The subjects were reminded not to blink on seeing the target words. The ISI between the end of the auditory stimulus and the target was 450 ms. The response to the word concreteness task was self-paced, but an upper limit of 3 sec was set so that the experiment could continue automatically. After the response, a 2-sec blank interval was presented, and then a central fixation point (a plus sign, "+") appeared for 500 ms, signaling the next trial. Before the first trial, a 2-sec interval preceded the first 500-ms fixation point and auditory stimulus so that the subjects could prepare for the experiment. See Figure 1 for the procedure of stimuli presentation and Appendix III for the experimental instructions.
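The timing of a single trial described above can be laid out as a simple event schedule. The sketch below is only an illustration in Python and is not the E-Prime script used in the study. The 500-ms fixation, 450-ms ISI, 3-sec response limit, and 2-sec blank follow the text; the function name, the event labels, and the assumption that the target stays on screen until the response (or the time limit) are hypothetical.

```python
FIXATION_MS = 500         # central fixation point "+"
ISI_MS = 450              # end of auditory stimulus to target onset
RESPONSE_LIMIT_MS = 3000  # upper limit for the self-paced response
BLANK_MS = 2000           # blank interval after the response


def trial_timeline(excerpt_ms, response_ms=None):
    """Return a list of (event, onset_ms, duration_ms) tuples for one trial.

    excerpt_ms  -- length of the song excerpt (5 to 18 seconds in the study)
    response_ms -- hypothetical response time; None or a value above the
                   limit means the trial times out at RESPONSE_LIMIT_MS
    """
    if response_ms is None or response_ms > RESPONSE_LIMIT_MS:
        target_ms = RESPONSE_LIMIT_MS
    else:
        target_ms = response_ms

    events = [
        ("fixation", FIXATION_MS),
        ("excerpt", excerpt_ms),   # subjects fixate on "*" while listening
        ("isi", ISI_MS),
        ("target", target_ms),     # assumed visible until response or timeout
        ("blank", BLANK_MS),
    ]

    timeline = []
    onset = 0
    for name, duration in events:
        timeline.append((name, onset, duration))
        onset += duration
    return timeline


if __name__ == "__main__":
    # Hypothetical trial: 8-second excerpt, response after 900 ms.
    for event in trial_timeline(8000, 900):
        print(event)
```

Because ERPs are time-locked to the target word, the target onset in such a schedule (excerpt offset plus 450 ms) is the event of interest for the N400 analysis.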

(52) Figure 1. Procedure of stimuli presentation. Each trial began with a central fixation point; the subjects then heard an auditory stimulus and, after a 450-ms ISI, saw a target word. On seeing the target word, the subjects performed a self-paced word concreteness task, judging whether the target word was a concrete or an abstract noun. After the response to the word concreteness task, a 2-sec blank appeared, followed by a central fixation point indicating the upcoming trial. (This figure exemplifies the first two consecutive trials.)

Before the formal experiment, the subjects were given 9 practice trials to acquaint them with the experimental procedure. During the experiment, a sign indicating a short break appeared on the screen every 30 trials to reduce fatigue. The subjects could press any button to continue the experiment when they were ready. Excluding the breaks, the whole experiment lasted about 20 minutes. After finishing the experiment, the subjects were given a questionnaire asking whether they were familiar with the songs used in the experiment, with all the song excerpts played again.
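The counterbalancing scheme in Table 6 can be expressed as a lookup from version, familiarity condition, and excerpt number to target condition. The following is a minimal Python sketch, assuming excerpts are numbered 1-60 within each familiarity condition as in Table 6; the function and variable names are hypothetical.

```python
# Target condition assigned to excerpts #1-30 in each version (Table 6).
# Excerpts #31-60 receive the opposite condition.
FIRST_HALF = {
    "familiar": {"A": "related", "B": "unrelated", "C": "related", "D": "unrelated"},
    "unfamiliar": {"A": "related", "B": "unrelated", "C": "unrelated", "D": "related"},
}


def target_condition(version, familiarity, excerpt_no):
    """Return "related" or "unrelated" for one excerpt in one version."""
    if not 1 <= excerpt_no <= 60:
        raise ValueError("excerpt_no must be between 1 and 60")
    first = FIRST_HALF[familiarity][version]
    if excerpt_no <= 30:
        return first
    return "unrelated" if first == "related" else "related"


if __name__ == "__main__":
    # Each version should contain 30 related and 30 unrelated trials
    # per familiarity condition.
    for version in "ABCD":
        for familiarity in ("familiar", "unfamiliar"):
            related = sum(
                target_condition(version, familiarity, n) == "related"
                for n in range(1, 61)
            )
            print(version, familiarity, related, 60 - related)
```

Across the four versions, every excerpt is paired with the related target in exactly two versions and with the unrelated target in the other two, which matches the equal-chance pairing described for Table 6.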