
Chapter 1 Introduction


National Chengchi University

tone with lexical access failure.

The outputs of the language-to-music mapping and the music-to-language mapping are governed by perception and production grammars, which are schematized in section 1.2.

1.2 The Proposed Model

This research proposes one model for the language-to-music mapping and another for the music-to-language mapping.

The model in (1) predicts the language-to-music mapping, in which segmental changes and rhythmic mappings are accounted for. The linguistic input maps to the linguistic output through the relevant production grammar. The linguistic output then maps to the musical input through the perception grammar. Finally, the musical input maps to the musical output through the production grammar.

(1) Language-to-music mapping: segment and rhythm

Linguistic input
   ↓ Production grammar
Linguistic output
   ↓ Perception grammar
Musical input
   ↓ Production grammar
Musical output


The model in (2) predicts the music-to-language mapping, in which tonal changes are accounted for. The musical output maps to the linguistic input through the perception grammar, and the linguistic input then maps to the linguistic output through the production grammar.

(2) Music-to-language mapping: tone

Musical output
   ↓ Perception grammar
Linguistic input
   ↓ Production grammar
Linguistic output
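The two chains in (1) and (2) can be written down as ordered lists of stages. The following is only an illustrative sketch: the names `LANGUAGE_TO_MUSIC`, `MUSIC_TO_LANGUAGE`, and `trace` are hypothetical labels introduced here, and the grammars are represented by their names rather than by actual constraint rankings.

```python
from typing import List, Tuple

# Each stage is (grammar applied, representation it yields),
# listed in the order shown in (1) and (2).
Stage = Tuple[str, str]

LANGUAGE_TO_MUSIC: List[Stage] = [
    ("production grammar", "linguistic output"),
    ("perception grammar", "musical input"),
    ("production grammar", "musical output"),
]

MUSIC_TO_LANGUAGE: List[Stage] = [
    ("perception grammar", "linguistic input"),
    ("production grammar", "linguistic output"),
]


def trace(start: str, stages: List[Stage]) -> str:
    """Spell out a mapping chain as one readable line of text."""
    parts = [start]
    for grammar, result in stages:
        parts.append(f"--[{grammar}]--> {result}")
    return " ".join(parts)


print(trace("linguistic input", LANGUAGE_TO_MUSIC))
print(trace("musical output", MUSIC_TO_LANGUAGE))
```

Written this way, the sketch makes visible that the perception grammar is the only link between the linguistic and the musical representations in both directions.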

This research offers an extensive study of the connection between language and music by positing an independent perception grammar. The independence of the perception grammar is supported by evidence from segmental changes, rhythmic correspondences, and tonal adjustments. The language-music mappings are analyzed within the framework of Optimality Theory (Prince and Smolensky 1993/2004), with separate constraints and rankings posited for the production grammar and the perception grammar.

1.3 Organization

This dissertation is composed of nine chapters. Chapter 1 introduces the core research issue, the theoretical proposal, and the organization of this dissertation.

Chapter 2 discusses previous studies on the language-music connection, perception, prosodic phonology, classic OT, and Stratal OT. Chapters 3 and 4 compare the linguistic mapping and the language-to-music mapping through Mandarin-accented English in reading and singing. Chapter 3 discusses how musical beat assignment affects segmental changes in onset clusters and how the prosodic word conditions musical beat assignment in the musical input. Chapter 4 continues the discussion of Mandarin-accented English in reading and singing, with a focus on coda clusters. Chapters 5 and 6 examine rhythmic correspondences in the composition of Mandarin children’s songs. Chapter 5 investigates the mapping from foot structure to musical structure, while Chapter 6 probes the mapping between the intonational phrase (IP) and musical structure. Chapter 7 investigates children’s perception of the musical melody in the singing of Mandarin songs. Chapter 8 continues to examine children’s perception of tones in Mandarin songs and the music-to-language mapping that produces unassociated terms. Chapter 9 concludes and raises remaining issues for further study.


Chapter 2

Literature Review

This chapter reviews current research on the music-language connection as well as the theoretical background of perception grammar, prosodic phonology, and Optimality Theory (Prince and Smolensky 1993/2004).

2.1 Music-Language Connection

Previous studies demonstrate parallels between language and music. Lindblom (1978) proposes that lengthening of final elements is found not only in speech but also in music. Lerdahl and Jackendoff (1983) observe the similarity between the metrical beat in linguistic rhythm and the musical beat in musical rhythm. Sundberg and Lindblom (1991) propose that both linguistic and musical structures are organized hierarchically and can be parsed into smaller sections. Schreuder (2006) also explores the resemblance between linguistic rhythm and musical rhythm.

Studies also show how language and music are reconciled in structure and stress mapping. Halle and Lerdahl (1993) find that when singers encounter a novel stanza for a song they know, they are consistently able to set the stanza to the song. For example, singers tend to match stressed syllables to strong positions in the music. Following Halle and Lerdahl (1993), Hayes (2005) investigates textsetting intuitions within the framework of OT, in which a constraint can be violated to satisfy a higher-ranked one. For instance, stressed syllables may be placed in weaker rhythmic positions to avoid a long lapse, that is, a long sequence of positions without a syllable.


The linguistic and metrical mappings also show various rhythmic correspondences. Selkirk (1984) proposes silent grid positions, which may correspond to pausing or syllable lengthening. Hsiao (2006, 2007) observes the silent demibeat in Taiwanese nursery rhymes and Changhua folk verse. Metrical lines with a silent demibeat in the final position are regarded as masculine lines. In the Changhua folk verse in (3), the final demibeat is silent, so the line is regarded as a masculine line.

(3)  X      x      X      x      X      x         X      x
     o      tsiao  peh    tsiao  lai    thao      tsia
     black  bird   white  bird   come   secretly  eat
     ‘Black birds and white birds come to steal food.’
     (Hsiao 2006: 10)

Hsiao (2006) also observes that immediate constituents (ICs) may share a demibeat to create masculine lines. In (4), the ICs tshut and lai share one demibeat.

(4)  X     x          X       x
     tsao  tshut-lai  khuanN
     run   DIR-DIR    look
     (Hsiao 2006: 17)

Huang (2007), on the other hand, builds a corpus and examines the alignment between prosodic structure and movement in finger rhymes. Sung (2012) also investigates structural alignment and syllable-to-beat mapping between Chinese verse lines and music. For example, the last syllable in a stanza corresponds to the longest beat.


The linguistic mora is also found to be influenced by music. Ito, Kubozono, Mester & Tanaka (2019) examine the rhythmic adaptation of batters’ names into baseball chants. Baseball fans set batters’ names to three beats (X). The base chant shows a structural mapping between rhythm and the linguistic mora. For example, the mapping principle for 3-mora names requires aligning the initial mora with the initial beat (X1), the final mora with the final beat (X3), and the medial mora with the medial beat (X2). Therefore, ba-a-su surfaces as baa-aba-a-suu instead of *baba-a-suu-uu.

For the mapping between musical pitch and linguistic tone, Wong and Diehl (2002) investigate Cantonese songs and discuss how the lyrics of a song in a tone language are understood. Wee (2007) also discusses how listeners of Mandarin songs identify the lyrics from the musical melodies. Wee (2007) proposes that listeners are able to reconstruct the lyrics when the contrast between musical heads and linguistic heads, which occupy prominent positions, is preserved. The tone-tune correspondence is also observed in Shona, a Bantu language spoken in Zimbabwe. Schellenberg (2009) finds that sung melodies in Shona correspond to the spoken melodies.

Previous studies discuss the language-music mapping from the perspective of either the perception grammar or the production grammar. The present study proposes that both grammars are involved in the connection between language and music. The segmental changes conditioned by beat duration are examined in Chapters 3-4. The alignment between prosodic structures and musical structures is examined through Mandarin children’s songs in Chapters 5-6. The pitch-tone correspondence is discussed in Chapters 7-8.


2.2 Perception Grammar

Previous research has debated whether the perception grammar is independent. Studies of loanword adaptation demonstrate that perception and production together determine the output form of loanwords. Silverman (1992) proposes that loanword adoption proceeds in two stages. The first stage is the perceptual scan, in which some, but not all, aspects of the input are detected. For example, when Cantonese speakers perceive English words, they do not perceive the English voicing contrast, which Cantonese lacks. The output of the perceptual scan becomes the input to the Operative Level. Among the detected segments, the more salient ones tend to be preserved in the output. For instance, in the English word place, /s/ is more salient than /l/, so the Cantonese output is [pheysi], in which /l/ is deleted and /s/ is preserved.

Yip (1993) follows Silverman (1992) and proposes a constraint-based analysis of Cantonese loanwords. Yip (1993) proposes that Cantonese loanwords stay close to the perceived input. At the same time, the output of Cantonese loanword phonology must conform to surface well-formedness. Faithfulness is violated to obtain minimally bisyllabic outputs and to preserve highly salient segments through vowel insertion.

Kenstowicz (2003) reviews Gbeto’s (1999) study of loanwords in Fon and proposes separate constraint rankings for the perception grammar and the production grammar of French loanwords in Fon. In perception, word-final stops preceded by obstruents are diminished, so deletion or epenthesis takes place. This motivates a separate perception mapping in loanword adaptation. Take post for example. The constraint ranking for perception is Dep-V >> *stop/obstruent # >> Max-C, which selects pos as the optimal output. The constraint ranking for the production grammar, on the other hand, is Max-C >> *stop/obstruent # >> Dep-V, which selects posu as the output.

Boersma (2001) argues that the production of a word involves not only the perception and production grammars but also the recognition grammar. Boersma (2001) proposes a grammar model that illustrates how the perception, production, and recognition grammars operate.

(5) The grammar model of functional phonology

(Boersma 2001:24)

The left side of the figure in (5) shows that listeners perceive another speaker’s utterance and store it as the underlying form after lexical recognition. This side comprises the comprehension grammar, which contains both the perception grammar and the recognition grammar. The right side of the figure shows how listeners produce the sound they perceive.

In contrast, Smolensky (1996) proposes a single grammar that serves both production and comprehension. Boersma (2001) argues against this proposal, pointing out that with a single grammar only the maximally faithful candidate can win in comprehension, which is not always the correct result.


(6) *VOICEDCODA >> MAXVOI

    |rɑd| ‘wheel’      *VOICEDCODA    MAXVOI
    a.  [rɑd]          *!
  ☞ b.  [rɑt]                         *

The coda of candidate (6a) is voiced, so the candidate is eliminated by *VOICEDCODA. The production grammar chooses [rɑt] as the optimal output at the expense of MAXVOI, which requires that an underlying voicing feature have an output correspondent.

(7) *VOICEDCODA >> MAXVOI

    [rɑt]                   *VOICEDCODA    MAXVOI
  ☞ a.  |rɑt| ‘rat’
    b.  |rɑd| ‘wheel’                      *!

The constraints in (7) evaluate the surface form in the top left cell, so neither candidate violates *VOICEDCODA. The ranking in (7), which is identical to that in (6), will always choose |rɑt| as the listener’s underlying form, even though the speaker may mean ‘wheel.’ Boersma (2001) solves this problem by capturing the phonology-semantics interaction. Given that words with lower frequency are less likely to be recognized, he proposes the constraint *LEX, which evaluates the underlying form and can rule out words with lower frequency.

(8) *LEX(|rɑd| ‘wheel’) >> *VOICEDCODA >> MAXVOI >> *LEX(|rɑt| ‘rat’)

    [rɑt]                   *LEX(|rɑd| ‘wheel’)   *VOICEDCODA   MAXVOI   *LEX(|rɑt| ‘rat’)
  ☞ a.  |rɑt| ‘rat’                                                      *
    b.  |rɑd| ‘wheel’       *!                                  *

As in (8), *LEX(|rɑd| ‘wheel’) can be lowered during acquisition.
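The evaluation in tableaux (6)-(8) can be replayed mechanically. The sketch below is a minimal illustration, not Boersma's implementation: the function name `eval_ot` and the short labels `*LEX(wheel)` and `*LEX(rat)` are introduced here for convenience, and the violation marks are copied from the tableaux above. A candidate wins if its violation profile, read in ranking order, is the smallest.

```python
from typing import Dict, List

# candidate -> constraint -> number of violation marks
Violations = Dict[str, Dict[str, int]]


def eval_ot(ranking: List[str], violations: Violations) -> str:
    """Return the candidate with the lexicographically smallest profile."""
    def profile(candidate: str):
        # Marks listed from the highest-ranked constraint to the lowest.
        return tuple(violations[candidate].get(c, 0) for c in ranking)
    return min(violations, key=profile)


# Tableau (6): production of |rɑd| 'wheel'.
ranking_6_7 = ["*VOICEDCODA", "MAXVOI"]
tableau_6 = {"[rɑd]": {"*VOICEDCODA": 1}, "[rɑt]": {"MAXVOI": 1}}

# Tableau (7): comprehension of [rɑt] with the same ranking.
tableau_7 = {"|rɑt| 'rat'": {}, "|rɑd| 'wheel'": {"MAXVOI": 1}}

# Tableau (8): comprehension of [rɑt] with the two *LEX constraints added.
ranking_8 = ["*LEX(wheel)", "*VOICEDCODA", "MAXVOI", "*LEX(rat)"]
tableau_8 = {
    "|rɑt| 'rat'": {"*LEX(rat)": 1},
    "|rɑd| 'wheel'": {"*LEX(wheel)": 1, "MAXVOI": 1},
}

print(eval_ot(ranking_6_7, tableau_6))
print(eval_ot(ranking_6_7, tableau_7))
print(eval_ot(ranking_8, tableau_8))
```

Comparing tuples of marks relies on strict domination: a single mark on a higher-ranked constraint outweighs any number of marks on lower-ranked ones.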

This study also proposes that perception and production are separate grammars. The evidence comes from the music-to-language mapping and the language-to-music mapping. For example, when perceiving song lyrics, children faithfully map the musical pitch onto the linguistic tone. When producing the lyrics, however, their output is influenced by lexical association and surface well-formedness constraints. While lexical recognition in Boersma (2001) takes place in the recognition grammar, this study proposes that lexical association takes place in the production grammar. Further discussion is given in Chapters 7-8.

2.3 Prosodic Phonology

2.3.1 The Prosodic Hierarchy

The prosodic hierarchy is proposed by Selkirk (1980), Nespor and Vogel (1986), and Inkelas (1989), among others. The prosodic hierarchy divides phonological structures into smaller constituents, as in (9).

(9) Prosodic Hierarchy


The prosodic hierarchy is subject to the strict layering hypothesis (Selkirk 1984; Nespor and Vogel 1986).

(10) Strict layering hypothesis

     (a) *skipping     (b) *inverting     (c) *recursive
         IP                Ph                 IP
         |                 |                  |
         Wd                IP                 IP

(10a) violates *skipping, since IP directly dominates Wd and skips the Ph level. (10b) violates *inverting, since Ph, which belongs to a lower level than IP, dominates IP. (10c) violates *recursive, which prohibits a prosodic constituent from dominating a constituent of its own category.
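The three ill-formed configurations in (10) can be told apart by comparing the levels of a dominating node and the node it dominates. The following is a minimal sketch that uses only the three levels named in (10); the function name `layering_violation` and the list `LEVELS` are hypothetical labels introduced for illustration.

```python
from typing import Optional

# Levels mentioned in (10), ordered from highest to lowest.
LEVELS = ["IP", "Ph", "Wd"]


def layering_violation(parent: str, child: str) -> Optional[str]:
    """Name the strict-layering violation for one dominance relation, if any."""
    p, c = LEVELS.index(parent), LEVELS.index(child)
    if c == p:
        return "*recursive"   # a constituent dominates its own category
    if c < p:
        return "*inverting"   # a lower level dominates a higher one
    if c > p + 1:
        return "*skipping"    # an intermediate level is skipped
    return None               # the child is exactly one level down


print(layering_violation("IP", "Wd"))   # configuration (10a)
print(layering_violation("Ph", "IP"))   # configuration (10b)
print(layering_violation("IP", "IP"))   # configuration (10c)
```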

The present research discusses how three prosodic constituents, namely the IP, the foot, and the prosodic word, are aligned with musical beats and structures.

2.3.2 Intonational Phrase

Nespor and Vogel (1986) presume that the entire sentence is a single intonational phrase (IP), which can be restructured for physiological reasons or for ease of language processing. Nespor and Vogel (1986) also propose that a line that is too long is not preferred, so IP can be broken down into shorter ones, as in (11b-c).

(11)

(a) (My friend’s baby hamster always looks for food in the corners of its cage)IP

(b) (My friend’s baby hamster)IP (always looks for food in the corners of its cage)IP

(c) (My friend’s baby hamster)IP (always looks for food)IP (in the corners of its cage)IP

(Nespor and Vogel 1986:194)


The domain of an intonational phrase (IP) can be syntactically or semantically defined. Some studies indicate that IPs are formed based on the syntactic configurations (Downing 1970, 1973; Bing 1979). Halliday (1967) and Selkirk (1984) recognize that IPs are semantically based. Consider the Sense Unit Condition proposed by Selkirk (1984) in (12).

(12) Sense Unit Condition

Two constituents Ci, Cj form a sense unit if (a) or (b) is true of the semantic interpretation of the sentence:

(a) Ci modifies Cj (a head),

(b) Ci is an argument of Cj (a head).

(Selkirk 1984:291)

A sense unit is formed by the modifier-head relation or the head-argument relation between syntactic constituents. Consider the example in (13).

(13)

(a) ‘Give you a big apple.’

    [Syntactic tree: VP branching into V' and NP1, with an IP label]

    song  ni   da   ping  guo
    送    你   大   蘋    果
    give  2SG  big  apple

unit. At the lower level of the syntactic tree, ni ‘you’ is the indirect object of the V song ‘give’, and the AP da ‘big’ is a modifier of the N ping guo ‘apple’. Therefore, song ni and da ping guo can each be parsed into an intonational phrase, as in (13b).

The prosodic pause, which is among the factors that contribute to the parsing of intonational phrase, is illustrated in (14).

(14)

The grid exhibits units of conceived time, which Selkirk (1984) terms demibeats. As shown in (14), x-symbols stand for silent demibeat positions, which are regarded as pauses when they are unaligned. Syllable lengthening, on the other hand, is represented by aligning x-symbols with syllables. Selkirk (1984) indicates that pausing and syllable lengthening are in fact the same phenomenon.


The silent demibeat addition rule is proposed by Selkirk (1984), as in (15).

(15) Silent demibeat addition

Add a silent demibeat at the end (right extreme) of the metrical grid aligned with

(a) a word,

(b) a word that is the head of a nonadjunct constituent,

(c) a phrase,

(d) a daughter phrase of S.

(Selkirk 1984)

Given that neither the sense unit nor the prosodic pause can by itself define the intonational phrase, Hsiao (1995) provides the parameter of the intonational phrase in (16).

(16) I-Parameter

I = < γ ]。, SU >

where γ = boundary tone, 。 = pause, ] = right edge, SU = sense unit

The parameter states that an intonational phrase is a sense unit that ends in a boundary tone followed by a pause. Hsiao (1995) indicates that a silent beat is obligatorily added to the end of an IP at normal tempo.

In Chapters 5-6 of the present study, the composer composes the musical melody based on given lyrics. The intonational phrase boundaries are thus defined by the punctuation marks provided, such as commas, periods, and exclamation marks.


2.3.3 Foot Formation Rule

Chen (1984) proposes foot formation rules to account for Mandarin verse. The rules take syntactic information into account and operate in the order given in (17):

(17) Foot formation rule

(a) Immediate Constituency (IC): Link immediate constituents into disyllabic feet.

(b) Duple Meter (DM): Scanning from left to right, string together unpaired syllables into binary feet.

(c) Triple Meter (TM): Join any leftover monosyllable to a neighboring binary foot according to the direction of syntactic branching.

(Chen 1984:223)

The rules in (17) rely mainly on ICs and the branching direction of the syntactic tree.

(18) ‘Fishermen’s nets gather under the cold pond.’

     yu-  ren  wang  ji      han   tan   xia
     漁   人   網    集      寒    潭    下
     fisherman net   gather  cold  pond  under

     (yu ren)f  wang ji    (han tan)f  xia       IC
     (yu ren)f (wang ji)f  (han tan)f  xia       DM
     (yu ren)f (wang ji)f  (han tan    xia)f     TM

As exemplified in (18), the ICs yu and ren, and han and tan, have priority in forming two feet. DM then scans from left to right and strings wang and ji into a foot. Although wang and ji branch in opposite directions, they can still be strung into one foot. Finally, TM joins xia to the neighboring foot han tan.


Shih (1986) proposes the modified foot formation rule for Mandarin common speech, as in (19).

(19) Foot formation rule

(a) Immediate Constituency (IC):

Link immediate constituents into disyllabic feet.

(b) Duple Meter (DM):

Scanning from left to right, string together unpaired syllables into binary feet, unless they branch in the opposite direction.

(c) Superfoot (f’): Join any leftover monosyllable to a neighboring binary foot according to the direction of syntactic branching.

(Shih, 1986: 110)

In Shih’s (1986) foot formation rule, DM cannot string together syllables that branch in opposite directions.

(20) ‘In the small bowl is where the fruit is placed’

     xiao   wan   li   bai   shui  guo
     小     碗    裡   擺    水    果
     small  bowl  in   put   fruit

     (xiao wan)f  li    bai   (shui guo)f      IC
                 *(li   bai)f                  DM
     (xiao wan    li)f (bai    shui guo)f      Superfoot

(20) shows that li and bai cannot form a foot by DM, since they branch in opposite directions. Therefore, xiao wan li and bai shui guo each form a superfoot.
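The difference between the rules in (17) and (19) can be made concrete with a small procedure that applies IC, DM, and the leftover-joining step in order. This is a sketch under stated assumptions rather than either author's formalization: syntactic structure is not parsed but supplied by hand, through `ic_pairs` (which adjacent syllables are ICs), `attach` (the side on which a leftover syllable joins a foot, standing in for the direction of syntactic branching), and `dm_blocked` (adjacent pairs that branch in opposite directions). All of these names, and `build_feet` itself, are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Pair = Tuple[int, int]


def build_feet(syllables: List[str], ic_pairs: List[Pair], attach: Dict[int, str],
               dm_blocked: FrozenSet[Pair] = frozenset()) -> List[List[str]]:
    """Apply IC, then DM, then leftover joining (TM or superfoot)."""
    n = len(syllables)
    foot_of: List[Optional[int]] = [None] * n   # foot index of each syllable
    feet: List[List[int]] = []

    def new_foot(i: int, j: int) -> None:
        feet.append([i, j])
        foot_of[i] = foot_of[j] = len(feet) - 1

    # Step 1 (IC): immediate constituents form disyllabic feet.
    for i, j in ic_pairs:
        new_foot(i, j)

    # Step 2 (DM): scan left to right and pair adjacent unfooted syllables.
    # With Shih's restriction, pairs listed in dm_blocked are skipped.
    i = 0
    while i < n - 1:
        if foot_of[i] is None and foot_of[i + 1] is None and (i, i + 1) not in dm_blocked:
            new_foot(i, i + 1)
            i += 2
        else:
            i += 1

    # Step 3 (TM / superfoot): a leftover syllable joins the neighboring foot
    # on the side given by attach ("L" or "R").
    for i in range(n):
        if foot_of[i] is None:
            neighbor = i - 1 if attach[i] == "L" else i + 1
            target = foot_of[neighbor]
            if target is None:
                raise ValueError(f"no foot next to syllable {i}")
            feet[target].append(i)
            foot_of[i] = target

    ordered = sorted(sorted(foot) for foot in feet)
    return [[syllables[k] for k in foot] for foot in ordered]


# Example (18), Chen (1984): DM is not restricted by branching direction.
verse = ["yu", "ren", "wang", "ji", "han", "tan", "xia"]
print(build_feet(verse, [(0, 1), (4, 5)], {6: "L"}))

# Example (20), Shih (1986): li and bai branch in opposite directions.
speech = ["xiao", "wan", "li", "bai", "shui", "guo"]
print(build_feet(speech, [(0, 1), (4, 5)], {2: "L", 3: "R"}, frozenset({(2, 3)})))
```

Leaving `dm_blocked` empty reproduces the unrestricted DM of (17), while passing the opposite-branching pairs reproduces the restriction in (19b), so the two rule sets differ in this sketch by a single parameter.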

Based on Chen (1984) and Shih (1986), Hsiao (1991) proposes a beat counting device based on the distinction between lexical syllables and functor syllables. A metrical beat is first assigned to each lexical syllable. A functor syllable is then assigned a beat in normal or slow speech, where it behaves like a lexical syllable, but is left-adjoined to the nearest beat in fast speech. Hsiao (1991) proposes the following foot formation rule on the basis of the beat counting device.

(21) Foot formation revisited

(a) Immediate Constituent Foot (ICF): Any adjacent beats which are assigned to ICs form an ICF.

(b) Adjacent Beat Foot (ABF): Any two adjacent beats which are not assigned to ICs are paired into an ABF.

(c) Jumbo Foot (JF): Any unpaired single beat is recruited by a neighboring foot to form a Jumbo Foot if the beat c-commands the adjacent beat contained in the foot.

(d) Minifoot (MF): The leftmost single beat constitutes a Minifoot iff it is followed by an intonational phrase boundary %.

(Hsiao 1991:38)

Hsiao’s (1991) foot formation rule is exemplified in (22).

(22) ‘I went toward the north.’

     wo   wang    bei    zou
     我   往      北     走
     I    toward  north  go

                  x      x       Lexical Beat
                 (bei    zou)f   ABF
     x    x                      Functor Beat
    (wo   wang)f                 ABF


As shown in (22), bei and zou are assigned lexical beats, whereas wo and wang are assigned functor beats. The lexical beats, bei and zou are paired into an ABF first. Then the functor beats, wo and wang are parsed into another ABF.
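The order of operations in (22) can be sketched as two pairing passes. The code below is a deliberately simplified illustration, not Hsiao's (1991) full rule: it assumes that no ICs are present, that the tempo is normal or slow so every functor syllable receives a beat, and that adjacency is computed over syllables. Jumbo Feet and Minifeet are not modeled, and the function name `adjacent_beat_feet` is hypothetical.

```python
from typing import List, Tuple


def adjacent_beat_feet(syllables: List[str], lexical: List[bool]) -> List[Tuple[str, str]]:
    """Pair beats into Adjacent Beat Feet (ABF), lexical beats first."""
    n = len(syllables)
    paired = [False] * n
    feet: List[Tuple[str, str]] = []

    def pair_pass(has_beat: List[bool]) -> None:
        # Scan left to right, pairing adjacent beats that are still unpaired.
        i = 0
        while i < n - 1:
            if has_beat[i] and has_beat[i + 1] and not paired[i] and not paired[i + 1]:
                feet.append((syllables[i], syllables[i + 1]))
                paired[i] = paired[i + 1] = True
                i += 2
            else:
                i += 1

    pair_pass(lexical)        # pass 1: only lexical syllables carry beats
    pair_pass([True] * n)     # pass 2: functor syllables now carry beats too
    return feet


# Example (22): wo and wang are functor syllables, bei and zou are lexical.
print(adjacent_beat_feet(["wo", "wang", "bei", "zou"], [False, False, True, True]))
```

The feet are returned in the order in which they are formed, so the lexical foot precedes the functor foot in the result even though it follows it in the string.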

The present study applies Shih’s (1986) and Hsiao’s (1991) foot formation rules to construct feet in Mandarin children’s songs, whose lyrics are similar to common speech. Take (23) for example.

(23) ‘Love to somersault when nothing to do whole day long.’

     zheng  tian    mei  shi        ai    fan   gen  tou
     整     天      沒   事         愛    翻    跟   頭
     whole-day      nothing-to-do   love  turn  somersault

     (zheng tian)f (mei shi)f       ai    fan  (gen tou)f
                                   (ai    fan)f

The ICs zheng-tian, mei-shi, and gen-tou are first parsed into feet. Then ai and fan are strung into a foot, since they branch in the same direction.

Example (24) is another example taken from the present study.

(24) ‘There is a big caterpillar in a big apple.’

First of all, ICs, ping-guo and mao-mao are strung into feet. Then you and da are strung together since their branching direction is the same. Finally, da and li are both adjoined to the foot ping-guo while chung is adjoined to the foot mao-mao.

2.4 Optimality Theory

Optimality Theory (OT) was proposed by Prince and Smolensky (1993/2004). OT regards a grammar as a set of ranked, violable constraints. Its operation consists mainly of the Generator (GEN) and the Evaluator (EVAL).

(25) Mapping of input to output in OT grammar


The figure in (25) follows Kager (1999). In (25), GEN generates an infinite set of output candidates, which are evaluated by a set of hierarchically ranked constraints (C1 >> C2 >> C3 …). Each constraint may eliminate some output candidates until only one output candidate survives.
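The stepwise elimination performed by EVAL can be sketched as a filter applied once per constraint, from the highest-ranked downward. The example is illustrative only: the candidate names, the violation counts, and the function name `eval_stepwise` are invented for the sketch, and GEN is replaced by a small hand-written candidate set because an infinite set cannot be enumerated.

```python
from typing import Dict, List, Tuple

Violations = Dict[str, Dict[str, int]]


def eval_stepwise(ranking: List[str],
                  violations: Violations) -> Tuple[List[str], List[Tuple[str, List[str]]]]:
    """Filter candidates constraint by constraint and record the survivors."""
    survivors = list(violations)
    history: List[Tuple[str, List[str]]] = []
    for constraint in ranking:
        # Keep only the candidates with the fewest marks on this constraint.
        fewest = min(violations[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors if violations[c].get(constraint, 0) == fewest]
        history.append((constraint, list(survivors)))
        if len(survivors) == 1:
            break
    return survivors, history


# Hypothetical candidate set standing in for the output of GEN.
candidates = {
    "cand_a": {"C1": 1},
    "cand_b": {"C2": 1},
    "cand_c": {"C3": 2},
    "cand_d": {"C3": 1},
}

winners, steps = eval_stepwise(["C1", "C2", "C3"], candidates)
for constraint, remaining in steps:
    print(constraint, remaining)
print("optimal:", winners)
```

Because each constraint keeps the candidates with the fewest marks rather than only those with none, a candidate can win despite violating a constraint, which is what violability amounts to.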

2.4.1 Faithfulness Constraints and Markedness Constraints

The constraints in Optimality Theory (Prince and Smolensky 1993/2004) are universal. Languages differ from one another in how the constraints are ranked. OT mainly contains two kinds of constraints: faithfulness constraints and markedness constraints.

Correspondence Theory provides the framework for defining faithfulness constraints (McCarthy and Prince 1995, 1999). The idea is that each candidate generated by GEN includes an output representation and a relation between the input and the output. The faithfulness constraints on the correspondence relation are shown in (26-28).

(26) MAX (No deletion):

Let input = $i_1 i_2 i_3 \ldots i_n$ and output = $o_1 o_2 o_3 \ldots o_m$.

Assign one violation mark for every $i_x$ if there is no $o_y$ where $i_x \,\Re\, o_y$.
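The definition of MAX in (26) amounts to counting input segments that lack an output correspondent. The following is a minimal sketch in which the correspondence relation is supplied explicitly as a set of (input index, output index) pairs; the function name `max_violations` and this encoding are assumptions made for illustration. The forms post, pos, and posu are taken from the discussion of Fon loanwords in section 2.2.

```python
from typing import List, Set, Tuple


def max_violations(input_segments: List[str], output_segments: List[str],
                   correspondence: Set[Tuple[int, int]]) -> int:
    """Count the input segments that have no correspondent in the output."""
    # Input positions linked to some existing output position.
    linked = {x for x, y in correspondence if 0 <= y < len(output_segments)}
    return sum(1 for x in range(len(input_segments)) if x not in linked)


source = list("post")
links = {(0, 0), (1, 1), (2, 2)}   # p, o, s are preserved; t has no correspondent

print(max_violations(source, list("pos"), links))
print(max_violations(source, list("posu"), links))
print(max_violations(source, list("post"), links | {(3, 3)}))
```

In this sketch the epenthetic vowel of posu does not add to the count, since MAX only inspects input segments.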