CHAPTER 1 INTRODUCTION
國立政治大學 (National Chengchi University)
influenced by the requirement of a specific tone. Under the derivational tradition, this interaction entails serial operations of the metrical system and the tonal system. The question is then how the bi-directional interaction between tone and stress can be categorized on a non-derivational basis. For this purpose, this thesis adopts the perspective of Optimality Theory (abbreviated as OT, Prince & Smolensky 1993/2004), which requires that all output candidates be evaluated in parallel. In the present analysis, I will posit a set of metrical constraints and a set of tonal constraints to capture the bi-directional interaction, by which the effect of the conditions in (2) is also attained.
1.4 Thesis organization
This thesis consists of five chapters. The first chapter presents the motivation and research issues behind the current study, and lays out the research questions with a sketch of the major proposal. Chapter 2 reviews the relevant theoretical frameworks and looks at the previous analyses of Shanghai tone sandhi. Chapters 3 and 4 present the analyses of TSC. Chapter 3 discusses the word-medial stress in TSYI by means of comparison with the word-initial stress in TSS. Chapter 4 addresses the long-distance tone movement and the concomitant word-final stress in TSYA, where a minor process of contour extension is also discussed. Chapter 5 provides the concluding remarks.
CHAPTER 2
THEORETICAL AND TONAL BACKGROUND
In this chapter we first review the relevant theoretical background, including Optimality Theory (section 2.1), metrical phonology (section 2.2), and positional prominence within the field of tone mapping (section 2.4). The review then turns to the previous analyses of Shanghai tonal phonology (section 2.5), where both Autosegmental approaches and Optimality-theoretic approaches will be scrutinized.
2.1 Optimality Theory
The parts of this section are organized around several components of Optimality Theory (henceforth OT, Prince & Smolensky 1993/2004, McCarthy & Prince 1993a, 1993b, 1995, 1999, inter alia). The first part sets up the basic architecture of OT, and the parts that follow focus on some constraint schemata. In addition, variation in OT is also discussed in this section.
2.1.1 Basics
The fundamental notion of OT forsakes the derivational convention in generative grammar in which the context-driven rewrite rules predominate. OT instead advocates parallelism, which means that all possible ultimate outputs are contemplated at once.
As a result, the effects of diverse phonological processes are present simultaneously.
OT holds that Universal Grammar (abbreviated as UG) includes a constraint component CON that contains the entire repertoire of violable, rankable and well-motivated constraints. Every constraint in CON is in the grammar of every language, with the ranking of constraints with respect to one another determined on a language-specific basis. These hypotheses follow from the more general assumption that constraint ranking is the only way that languages systematically differ.
Constraints in CON are categorized into two competing forces: markedness and faithfulness. Markedness constraints bear on the structural well-formedness of the output. The other family, faithfulness constraints, governs the input-output correspondence. For a full discussion of correspondence theory, see McCarthy & Prince (1995b, 1999).
The schema in (1) elucidates the architecture of OT.
(1) Schema of OT
As seen, OT inherits the input-output (underlying-surface) relation from generative phonology. However, this input-output mapping does not proceed in a step-by-step fashion; no series of rule applications is involved. Rather, GEN maps an input onto a potentially infinite set of output candidates. The candidates are submitted in parallel to the constraints in CON for evaluation by EVAL. A higher-ranking constraint can compel the violation of a lower-ranking one; nonetheless, violation is always minimal, so no constraint is violated more than is absolutely necessary to satisfy the constraints that dominate it in the hierarchy. Evaluation proceeds from the highest-ranked constraint downward: at each constraint, the candidates that incur the fewest violation marks survive and are passed down to the next lower-ranked constraint, and the rest are eliminated. The candidate that remains, whether it incurs no violation at all or violates only constraints ranked too low to be decisive, is selected by EVAL as the optimal output.
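The evaluation procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the candidate names, the constraint names C1 to C3, and the violation counts are hypothetical and do not come from the analysis in this thesis, and GEN is replaced by a small, finite candidate set.

```python
from typing import Dict, List


def eval_ot(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> List[str]:
    """Return the candidate(s) surviving evaluation under strict domination.

    candidates maps a candidate name to its violation marks per constraint;
    ranking lists the constraints from highest-ranked to lowest-ranked.
    """
    survivors = list(candidates)
    for constraint in ranking:
        # Only the candidates with the fewest marks on this constraint survive.
        fewest = min(candidates[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors
                     if candidates[c].get(constraint, 0) == fewest]
        if len(survivors) == 1:
            break
    return survivors


# Hypothetical tableau (assumption, for illustration only).
tableau = {
    "cand1": {"C1": 0, "C2": 2, "C3": 0},
    "cand2": {"C1": 1, "C2": 0, "C3": 0},
    "cand3": {"C1": 0, "C2": 1, "C3": 3},
}

print(eval_ot(tableau, ["C1", "C2", "C3"]))  # ['cand3']
print(eval_ot(tableau, ["C2", "C1", "C3"]))  # ['cand2']
```

Under the first ranking, cand3 wins despite three marks on the lowest-ranked C3, which illustrates that a violation of a dominated constraint never outweighs a better performance on a dominating one. Re-ranking C2 above C1 selects a different winner from the same candidate set.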
[Diagram for (1); recoverable labels: GEN, EVAL]
2.1.2 Alignment, Anchoring and Coincidence
The original idea of Generalized Alignment was defined within the PARSE/FILL/Containment-based model of Prince & Smolensky (1991, 1993), which posits a single output representation containing information about underlying morphological structure and surface prosodic structure. Generalized Alignment requires a coincidence of the edges of prosodic and/or morphological constituents within the output structure.
The schema of Generalized Alignment from McCarthy and Prince (1993a) is given below in (2).
(2) Generalized Alignment (McCarthy and Prince 1993a:2)
Align(Cat1, Edge1, Cat2, Edge2) =def
∀Cat1 ∃Cat2 such that Edge1 of Cat1 and Edge2 of Cat2 coincide.
Where
Cat1, Cat2 ∈ PCat ∪ GCat
Edge1, Edge2 ∈ {Right, Left}
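As a rough illustration of the schema in (2), the following Python sketch checks edge coincidence between two sets of constituents. It rests on assumptions that are not part of the definition: constituents are encoded as (left, right) positions in a syllable string, the six-syllable word is hypothetical, and violations are counted categorically (one mark per unaligned Cat1), so degrees of misalignment are not modelled.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (left edge, right edge), as positions in the string


def align_violations(cat1: List[Span], edge1: str,
                     cat2: List[Span], edge2: str) -> int:
    """Count the Cat1 instances whose Edge1 coincides with Edge2 of no Cat2."""
    pick = {"L": 0, "R": 1}
    cat2_edges = {span[pick[edge2]] for span in cat2}
    return sum(1 for span in cat1 if span[pick[edge1]] not in cat2_edges)


# Hypothetical word of six syllables parsed as (σσ)(σσ)(σσ).
prwd = [(0, 6)]
feet = [(0, 2), (2, 4), (4, 6)]

# Align(PrWd, L, Ft, L): the word's left edge coincides with a foot's left edge.
print(align_violations(prwd, "L", feet, "L"))  # 0
# Align(Ft, R, PrWd, R): two of the three feet are not word-final.
print(align_violations(feet, "R", prwd, "R"))  # 2
```

The two calls show the asymmetry created by the quantifier order ∀Cat1 ∃Cat2: swapping the arguments changes which category must be aligned, and therefore the number of marks.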
Conceptually developed from the edge coincidence of Alignment, Anchoring was originally introduced by McCarthy and Prince (1993a) as a family of reduplication-specific constraints that require base-initial (or final) segments to have initial (or final) correspondents in the reduplicant – the two strings must be anchored at an edge. With the development of correspondence theory, which allows direct reference to the input (or other related representation), McCarthy & Prince (1995b) point out that some of the phenomena originally attributed to Alignment constraints, particularly faithfulness to the edge-most position of a correspondent segment, should in fact be understood as Anchoring effects. Since then, Anchoring has been generally used to capture the special degree of faithfulness accorded to designated edges, both in the IO-domain as
well as in BR-relations. The general schema for Anchoring is given in (3).
(3) Anchoring (McCarthy & Prince 1995a):
{RIGHT,LEFT}-ANCHOR(S1, S2)
Any element at the designated periphery of S1 has a correspondent at the designated periphery of S2.
Let Edge(X, {L, R}) = the element standing at the Edge = L, R of X.
RIGHT-ANCHOR. If x = Edge(S1, R) and y = Edge(S2, R) then xRy.
LEFT-ANCHOR. Likewise, mutatis mutandis.
Based on (3), Anchoring constraints have the general form ANCHOR(Cat1, Cat2, E), where Cat1 and Cat2 range over morphological categories (root, affix, word, etc.) and prosodic categories (syllable, foot, PrWd, etc.), and the edge E may be the left edge or the right edge.
Another constraint family founded on Alignment is Coincidence, introduced by Zoll (1996). By combining the edge coincidence of Generalized Alignment with a markedness constraint, Zoll defines Coincidence as a family of licensing constraints that dictate the coincidence of the marked structure in question with a prosodically strong constituent. The general formulation of Coincidence is given in (4).
(4) COINCIDE (marked structure, strong constituent) (Zoll 1996:147)
(i) ∀x (x is marked → ∃y (y = strong constituent ∧ Coincide(x, y)))
(ii) Assess one mark for each value of x for which (i) is false
COINCIDE (x,y) will be true if (i) y=x; (ii) y dominates x; or (iii) x dominates y, where x stands for marked structures that need licensing at some specific position and y stands for prosodically strong constituents that serve as the qualified licensor. Some
constituents, such as accented/stressed syllables and long vowels, may be considered strong independently of their location, but others gain prominence only by dint of their peripheral position. To pick out these peripheral constituents, the function {L,R}-most(P, Q) is used to designate any prosodic constituent (from the prosodic hierarchy) at a designated edge, as in (5).
(5) Prosodic constituents at designated edges (Zoll 1996:149)
Let {L,R}-most(P,Q) = the {L,R}-most P in Q, where P, Q are prosodic constituents
Then: Rightmost(P,Q) = the rightmost P in Q
Leftmost(P,Q) = the leftmost P in Q
According to (5), a prosodically strong constituent y in COINCIDE(x,y) may refer to a designated edge, which means that the effect of Coincidence constraints involves the notion of edge coincidence, as does Generalized Alignment. Nevertheless, Zoll points out that Coincidence constraints crucially differ from Alignment constraints in two regards: (a) intrinsically, Coincidence refers to a coincidence of constituents, not of edges; (b) unlike Alignment constraints, Coincidence does not distinguish different degrees of misalignment. These differences, as Zoll has demonstrated, make Coincidence fare better than Alignment in accounting for licensing phenomena.
2.1.3 Local (Self-)conjunction
According to Smolensky (cf. 1995, 1997, 2006), every constraint in CON can be conjoined with another constraint, or with itself, to produce a new constraint. This operation provides a rationale for constraints that exclude “the worst of the worst.” A formulation is given here for defining the conjunction of different/identical constraints, as shown in (6), which is adapted from Itô & Mester (1998:10).
(6) Local conjunction of constraints (LCC)
a. Definition
Given a domain δ and two constraints A and B that can be evaluated over the domain δ, the local conjunction of A and B relative to δ is denoted as [A&B]δ.
Let A and B be members of the constraint set CON; then [A&B]δ is also a member of CON.
b. Interpretation
[A&B]δ is violated if (and only if) there are distinct violations of A and B in a single domain δ.
c. Ranking (universal)
[A&B]δ » A
[A&B]δ » B
Based on this formulation, if the two constraints conjoined are different, namely A = Cons1 and B = Cons2, then a locally-conjoined constraint [Cons1&Cons2]δ is derived, which is violated once by any instance of δ that contains a distinct violation of Cons1 and a distinct violation of Cons2. By contrast, if we are conjoining a constraint with itself, so that A = B = Cons1, then the self-conjunction [Cons1&Cons1]δ – normally written more simply as [Cons1]² – is violated once by every pair of distinct violations of Cons1 in a single domain δ.
In sum, local (self-)conjunction [A&B]δ permits violations of A and B, as long as the violation of A does not co-occur with the violation of B in a single domain δ. It follows that a locally-conjoined constraint is less stringent in assessing violations than the individual constraints that make up the local conjunction. Given two constraints in
a stringency relationship, the less stringent constraint is demonstrably ranked higher than the more stringent one (for an illustrative discussion see McCarthy 2008). Hence A and B are dominated by [A&B]δ, as stated in the universal ranking in (6c): a violation of the locally conjoined constraint [A&B]δ is more fatal than a violation of A or B alone.
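The way violations of a locally conjoined constraint are assessed, as described for (6), can be sketched as follows. The encoding is an assumption made for illustration: each domain δ is given as a dictionary of violation counts per constraint, and the constraint names A and B and the counts are hypothetical.

```python
from typing import Dict, List


def conj_violations(domains: List[Dict[str, int]], a: str, b: str) -> int:
    """Count violations of the local conjunction [a&b] over a list of domains.

    Different constraints: one mark per domain containing a violation of
    each conjunct. Self-conjunction (a == b): one mark per pair of distinct
    violations of the constraint within a single domain.
    """
    total = 0
    for marks in domains:
        n_a, n_b = marks.get(a, 0), marks.get(b, 0)
        if a == b:
            total += n_a * (n_a - 1) // 2  # number of unordered pairs
        elif n_a > 0 and n_b > 0:
            total += 1
    return total


# Three hypothetical domains with their violation marks (assumption).
domains = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 3}]

print(conj_violations(domains, "A", "B"))  # 1: only the second domain violates both
print(conj_violations(domains, "A", "A"))  # 3: three pairs in the third domain
```

The first and third domains violate A without violating [A&B]δ, which is the sense in which the conjunction is less stringent than its conjuncts.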
2.1.4 Variation in OT
In classical OT, there is no room for multiple outputs, because the OT architecture selects only one candidate as optimal for a given input. Hence, phonological variation/optionality, which always involves multiple outputs, is problematic for classical OT, and is one of the central points of criticism of OT (Vaux 2002, 2006, Bermúdez-Otero & Börjars 2006).
To take account of variation/optionality, recent OT analyses have focused on the operation of EVAL and the ranking of constraints in order to obtain multiple outputs from a single underlying form. There have been various attempts to adapt the OT model to explain free variation, including floating constraints (Nagy & Reynolds 1997), partially ordered grammars (Anttila 1997, 2002a, Anttila & Cho 1998), and strictness bands (Hayes 2000). One of the more successful models to date is the partially ordered model. Under this model, a grammar is defined as a partial order on a set of constraints. “A partial order is a binary relation (i.e. a set of ordered pairs) that is irreflexive, asymmetric, and transitive.” (Anttila 2007:527). By this definition, given three constraints {A, B, C}, {A » B} qualifies as a grammar, and so does {A » B, B » C}. These generalized optimality-theoretic grammars are termed “Partially Ordered Grammars” (abbreviated as POGs). A classical optimality-theoretic grammar is a POG in which all the pairs are ordered, e.g., {A » B, B » C, A » C}. Under this view, a language L with internal variation is a POG in which only some pairs are ordered (i.e.
specified for ranking), and each of the variations in L shares these ordered pairs, with the other unordered ones specified variably. Intra-linguistic variations, hence, serve as the sub-grammars of L. The subset relation is formalized in the grammar lattice in (7).
(7) The formulation of a grammar lattice (Anttila 2002a)
Given three constraints, {A, B, C}, there are a total of six grammars that arise from ordering these constraints to different degrees. Each superordinate grammar has fewer ordered pairs than its subordinate grammars, which is manifested by the intersection of nodes. The partially ordered pairs on each grammar-node can be translated into a set of totally ranked constraints, which is placed in braces. The more ordered pairs there are, the less variation there is. In consequence, sub-grammars (c), (d), and (e) are the totally ranked grammars that describe invariant dialects, while sub-grammars (a), (b), and the language L contain intra-linguistic variation. This model provides a
theoretical foundation for the re-ranking of constraints, by which OT can accommodate variation/optionality within a single language.
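The relation between a partially ordered grammar and the total rankings it subsumes can be made concrete with a short Python sketch. It simply enumerates every total ranking of the constraint set that is consistent with the given ordered pairs; the function name and the brute-force enumeration are illustrative assumptions, not part of Anttila's formulation.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def total_rankings(constraints: Sequence[str],
                   ordered_pairs: Sequence[Tuple[str, str]]) -> List[Tuple[str, ...]]:
    """List every total ranking compatible with the ordered pairs.

    Each pair (x, y) stands for the ranking statement x » y.
    """
    compatible = []
    for perm in permutations(constraints):
        position = {c: i for i, c in enumerate(perm)}
        if all(position[x] < position[y] for x, y in ordered_pairs):
            compatible.append(perm)
    return compatible


cons = ["A", "B", "C"]
print(len(total_rankings(cons, [])))                    # 6: no pair is ordered
print(total_rankings(cons, [("A", "B")]))               # three rankings with A above B
print(total_rankings(cons, [("A", "B"), ("B", "C")]))   # [('A', 'B', 'C')]
```

Adding ordered pairs shrinks the set of compatible total rankings, which mirrors the statement that the more ordered pairs there are, the less variation there is.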
2.2 Metrical Phonology
The parts of this section briefly review several notions of metrical phonology, including the bracketed grid, bounded/unbounded foot-parsing, quantity-sensitivity, and stress clash/lapse, all of which are drawn on in the present analysis.
2.2.1 Bracketed Grid
The main assumption of metrical phonology is that stress is a relational property, represented by prominence relations between constituents in hierarchical structures (Liberman 1975, Liberman & Prince 1977, Hayes 1980). This assumption is represented by the metrical grid (Liberman & Prince 1977, Prince 1983, Selkirk 1984), a succession of columns of grid elements of different heights. The height of a column indicates a syllable’s relative prominence. As an example, consider (8), the metrical grid of “Apalachicola”
[ˌæpəˌlæʧɪˈkoːlə]. Its grid analysis contains six columns, each standing over a syllable.
The first, third and fifth columns are taller than the second, fourth and sixth. The fifth column, indicating the culminating peak of the grid, is taller than the first and third.
(8) “Apalachicola” in metrical grid
PrWd-level                               x
Foot-level       x           x           x
Syllable-level   x     x     x     x     x     x
                 ˌæ.   pə.   ˌlæ.  ʧɪ.   ˈkoː. lə.
The metrical grid can be combined with metrical constituency, which refers to groupings of grid elements at low levels into higher-order elements. Metrical constituency is formally represented by bracketing grid elements with pairs of parentheses (Hammond
1984, Halle & Vergnaud 1987, Hayes 1995). Each constituent has an obligatory head, represented by a grid-mark at the next-higher level, plus an optional non-head, which has no corresponding mark at the next-higher level. By adding the constituency to the grid in (8), we obtain a bracketed representation in (9).
(9) “Apalachicola” in bracketed grid
PrWd-level                               x
Foot-level      (x           x           x      )
Syllable-level  (x     x)   (x     x)   (x     x)
                 ˌæ.   pə.   ˌlæ.  ʧɪ.   ˈkoː. lə.
At the syllable level, pairs of grid elements are bracketed together by parentheses into three “metrical feet:” (æ.pə), (læ.ʧɪ) and (koː.lə). Rhythmically strong syllables, called
“heads,” are initial in those feet, forming “trochaic rhythm.” Each foot projects its head by a mark at the foot level. Elements at the foot level are similarly bracketed together in a single foot with final head, forming “iambic rhythm.” This then projects a grid element at the prosodic-word level, the primary stress of the word.
Hayes (1995) uses a flattened representation of bracketed grid, which collapses three layers into two. Within each constituent, the head is represented by a grid-mark, the non-head by a dot, as shown below.
(10) “Apalachicola” in a flattened bracketed grid
                         x
(x     .)   (x     .)   (x     .)
 ˌæ.   pə.   ˌlæ.  ʧɪ.   ˈkoː. lə.
The flattened grid in (10) can be translated into single-layer representations, as exemplified in (11), where dots indicate syllable boundaries, parentheses, foot boundaries, and square brackets, prosodic-word boundaries. Relative prominence is signified by
IPA-style stress-marks before syllables.
(11) “Apalachicola” in a single-layer bracketed grid
[(ˌæ.pə).(ˌlæ.ʧɪ).(ˈkoː.lə)]
We will primarily use this single-layer representation throughout this thesis, substituting the two-layer representation in (10) where necessary. Nonetheless, bear in mind that these simplified representations are based on the hierarchically bracketed grid in (9).
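The correspondence between the single-layer notation and the grid columns can be sketched in Python. The sketch makes two assumptions for illustration: primary stress (ˈ) corresponds to a column of height 3, secondary stress (ˌ) to height 2, and no stress mark to height 1, following the three levels in (8); the function names are hypothetical.

```python
from typing import List


def grid_heights(form: str) -> List[int]:
    """Convert a single-layer bracketed form into grid column heights."""
    # Brackets and parentheses mark constituency only; drop them here.
    bare = form.translate(str.maketrans("", "", "[]()"))
    heights = []
    for syllable in filter(None, bare.split(".")):
        if syllable.startswith("ˈ"):    # primary stress: PrWd-level mark
            heights.append(3)
        elif syllable.startswith("ˌ"):  # secondary stress: foot-level mark
            heights.append(2)
        else:                           # unstressed: syllable-level mark only
            heights.append(1)
    return heights


def render(heights: List[int]) -> str:
    """Draw the grid from the top level down, one 'x' per grid mark."""
    rows = []
    for level in range(max(heights), 0, -1):
        rows.append(" ".join("x" if h >= level else " " for h in heights))
    return "\n".join(rows)


apalachicola = "[(ˌæ.pə).(ˌlæ.ʧɪ).(ˈkoː.lə)]"
print(grid_heights(apalachicola))  # [2, 1, 2, 1, 3, 1]
print(render(grid_heights(apalachicola)))
```

The resulting heights match the description of (8): the first, third and fifth columns are taller than the others, and the fifth is the tallest.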
2.2.2 Foot-parsing
As seen in the bracketed grid, a stressed syllable serves as the obligatory head of a metrical foot, which implies that foot-parsing centers upon stress: where there is a stressed syllable, there is a foot that can be construed. The number and position of stresses in a word (i.e. rhythmic patterns) vary among stress languages. On one end of the spectrum, there are systems which have multiple stresses in an alternating pattern, with the most prominent, or primary, stress being at or near an edge, and the others being less prominent, or secondary. Since there are multiple stresses, more than one foot is parsed, as shown in the bracketed analysis in (12). Feet that contain a stressed syllable and no more than one unstressed syllable are termed “bounded feet.”
(12) Alternating stresses under the bounded foot-parsing
a. [(ˈσσ)(ˌσσ)(ˌσσ)…]
b. […(ˌσσ)(ˌσσ)(ˈσσ)]
On the opposite end of the rhythmic spectrum, we find systems with only one stress at or near an edge in each word. It follows that only a single foot is parsed, which can be represented in two ways. One is to build a non-iteratively parsed bounded foot, as in
(13) (Prince 1985); the other is to employ an exhaustive parsing of all the syllables, as in (14), in which case the foot has more than two syllables, consisting of a head and any number of non-heads. This type of foot is termed an “unbounded foot,” whence the single-stress systems are known as “unbounded stress systems.”
(13) Single stress under the bounded foot-parsing
a. [(ˈσσ)σσσσ…]
b. […σσσσ(σˈσ)]
(14) Single stress under the unbounded foot-parsing
a. [(ˈσσσσσσ…)]
b. [(…σσσσσˈσ)]
This thesis does not attempt to decide whether the bounded foot-parsing in (13) or the unbounded one in (14) is the correct way of parsing patterns with single stress. Rather, the foot-parsing in different cases will be analyzed individually on an empirical basis.
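For concreteness, the three left-edge parses in (12a), (13a) and (14a) can be generated for a word of any length with the following Python sketch. It covers only the left-headed, left-edge patterns, the mode names are labels invented for this illustration, and leaving a leftover odd syllable unparsed in the iterative case is an assumption rather than a claim about any particular language.

```python
SYLLABLE = "σ"


def parse(n: int, mode: str) -> str:
    """Return a left-edge, left-headed parse of an n-syllable word (n >= 2).

    mode is one of 'bounded-iterative' (12a), 'bounded-single' (13a),
    or 'unbounded' (14a).
    """
    if mode == "unbounded":
        # One foot exhaustively parses all syllables.
        body = "(ˈ" + SYLLABLE * n + ")"
    elif mode == "bounded-single":
        # One disyllabic foot at the left edge; the rest stay unparsed.
        body = "(ˈ" + SYLLABLE * 2 + ")" + SYLLABLE * (n - 2)
    elif mode == "bounded-iterative":
        # Disyllabic feet built from the left; the first bears primary stress.
        feet = ["(" + ("ˈ" if i == 0 else "ˌ") + SYLLABLE * 2 + ")"
                for i in range(n // 2)]
        body = "".join(feet) + SYLLABLE * (n % 2)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return "[" + body + "]"


for mode in ("bounded-iterative", "bounded-single", "unbounded"):
    print(mode, parse(6, mode))
```

For a six-syllable word the three modes yield [(ˈσσ)(ˌσσ)(ˌσσ)], [(ˈσσ)σσσσ] and [(ˈσσσσσσ)] respectively, so the two single-stress parses differ only in bracketing, not in the position of stress.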
2.2.3 Quantity-sensitive Stress
Stress prefers to lodge on syllables which have a certain degree of intrinsic prominence. The relevant property is usually syllable weight (i.e. moraic quantity). Long vowels and vocalic diphthongs are always bimoraic, while coda consonants are mora-bearing on a language-specific basis, so (C)VC syllables may count as heavy in one language and light in another. Systems in which heavy syllables attract stress are called “quantity-sensitive” stress systems.
In unbounded stress systems that are characterized by quantity-sensitivity, stress falls on the leftmost/rightmost heavy syllable, and in the absence of heavy syllables, on the leftmost/rightmost syllable. Each of the four logical combinations of leftmost/
rightmost in the statement corresponds to attested languages (see Hayes 1995:296ff);
the two cases in which the edges are the same are called “default to same edge” (DTS), and the two other cases in which the edges are different are called “default to opposite edge” (DTO), the terminology taken from Prince (1985).
A list of languages fitting each of these gross typological classifications is given in (15). The list is based on those of Hayes (1995:296-297), with additional languages from Walker (1996).
(15) Gross typological instantiations
a. Default to same edge (DTS)
Leftmost heavy/leftmost: Amele, Au, Indo-European accent, Khalkha Mongolian, Lhasa Tibetan, Lushootseed, Mordwin, Murik, Yana
Rightmost heavy/rightmost: Aguacatec, Golin, Kelkar’s Hindi, Klamath, Sindhi, Western Cheremis
b. Default to opposite edge (DTO)
Leftmost heavy/rightmost: Komi Yaz’va, Kwakw’ala
Rightmost heavy/leftmost: Chuvash, Classical Arabic, Eastern Cheremis, Huasteco, Kuuku-Yaʔu, Selkup
To take the first cases in (15a) and (15b) each as an example, the table in (16) elucidates the difference between DTS and DTO systems. What these two systems have in common in this table is that stress falls on the leftmost heavy syllable, if any. They differ in forms containing purely light syllables: stress falls on the leftmost syllable in DTS, but on the rightmost syllable in DTO. Note that Σ denotes a heavy syllable, σ a light syllable. Since the foot-parsing for unbounded stress systems is undecided, the
forms in (16) are exempted from foot-bracketing.
(16) Illustrations of DTS vs. DTO
                                 DTS        DTO
Forms with heavy syllables       σσˈΣσΣσ    σσˈΣσΣσ
Forms without heavy syllables    ˈσσσσσσ    σσσσσˈσ
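The stress rule for quantity-sensitive unbounded systems, and the DTS/DTO contrast in (16), can be stated as a small Python function. The encoding is an assumption for illustration: a word is a string of "H" (heavy) and "L" (light) syllables, and the function returns the zero-based index of the stressed syllable.

```python
def unbounded_stress(weights: str, heavy_edge: str, default_edge: str) -> int:
    """Return the index of the stressed syllable.

    weights: one 'H' or 'L' per syllable.
    heavy_edge: 'L' to stress the leftmost heavy syllable, 'R' the rightmost.
    default_edge: edge that receives stress when no syllable is heavy.
    """
    heavies = [i for i, w in enumerate(weights) if w == "H"]
    if heavies:
        return heavies[0] if heavy_edge == "L" else heavies[-1]
    return 0 if default_edge == "L" else len(weights) - 1


# Leftmost heavy/leftmost (DTS) versus leftmost heavy/rightmost (DTO).
print(unbounded_stress("LLHLHL", "L", "L"))  # 2: third syllable, as in σσˈΣσΣσ
print(unbounded_stress("LLHLHL", "L", "R"))  # 2: identical when a heavy syllable exists
print(unbounded_stress("LLLLLL", "L", "L"))  # 0: DTS defaults to the leftmost syllable
print(unbounded_stress("LLLLLL", "L", "R"))  # 5: DTO defaults to the rightmost syllable
```

The four combinations of the two edge parameters correspond to the four typological classes listed in (15).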
An excursus: although a strict division into quantity-sensitive and quantity-insensitive systems is usually assumed, stress systems actually fall into finer-grained classes, showing various degrees of quantity-sensitivity, with a range of intermediate positions (Kager 1992a, 1992b, Alber 1997).
2.2.4 Stress Clash and Lapse
Stress languages present a preference for well-formed rhythmic patterns, where stressed syllables and unstressed syllables are spaced apart at regular intervals. This is manifested by avoidance of “stress clash,” or by avoidance of “stress lapse.” Stress clash results from adjacent stressed syllables, which can be formally defined in terms of metrical grid as a situation of adjacent strong beats without an intervening weak beat at the next-lower level (Liberman 1975, Liberman & Prince 1977, Prince 1983;
Selkirk 1984), as shown below.
(17) Stress clash
n+1   x x
n     x x
By contrast, a lapse is a sequence of unstressed syllables, which can be defined as the adjacency of two grid elements at level n, without either having a level n+1 counterpart, as in (18).
(18) Stress lapse
n+1
n     x x
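The grid-based definitions of clash and lapse can be checked mechanically. The Python sketch below is a simplification under stated assumptions: it takes level n to be the syllable level, encodes a word as a list of grid column heights (1 = unstressed, 2 or more = stressed), and counts each adjacent pair separately; the example heights are hypothetical.

```python
from typing import List


def count_clashes(heights: List[int]) -> int:
    """Count adjacent syllable pairs in which both syllables are stressed."""
    return sum(1 for a, b in zip(heights, heights[1:]) if a >= 2 and b >= 2)


def count_lapses(heights: List[int]) -> int:
    """Count adjacent syllable pairs in which neither syllable is stressed."""
    return sum(1 for a, b in zip(heights, heights[1:]) if a == 1 and b == 1)


alternating = [2, 1, 2, 1, 3, 1]  # strictly alternating rhythm
irregular = [2, 2, 1, 1, 1]       # hypothetical non-alternating pattern

print(count_clashes(alternating), count_lapses(alternating))  # 0 0
print(count_clashes(irregular), count_lapses(irregular))      # 1 2
```

A strictly alternating pattern incurs neither configuration, whereas the irregular pattern contains one clash (the first two syllables) and two overlapping lapses (the last three syllables).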
Stress clash and lapse are rhythmic outcomes, independent of foot structure. They are generally marked configurations, so stress languages often avoid them by employing an alternating rhythm. Nonetheless, they may be less marked in some contexts. Chen (2000) provides the insight that clash is more tolerable at word-end. Kager (2001), on the other hand, hypothesizes that lapses become less marked in two positions: word-finally and adjacent to the main-stressed syllable. This notion of licensing restriction