
Early Austronesians: Into and Out of Taiwan


Academic year: 2021

Albert Min-Shan Ko1, Chung-Yu Chen2, Qiaomei Fu1, Frederick Delfin1,3, Mingkun Li1, Hung-Lin Chiu4, Mark Stoneking1*, Ying-Chin Ko5*

1Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D04103, Leipzig, Germany

2Institute of History and Philology, Academia Sinica, Nankang, Taipei, Taiwan

3DNA Analysis Laboratory, Natural Sciences Research Institute, University of the Philippines, Diliman, 1101, Quezon City, Philippines

4Institute of Anthropology, National Tsinghua University, Hsinchu, Taiwan

5Graduate Institute of Clinical Medical Science, China Medical University, Taichung, Taiwan


Abstract

A Taiwan origin for the expansion of the Austronesian languages and their speakers is well supported by linguistic and archaeological evidence. However, human genetic evidence is more controversial. Until now, there has been no ancient skeletal evidence of a potential Austronesian-speaking ancestor prior to the Taiwan Neolithic ~6000 years ago, and genetic studies have largely ignored the role of genetic diversity within Taiwan as well as the origins of Formosans. We address these issues via analysis of a complete mitochondrial DNA genome sequence of an ~8000-year-old skeleton from Liang Island (located between China and Taiwan) and 550 mtDNA genome sequences from 8 aboriginal (highland) Formosan and 4 other Taiwanese groups. We show that the Liangdao Man mtDNA sequence is closest to Formosans, provides a link to southern China, and has the most ancestral haplogroup E sequence found among extant Austronesian speakers. Bayesian phylogenetic analysis allows us to reconstruct a history of early Austronesians arriving in Taiwan in the north ~6000 years ago, spreading rapidly to the south, and leaving Taiwan ~4000 years ago to spread throughout Island Southeast Asia, Madagascar, and Oceania.


Introduction

The expansion of the Austronesian language family, one of the world’s largest expansions across Island Southeast Asia and Oceania, began in Taiwan.1 However, genetic evidence has been equivocal, with some mitochondrial DNA (mtDNA) studies showing a minor genetic contribution from Taiwan in extant Austronesian-speaking groups.2 Moreover, based on a limited panel of Y-chromosome and autosomal STR loci,3,4 Taiwan aboriginals have been traced to be genetically close to the Daic speakers from southern China. However, because it is based on contemporary populations, this only suggests that they share most recent ancestry, as there are no Austronesian speakers in China, and linguistic evidence further argues that the link with Daic speakers may be a result of back migration of south Formosan ancestors.5 Moreover, there is a lack of relevant fossil material to demonstrate a genetic continuity, nor has information concerning genetic diversity among the various groups of aboriginal Formosans been incorporated into investigations of the Austronesian expansion. Here, we combine ancient DNA analysis of a new, relevant skeleton (Liangdao Man; Figure 1) with complete mitochondrial DNA (mtDNA) genome sequences from 550 individuals that include 8 aboriginal (highland Formosan) groups to investigate the peopling of Neolithic Taiwan and the origins of the Austronesian expansion.

The Liangdao Man skeletal remains were discovered on Liang Island of the Matsu archipelago in December 2011 and transported to the Matsu Folklore Museum. Matsu is located on the Min River estuary, 24 kilometers from Fujian and 180 kilometers northwest of Taiwan (Figure S1), and represents one of the shortest crossings from the mainland into Taiwan. The skeleton is 70% complete and is that of a robust male about 160 centimeters in height. The C14-AMS dating of a thoracic rib yields a date of 8060-8320 Cal BP (at 95% probability; Figure S3). The Liangdao Man has a shell mound above it that contains artifacts such as pottery, stone tools, and bone tools. The radiocarbon dates of the shells and charcoal from the layers above the Liangdao Man range from 7500-7900 years ago (ya), verifying the burial to be from the earliest phase of the shell mound. Thus, both the age and location of Liangdao Man position it at an appropriate time and place to potentially provide insights concerning the early Austronesian expansion into Taiwan, as well as a genetic link to southern China.3

Material and Methods

A total of 565 individuals from 12 ethnic groups were recruited in Taiwan from 1998 to 2001. We included 8 highland Formosan tribes (Ami, Atayal, Bunun, Paiwan, Puyuma, Rukai, Saisiat and Tsou) that represent the branches of Blust’s linguistic classification6 and relate to Proto-Austronesian, a reconstructed linguistic ancestor (Figure S18). We further included 4 other groups: Makatao (Sinicized lowland Formosan), Tao (Orchid Islanders offshore Taiwan), and Han Taiwanese (Hakka and Minnan, who migrated to Taiwan from Fujian and Guangdong about 300 ya). The demographic information (Table S1) and sampling locations (Figure S1) are provided. All aboriginal samples were collected at major township centers, and ancestry was ascertained by requiring both parents to be in the same tribe. Genomic DNA was extracted from whole blood using the QIAGEN-Gentra Puregene Blood Kit following laboratory protocols. The ethics committees of the China Medical University, the Taiwan National Health Research Institutes, and the University of Leipzig Medical Faculty approved this study. Informed consent was obtained from all participants. For details of the Liangdao Man archeology, refer to Appendix A.

DNA sequencing was performed on the Illumina platform. DNA was extracted from the Liangdao Man foot phalanx, and a single-strand library was prepared and enriched for mtDNA by capture (Appendix B). For the modern samples, double-indexed libraries were prepared and enriched for mtDNA (Appendix C). The haplogroup E phylogeny was constructed by the median-joining method implemented in Network 4.611.7 BEAST (version 1.7.5)8 was used to estimate population size changes over time under the Skyride model9 with the HKY85+Γ (four categories) substitution model as the best fit as determined by jModeltest2.10 A strict clock using a whole mtDNA substitution rate of 1.665×10^-8 per site per year11 was used, and all runs were carried out to an ESS >10^3.

Discriminant analysis of principal components (DAPC)12 was used, which maximizes between-group differences and minimizes within-cluster variation based on alleles. The retained discriminant functions give probabilities for individual membership among groups. The coefficient measures how close individuals are to the 10 assigned ethnic clusters (8 highlander groups and 2 Han groups). We simulated three model scenarios (‘Into Taiwan’, ‘Out of Taiwan’ and ‘Formosan phylogeny’; Appendix D) using fastsimcoal,13 with parameters inferred via Approximate Bayesian Computation (ABC).14

The accession numbers for the complete mtDNA sequences reported in this paper are KF540505 (Liangdao Man) and KF540506-KF541055 (550 sequences from 12 Taiwanese groups).

Results

We extracted DNA from a foot phalanx of the Liangdao Man, and completely sequenced the mtDNA genome at 245-fold coverage (Appendix B). The ancient mtDNA is haplogroup E (with two of the four diagnostic changes towards haplogroup E1; Figure S2). Figure 3 shows a comparison of the Liangdao Man sequence with 104 modern sequences from haplogroup E and the nearest extant relative, haplogroup M9, collected from this study, the Philippines,15,16 Malaysia,17 Indonesia,18,19 and Melanesia.20 Two aboriginal Formosan sequences have the closest match to the Liangdao Man sequence, with four nucleotide differences in the mtDNA protein coding region. Furthermore, haplogroup E is a sub-clade of haplogroup M9, which is currently found in southern China.21

Haplogroup E has been previously dated to more than 30,000 ya (30 ka) using the rho method and a constant molecular clock.19 By contrast, Bayesian dating, via ancient DNA calibration22 and using the direct age of the Liangdao Man, indicates that haplogroup E likely arose 8136-10,933 ya (95% highest posterior density, HPD; Figure S10), illustrating the value of incorporating ancient DNA information into molecular dating. This results in a recalibration of the molecular clock and positive support for the relaxed over the strict clock (log10 Bayes Factor=6.72), where the average substitution rate among branches for the entire mtDNA genome is 2.67×10^-8 (2.13-3.16×10^-8, 95% HPD) per site per year. This rate is similar to, and separately confirmed by, calibration of the worldwide mtDNA phylogeny using an extensive collection of archaic humans and other ancient DNA.23

The Bayesian skyline plot for the haplogroup E lineage shows population expansion 8102 ya, where an effective female size of 125 (14-485, 95% HPD) increased to a present-day size of 6991 (2274-17,796, 95% HPD; Figure S11). Similar calibration of the 361 complete Formosan mtDNA sequences shows decisive support for the relaxed clock (log10 Bayes Factor=42.43) with a substitution rate of 2.68×10^-8 (1.91-3.16×10^-8, 95% HPD) per site per year. Figure S10 indicates that the ancestral population of Formosans expanded 10,835 ya from an effective female size of 4867 (172-21,868, 95% HPD) to a present-day size of 34,027 (14,882-73,492, 95% HPD).
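To give the per-site rates above a more intuitive scale, the following minimal sketch converts a per-site, per-year rate into the expected waiting time for one substitution anywhere on the mtDNA genome, and converts a log10 Bayes factor back to a plain ratio. The genome length of 16,569 bp (the length of the rCRS) is an assumption introduced here for illustration, and the function names are hypothetical.

```python
# Assumption: 16,569 bp is the rCRS length; the text does not state a genome length.
RCRS_LENGTH_BP = 16569


def years_per_substitution(rate_per_site_per_year, genome_length_bp=RCRS_LENGTH_BP):
    """Expected waiting time (years) for one substitution anywhere on the genome."""
    return 1.0 / (rate_per_site_per_year * genome_length_bp)


def bayes_factor_from_log10(log10_bf):
    """Convert a log10 Bayes factor into a plain ratio of marginal likelihoods."""
    return 10.0 ** log10_bf


strict_wait = years_per_substitution(1.665e-8)   # strict-clock rate from the Methods
relaxed_wait = years_per_substitution(2.67e-8)   # relaxed-clock estimate from the Results

print(f"strict clock:  one substitution every ~{strict_wait:,.0f} years")
print(f"relaxed clock: one substitution every ~{relaxed_wait:,.0f} years")
print(f"Bayes factor for log10 BF = 6.72: ~{bayes_factor_from_log10(6.72):,.0f}")
```

Under this assumption the faster recalibrated rate corresponds to a shorter waiting time per substitution, which is one way to see why the recalibrated ages are younger than those obtained with the slower strict clock.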

We completely sequenced mtDNA genomes at 500-fold coverage from 550 Taiwanese, including 8 highland Formosan groups closest to the root of Austronesian languages,6 2 lowland Formosan groups and 2 Han groups. There is a complete lack of haplotype sharing between the Han and Formosan mtDNA sequences, suggesting negligible gene flow (Figure 4B). We first carried out an ‘Into Taiwan’ simulation, based on the divergence between Han and aboriginal Formosan groups (Appendix D). This simulation indicates that they last shared common ancestry 8093-10,306 ya (95% HPD; Table S2). Within Taiwan, we find strong evidence of a north-to-south gradient in the patterns of Formosan mtDNA nucleotide diversity (Figure 4A). The highland Formosans have higher mean nucleotide differences in the north than in the south, a ‘tail’ in their mismatch distributions (Table S19), and support for a spatial expansion model (Table S7) that is consistent with their mountain residence in highly heterogeneous environments.24 Similarly, Bayesian skyline plots show that all highlanders have similar bottlenecks, suggesting that all tribes have split from a group of early Austronesians (Figure S19). In contrast, the Makatao (lowland Formosan) are demographically similar to Han Taiwanese in exhibiting a stationary (demographic) expansion (no ‘tail’) and high intra-deme flow (94-430 per generation; Table S7), as well as population expansions in their Bayesian skyline plots (Figure S19). These results suggest that the lowlanders have admixed more with Han and are thus less isolated (and have experienced more migration) than the highland groups.
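The "mean nucleotide differences" compared between northern and southern groups can be illustrated with a short sketch. It computes the mean number of pairwise differences among aligned sequences of equal length; the toy sequences below are invented for illustration and are not data from this study, and the function name is hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(aligned_seqs):
    """Mean number of differing sites over all pairs of equal-length aligned sequences."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = 0
    for seq_a, seq_b in pairs:
        # Count positions at which the two sequences carry different bases.
        total += sum(base_a != base_b for base_a, base_b in zip(seq_a, seq_b))
    return total / len(pairs)


# Invented toy alignment (three sequences, four sites), for illustration only.
toy_group = ["AAAA", "AAAT", "AATT"]
print(mean_pairwise_differences(toy_group))
```

A group with a longer or less bottlenecked history would, all else being equal, be expected to show a larger value of this statistic, which is the logic behind reading a north-to-south decline as a trace of the direction of settlement.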


We next reconstructed the order of splitting of highland tribes from each other using the ‘Formosan phylogeny’ simulation (Appendix D), which is an unbiased search of all possible tree shapes formed by the eight tribes for those that provide the best fit to the observed data. The details of the performance of the ABC simulations indicate a generally improved fit of the posterior distribution to the observed data (Appendix D). The best tree conforms to the observed geographic cline as well as the temporal relationships of regional sequences (Figure S16). Finally, using the ‘Out of Taiwan’ simulation (Appendix D), we estimate an original split between the ancestors of aboriginal Formosans and those of Filipinos from northern Luzon to have occurred 13,725-29,601 ya (95% HPD; Table S2). This was followed by a migration from Taiwan to the Philippines 3825-4450 ya (95% HPD), with a higher mean probability of about 4.1-4.2 ka (Table S2).

Discussion

Haplogroup E is not observed in over 6000 individuals across 84 populations in China,25 thus the occurrence of this haplogroup at the Liangdao Man’s location is highly unusual. In fact, haplogroup E is prevalent outside China among Austronesian-speaking groups from Taiwan, the Philippines,16 the Malay Peninsula,17 Island Southeast Asia,19,26 and Guam and the Marianas in Micronesia,27 and has spread as far west as Madagascar28,29 and as far east as the Bismarck Archipelago in Melanesia; however, it has not yet been reported in Polynesia.30 The evidence linking this haplogroup to Taiwan is that it is among the mid-Holocene maternal lineages, for example, E1a1a, M7b3, M7c3c, and Y2, that are important

A Formosan source for the Liangdao Man is unlikely, because haplogroup E evolved from haplogroup M9, which has never been detected in over 1000 Formosan mtDNA sequences from this study and published data.32 Instead, haplogroup M9 is distributed along coastal China in the regions close to Liang Island, such as in the Yangtze Valley region and Zhejiang.21 This suggests that M9 differentiated to E outside China (near Fuzhou) and that haplogroup E and its sublineages are specifically associated with early Austronesians and the subsequent dispersal of Austronesian languages (Figure 5B). This agrees with the seafaring culture of early Austronesians31 and the absence of Austronesian languages in China. Further support for this view is that on mainland Southeast Asia, the E lineages are found in the Austronesian-speaking Cham but not in Austro-Asiatic, Tai-Kadai, or Tibeto-Burman speakers.33 The Cham have been traced to originate from Borneo,34 which may be one linguistic source of Malagasy, and indeed, E lineages have been detected at 10% frequency in the Malagasy.28,29

At around the time when haplogroup E first developed from M9 in the population that later carried it into Taiwan, the surrounding regions gave rise to M9a* lineages that are now carried by Sinitic speakers.21 Thus, the Han, Liangdao Man, and Formosan lineages belonging to haplogroups M9a and E can be traced to an ancestral M9 mtDNA lineage (Figure 3). Additionally, the Tibetans have a high frequency of M9a lineages that have been shown to coalesce during the Neolithic,35 and there is a hypothesized linguistic link between Sino-Tibetan and Austronesian languages.5

The archeology of cereal crop cultivation in south Taiwan is strongly linked to their origin in China.36,37 The earliest domestication of foxtail millet is 9.5-11.5 ka in northern China38 and of rice 8.2-13.5 ka in the Yangtze Valley.39 Because the co-occurrence of agricultural developments and population growth is well-known,40 we investigated the Bayesian skyline plots of the haplogroup E sequences (Figure S11) and of the highland Formosan mtDNA sequences (Figure S12) and find evidence of population expansions that occurred 8-10 ka. Thus, as early Austronesians diverged from Han ancestors and expanded into Taiwan, haplogroup E (among other maternal lineages) increased in frequency. The resemblance of the Liangdao Man and highland Formosans, in terms of haplotypes, demographic history, and time of ancestral origin, supports the Liangdao Man as an important trace of the maternal ancestors of highland Formosans outside China.

As early Austronesians migrated and arrived on the mainland opposite Taiwan, the Fuzhou basin was flooded around 9 ka.41 The region was less conducive to farming, which could have motivated an exploration of shell resources, such as in the case of Liang Island. Archeological evidence indicates that Neolithic Taiwan was settled 6 ka.42 While additional ancient DNA data would benefit the Bayesian inference, the emerging picture is that the majority of E sublineages have a coalescence of 5-8 ka with a higher mean probability of about 6 ka (Figure S10). Taken together, the entry into Taiwan was likely from the north, as Liang Island and the origin of cereal crops used by aborigines are northward of the island (Figure 5A). The incoming direction also matches our genetic findings that the Saisiat and Atayal (northernmost tribes) have the highest mtDNA diversity (Figure 4C) and are involved in the deepest splits among highlander groups (Figure 4D).

In Taiwan, the early Austronesians dispersed southwards (Figure 4C). The ‘Formosan phylogeny’ simulation reveals that the northern group diverged first at 5332 ya (4975-5638, 95% HPD). This coincides with the root estimate of Austronesian languages at 5230 ya (4750-5800, 95% HPD) using archeological settlement times outside Taiwan.1 The next split, between the central and southern groups, occurred 4226 ya (4049-4452, 95% HPD). The ‘Out of Taiwan’ migration, corresponding to the initial Proto-Malayo-Polynesian speakers (who left Taiwan), has a higher mean probability of occurring at 4.1-4.2 ka, which is consistent with the archeological record of an early contact with the Philippines at 4 ka.43 However, at this time there was a single population moving through Taiwan, since none of the Formosan tribes had yet formed. The genetic results thus suggest a rapid dispersal through Taiwan, followed by subsequent population differentiation.

We also find a good correspondence between the genetic relationships and the various linguistic models of population relationships (Figure S18). In particular, the three regional, early Austronesian branches (north, central, south; Figure 4D) reflect Sagart’s Proto-Austronesian numeral system, which also describes a stepwise progression down the island.5 The merits of other models become apparent from a curious genetic relationship: Formosan tribes diversified in the reverse direction to the southward expansion. That is, the southern, central, and northern tribes were established with mean probabilities of 3376-1383 ya, 2281 ya, and 1248 ya, respectively (Figure 4D). Thus, languages in south Taiwan may have diversified before those in the north, which could relate to the initial branches of the models of Li44 and Ross.45 Importantly, Blust’s polytomy6 of a single ancestor also receives support in the genetic data, in that all highlanders share a highly similar demographic history (Table S7 and Figure S19), and this model explains more of the genetic variation among groups than the other models (P<0.001; Table S8).

For many years, it has been challenging to define early Austronesians beyond reconstruction of languages. To be sure, we cannot know what language Liangdao Man spoke, nor if he had anything to do with the spread of rice and millet agriculture that is usually associated with the early Austronesian expansion. However, ancient DNA verifies that he carries an ancestral haplogroup E mtDNA sequence that strongly links him with contemporary Formosans. Thus, the Liangdao Man is the oldest genetic relative of aboriginal Formosans. Furthermore, his lineage traces back to ancestral M9 lineages along coastal China. Aided by his sequence, we improve the calibration of the mtDNA clock, and via model-based simulations we estimate the initial divergence between Formosans and Han to be around 8-10 ka, with colonization of Taiwan in the north followed by a north-to-south dispersal through Taiwan, and an exit of Proto-Malayo-Polynesian speakers from the south at about 4 ka. The Liangdao Man mtDNA sequence, complemented by detailed analyses of Formosan genetic diversity, thus provides more details of the process by which humans spread to, through, and out of Taiwan.

Appendix A: Liangdao Man archeology

Liangdao Man was exhumed from Liang Island, the northernmost satellite island in the Matsu Archipelago. Matsu is located 24 kilometers from Fujian in mainland China and 180 kilometers northwest of Taiwan. Liang Island has an area of 0.35 square kilometers (1400 meters long and 250 meters wide) with sparse vegetation and is populated by migratory birds. There are marine resources (shells and fish) in the nearby waters. Liang Island is uninhabited, as Matsu has been under Taiwanese martial law for over fifty years. The ancient skeleton was found by accident as a result of road construction, when the central road cut along a ridge hill exposed a spread-out shell mound layer about 3 meters wide by 30 meters long by 10-50 centimeters in height (Figure S2). Inspection revealed shells, pottery shards, bone tools, and three pieces of human parietal bone. A later excavation revealed a skeleton in situ. This archeological site is termed Liangdaodaowei-1 (LDDW-I). The skeleton is assessed to be 70% complete, male (determined from a narrow greater sciatic notch and lack of subpubic concavity), and about 30 years old (determined from the degree of molar wear), with a height of 160±3.59 centimeters, a robust humerus with a developed deltoid tuberosity, prominent ridgelines for insertion of muscle, and a thicker body of the bone. To determine the age, a thoracic rib was sent for C14-AMS dating (Figure S3). Additionally, the Matsu Liangdao archeology team (managed by Prof Chung-Yu Chen) sent the non-human specimens (shells and charcoal) from layers above the Liangdao Man, prior to its full exposure, to the Valuable Instrument Center Laboratory of National Taiwan University for radiocarbon C14 dating, and the age range of those specimens is 7500-7900 BP.

Appendix B: Ancient mtDNA sequencing

DNA extraction, library preparation and mtDNA enrichment

The DNA was extracted as described previously46 from a foot phalanx and femur from Liangdao Man using 59 and 15 mg of bone powder, respectively. Libraries were produced from 10 µl of each extract using a single strand library preparation method.47 To prevent contamination from sequences derived from modern DNA libraries, adaptors CL53/73 (CL53 CGACGCTCTTC-ddC (ddC = dideoxy cytidine); CL73 Phosphate-GGAAGAGCGTCGTGTAGGGAAAGAG*T*G*T*A (* = PTO bonds)) were used for the libraries. An optimal PCR cycle number for library amplification was determined by qPCR.48 Libraries were amplified with AccuPrime Pfx DNA polymerase (Life Technologies) with reaction parameters described previously49 and with distinct sample-specific internal barcodes introduced into both library adaptors.50 The MinElute PCR purification kit (Qiagen, Hilden, Germany) was used to purify the PCR-amplified libraries. To obtain high concentration DNA libraries for the mtDNA hybridization capture, a second round of amplification in a 100 µl PCR reaction was performed with Herculase II Fusion DNA polymerase (Agilent) using the primers IS5 and IS651 and the conditions described previously.49 The hybridization capture with an mtDNA probe set52 was performed to enrich the mtDNA in libraries. The libraries were amplified with primers IS5 and IS6.51 Library concentration was determined using a

Illumina sequencing

The pooled libraries were sequenced on a fifth of one lane of the Illumina MiSeq (MS-102-1001 MiSeq Reagent Kit (300-cycles - PE)(version 1)) using a paired-end run with 76 + 7 cycles and two seven base pair index reads.50 An indexed control PhiX 174 library was spiked-in to yield 2-3% control reads (index 5'-TTGCCGC-3'). Base-calling was performed with Bustard (reference) applying a cycle independent correction for cross-talk, followed by the correction of phasing and pre-phasing. A minimum base quality score of 10 was required in both index reads. The full-length molecule sequences were reconstructed by merging the paired-end reads with the requirement that the forward and reverse sequence reads overlapped by at least 11 bp.53 The adaptors were then removed and these sequences were used for further analysis.
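The idea of reconstructing a full-length molecule from an overlapping read pair can be sketched as follows. This is a deliberately simplified illustration, not the cited merging method: it assumes exact matching in the overlap, ignores base qualities, and assumes neither read runs past the end of the molecule into adaptor sequence. The function names and the 50 bp test fragment are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=11):
    """Merge a read pair by the longest exact overlap of at least min_overlap bases.

    Simplification: real mergers score mismatches using base qualities;
    this sketch only accepts a perfect match in the overlapping region.
    Returns None when no sufficient overlap is found.
    """
    reverse_rc = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_rc))
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == reverse_rc[:overlap]:
            return forward + reverse_rc[overlap:]
    return None


# Hypothetical 50 bp molecule read 40 bases from each end (30 bp overlap).
fragment = "ACGTTGCAAGGCTTAACCGGATATCGCGTACGATTGCCAATGGCTAGCTA"
forward_read = fragment[:40]
reverse_read = reverse_complement(fragment[10:])
print(merge_pair(forward_read, reverse_read) == fragment)
```

Requiring a minimum overlap guards against joining reads that match only by chance over a few bases, which matters for short ancient DNA fragments.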

Ancient mtDNA assembly

The total sequencing of the libraries from phalanx and femur yielded 174,426 and 550,722 merged reads, respectively. After filtering with a map quality filter of 30 and a length filter of 35, 51,466 and 484,491 of the reads from these two libraries could be aligned to the revised Cambridge Reference Sequence (rCRS; NCBI reference sequence NC_012920.1)54 using an iterative mapping assembler (MIA)55 with a position-specific scoring matrix that handles the nucleotide misincorporation patterns found in ancient DNA sequences. To remove PCR duplicates, we built a consensus from sequences with identical start and end coordinates by retaining the base with the highest quality score at each position in the alignment. The average length of the Liangdao DNA molecules is 50 bp (Figure S4A). The mtDNA coverage as determined from unique sequences is 245.9 fold for the phalanx and 32.9 fold for the femur. The consensus sequences obtained from both samples are identical. Since the phalanx shows the better preservation of the two samples (163.9 fold coverage/mg bone on average), the library prepared from the phalanx was used for further mtDNA analysis.

Previous studies reported a GC-bias in sequences generated from libraries prepared with the double stranded library method56 and an AT-bias in the single strand library method.47 Here we also see that bias decreases with increasing fragment size (Figure S4B). With the single-stranded library preparation method, we find a similar decrease in GC-content with fragment size, but overall the GC-content is similar to the genome average (42%) throughout the range of fragment sizes.

The authenticity of this mtDNA genome sequence was determined in two steps: using the majority base at each position of the sequence, and the damage pattern.

1. Majority base: In order to check for contamination and/or errors induced by nucleotide misincorporations, the coverage for each position and the proportion of reads that matched the consensus base at each position are plotted in Figure S5. The average frequency of the majority base at each position is 98.6%. The lowest coverage across the mtDNA genome is 25 fold. The consensus support was below 80% for only 2 out of 16,566 positions. One of these was incorrectly aligned, and the other one showed 11 out of 50 sequences with a C -> T mismatch close to the end, strongly suggesting that these substitutions represent nucleotide misincorporations due to cytosine deamination. Hence, the majority of the mtDNA sequence is from the same individual. To assess more thoroughly if the mtDNA comes from a single individual, we focused on one position where the Liangdao consensus sequence differs from 99% of 311 complete mtDNA sequences from around the world.57 Among 98 distinct DNA fragments that cover this position, none differs from the consensus sequence, indicating the vast majority (96.2-100%, 95% CI) of the DNA fragments come from one mtDNA genome.

2. Damage pattern: To assess the damage information, we focus on the C -> T changes at the 5’ ends and 3’ ends in the single stranded DNA, especially CpG -> TpG substitutions. The ratio of CpG -> TpG substitutions at both fragment ends (>50%) indicates the presence of 5’ and 3’ single stranded DNA overhangs carrying many 5-methyl-cytosines, which is a characteristic of ancient DNA47 (Figure S6).

Thus, we conclude that the mitochondrial genome sequence of the Liangdao specimen is likely to be derived from one ancient individual.
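The interval quoted for the 98 fragments that all match the consensus can be explored with a short worked example. The text does not say which interval method was used; the sketch below uses the Wilson score interval as an assumption, since for 98 of 98 matches it gives a lower bound that rounds to 96.2%. The function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.959964 for 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# 98 of 98 fragments covering the diagnostic position match the consensus.
lower, upper = wilson_interval(98, 98)
print(f"{lower * 100:.1f}% to {upper * 100:.1f}%")
```

When every observation is a match, the Wilson lower bound reduces to n / (n + z^2), so the bound tightens only slowly as more fragments are added.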

Appendix C: High-throughput sequencing of Taiwanese mtDNA genomes

A total of 565 samples were processed for high-throughput sequencing on the Illumina GAIIx platform (Illumina Inc., San Diego CA) as described previously.15 In total 255,160,433 reads were generated, of which 62,368,213 reads mapped to the rCRS. Four highland Formosan samples (two Saisiat and two Tsou) and 11 non-Formosans (five Hakka and six Tao) did not have sufficient reads and were removed from further analysis. Of the 550 remaining samples, 10% were randomly re-sequenced and these yielded identical consensus sequences. The mean coverage was 504-fold per sequence (99.5% of samples >10-fold coverage; Figure S7). This allowed us to achieve a high quality dataset that was only missing 2.7% (26/949) of the total polymorphic sites, and these sites were removed from all sequences during analysis. During further sequence cleaning we removed 81 sites: poly-C stretch of hypervariable segment 2 (HVS-II; nucleotide positions, np 303-317); CA-repeat (np 514-523); C-stretch 1 (np 568-573); 12S rRNA (np 956-965); historical site (np 3107); C-stretch 2 (np 5895-5899); 9-bp deletion/insertion (np 8272-8289); and poly-C stretch of hypervariable segment 1 (HVS-I; np 16180-16195). All sequences were aligned to the rCRS using MUSCLE.58 Haplogroups were assigned by Haplogrep59 based on Phylotree v15.60
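The list of excluded regions can be checked and applied with a few lines of code. The sketch below builds the set of masked positions from the ranges listed above (1-based, inclusive), confirms that they add up to 81 sites, and drops those columns from a sequence. It assumes the sequence is already in rCRS coordinates with no alignment gaps; the function names and the placeholder sequence are hypothetical.

```python
# Excluded regions as listed in the text (1-based, inclusive nucleotide positions).
MASKED_RANGES = [
    (303, 317),      # poly-C stretch, HVS-II
    (514, 523),      # CA-repeat
    (568, 573),      # C-stretch 1
    (956, 965),      # 12S rRNA
    (3107, 3107),    # historical site
    (5895, 5899),    # C-stretch 2
    (8272, 8289),    # 9-bp deletion/insertion
    (16180, 16195),  # poly-C stretch, HVS-I
]


def masked_positions(ranges):
    """Expand inclusive (start, end) ranges into a set of 1-based positions."""
    positions = set()
    for start, end in ranges:
        positions.update(range(start, end + 1))
    return positions


def mask_sequence(seq, positions):
    """Drop the masked 1-based positions from a sequence in reference coordinates."""
    return "".join(base for i, base in enumerate(seq, start=1) if i not in positions)


excluded = masked_positions(MASKED_RANGES)
print(len(excluded))  # number of removed sites

# Placeholder sequence of rCRS length (assumption: 16,569 bp), for illustration only.
placeholder = "A" * 16569
print(len(mask_sequence(placeholder, excluded)))
```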

To examine the quality of the Taiwanese dataset, we performed 10,000 resamplings of haplotypes/haplogroups at incremental sample sizes across tribes to observe whether, for example, the lower sample size of the Saisiat tribe affects the results. Figure S8 shows that the Saisiat curve does increase more rapidly (especially for haplogroups); however, overall the curves do not differ among groups. Moreover, the Formosan curves are more saturated than the Han curves, suggesting increased representation of Formosan sequences. Another indication of adequate sampling of highlanders is the association between haplotype diversity and census size61 per tribe for 1906, 1964, and 2004 (Spearman rho~0.63 and P=0.045, 0.045, 0.049, respectively). We infer that the introduction of ‘foreign’ haplotypes via recent migration has not greatly influenced patterns of highland Formosan mtDNA diversity. Figure S9 shows that the mtDNA genetic distances of highlander tribes cluster geographically and also correspond to a previous study of Taiwan aboriginal mtDNA variation.32 For the latter, both studies are scaled to 713 bp that encompass HVS-I and HVS-II, and between-study tribal ΦST ranges from 0.005 to 0.07. We find the studies are highly comparable, with slightly increased between-study variation in southern Formosan groups that probably reflects differences in sampling locations.
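The resampling check described above can be sketched as a rarefaction curve: for each sample size, draw that many individuals repeatedly and record the mean number of distinct haplotypes (or haplogroups) observed. The sketch below is a minimal illustration, assuming sampling without replacement and a fixed random seed; the text does not state either detail, and the function name is hypothetical.

```python
import random
from statistics import mean


def rarefaction_curve(labels, sizes, n_resamples=10_000, seed=0):
    """Mean number of distinct labels seen at each subsample size.

    labels: one haplotype (or haplogroup) label per sampled individual.
    sizes:  subsample sizes to evaluate (each must not exceed len(labels)).
    Assumption: individuals are drawn without replacement.
    """
    rng = random.Random(seed)
    curve = {}
    for size in sizes:
        if size > len(labels):
            raise ValueError("subsample size exceeds the number of individuals")
        counts = [len(set(rng.sample(labels, size))) for _ in range(n_resamples)]
        curve[size] = mean(counts)
    return curve


if __name__ == "__main__":
    # Toy data: six individuals carrying three haplotypes.
    tribe = ["h1", "h1", "h1", "h2", "h2", "h3"]
    print(rarefaction_curve(tribe, sizes=[1, 2, 4, 6], n_resamples=1000))
```

A curve that flattens as the subsample size approaches the full sample suggests that additional sampling would add few new haplotypes, which is the sense in which the text calls a curve "saturated".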

Appendix D: Approximate Bayesian Computation Simulations

Simulation 1: Into/Out-of-Taiwan

To reconstruct the history of Neolithic Taiwan, we divided the major events into 3 stages: Into Taiwan, Formosan phylogeny, and Out-of-Taiwan. For Into Taiwan, the Han (n=95) and [...] model assumes an ancestral population that split into two populations at time (T) in the past. We modeled a population expansion, as indicated by the significantly negative Tajima’s D values for the Han and Formosan (-2.34 and -1.64, respectively) and by the Bayesian skyline plots (Figures S11 and S19). The ancestral sizes are calculated as a fraction (range 0 to 1) of the present-day size, so they cannot exceed it. For Out-of-Taiwan, the Formosan mtDNA sequences (n=361) from this study and Filipino complete mtDNA genomes (n=67) from a recent study16 were used. To capture the signal of ancient migration from Taiwan to the Philippines, we selected six northern Filipino groups (Ivatan, Ifugao, Ibaloi, Kankanaey, Kalangoya, Bugkalot) of non-Negrito ancestry that are geographically closest to Taiwan. The model (Figure S13) assumes a distant ancestor that split into two at time (T1), followed by a unidirectional migration (m) representing out-of-Taiwan at time (T2) in the past. We also allowed for population expansion, as indicated by the significantly negative Tajima’s D values (-1.63, -1.30, respectively) and Bayesian skyline plots16 (Figure S11).
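Because the expansion signal in both models rests on Tajima’s D, a short Python sketch of the statistic may help. It follows the standard definition (the difference between the mean number of pairwise differences and S/a1, scaled by its estimated standard deviation). The input format (equal-length aligned strings, every column treated as a site) is an assumption made for illustration, and this is not the software used in the study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    if n < 3:
        raise ValueError("need at least 3 sequences")
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    # Standard constants of the statistic.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy star-like sample: every mutation is a singleton, as expected
    # after a recent expansion, so D should be negative.
    sample = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
    print(tajimas_d(sample))
```

An excess of rare variants, as in the toy sample, pushes the mean pairwise difference below S/a1 and yields a negative D, which is the pattern the text interprets as population expansion.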

The choice of priors (Table S2):

1. T, set between the Neolithic settlement of Taiwan 6 ka42 and the earliest domestication of foxtail millet about 12 ka in north China.38

2. Han Ne, the entire Han population in Taiwan is about 20 million; however, we sampled only from south Taiwan, so we estimate an effective size of 2 million divided by 6 (half each from males and females, then a third as reproductive individuals in the population), giving a prior range up to 330,000.

3. Formosan Ne, since we sampled throughout Taiwan, the maximum effective size is based on the 2004 census of total aborigines of about 500,000 divided by 6.

4. Philippines Ne, the total number of indigenous peoples is about 12 million from the 2005 census; however, since we sampled only north Luzon, we estimate a maximum effective size of 1 million divided by 6.

5. Ancestral Ne, as explained, is estimated as a fraction of the descendants (to allow for population expansion) and thus has an upper limit of the highest effective size.

6. T1, the non-Negrito indigenous populations of the Philippines are Austronesian speakers, so we assume they last shared ancestry anytime from the Neolithic settlement of Taiwan 6 ka42 up to the [...]

7. T2, anytime from the present to the Neolithic settlement of Taiwan 6 ka.42

8. m, migration rate ranges from 0 to 1.

9. μ, based on credible intervals of the whole mtDNA genome rate calibrated to archaic humans and other ancient DNA under the relaxed clock method.23
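The arithmetic behind the effective-size upper bounds in the list above can be written out as a small Python sketch. The divisor of 6 is taken from the text (one half for a single sex, then one third as reproductive individuals). The function name, the dictionary layout, and the lower bounds are illustrative assumptions; Table S2 holds the actual prior ranges.

```python
def max_effective_size(census_size, sex_fraction=0.5, reproductive_fraction=1 / 3):
    """Upper bound on effective size: census size divided by 6 by default."""
    return census_size * sex_fraction * reproductive_fraction


# Census figures quoted in the text; the resulting bounds are upper limits only.
CENSUS = {
    "Han (south Taiwan)": 2_000_000,
    "Formosan": 500_000,
    "Philippines (north Luzon)": 1_000_000,
}

# Ranges stated in the text for the other parameters, as (lower, upper).
# T1 is omitted because its upper bound is not recoverable from the text.
PRIOR_RANGES = {
    "T (years)": (6_000, 12_000),
    "T2 (years)": (0, 6_000),
    "m": (0.0, 1.0),
}

if __name__ == "__main__":
    for group, census in CENSUS.items():
        print(group, round(max_effective_size(census)))
```

The Han bound of roughly 333,000 matches the "up to 330,000" quoted in the list, and the Formosan bound of roughly 83,000 is close to the 80,000 that is later divided among 8 tribes.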

For Into Taiwan, 5,568,411 observations were simulated then log transformed for ABC local linear regression14 using 7 summary statistic categories: number of haplotypes; haplotype diversity; number of polymorphic sites; Tajima’s D; mean number of pairwise differences; pairwise Φst; and % variation among groups and within populations from the analysis of molecular variance (AMOVA). We used Tajima’s D to specifically capture the signal of population expansion. For Out-of-Taiwan, 5,159,957 observations were simulated following the same procedure, except we also introduced the pairwise haplotype sharing statistic to capture the migration between Taiwan and the Philippines. In both instances, we retained the top 10,000 simulations (tolerance of 0.2%) as this was most highly correlated with the 1000 pseudo-observed values, as described previously.63 Table S2 shows that all R2 values exceed 10%, indicating that our parameters are estimated reliably, where 10% is suggested to be the threshold below which a parameter is unreliably estimated.63 The average coverage (proportion of true values within the estimated interval) is 81% and the average factor 2 (proportion of estimates within 50-200% of the true value) is 95%. The high overlap suggests convergence. The average estimates of divergence times have 3% bias and 15% relative mean square error (RMSE), effective sizes have 9% bias and 15% RMSE, and the substitution rate has 2% bias and 4% RMSE. The parameter with highest variability is the migration rate. Table S3 shows the posteriors fit the observed values better than the priors, e.g., for Into Taiwan, the fit of Pi is 20-40 times better in terms of bias and RMSE, and that for Tajima’s D is about 6 times better. For Out-of-Taiwan, the fit for Tajima’s D is about 3 times better, and that for pairwise haplotype sharing is up to 5 times better.
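The retention step described above (keep the simulations closest to the observed summary statistics, up to a tolerance) can be sketched in Python. This is a simplified rejection step only: it omits the local linear regression adjustment, and it assumes every statistic is strictly positive so that a log transform is defined (a statistic such as Tajima’s D would need shifting first). The function name and data layout are hypothetical.

```python
import math


def abc_reject(observed, simulations, tolerance):
    """Keep the fraction `tolerance` of simulations closest to the observed data.

    observed:    list of positive summary statistics.
    simulations: list of (parameters, statistics) pairs.
    Distance is Euclidean on log-transformed statistics (an assumption).
    """
    def log_distance(stats):
        return math.sqrt(
            sum((math.log(s) - math.log(o)) ** 2 for s, o in zip(stats, observed))
        )

    ranked = sorted(simulations, key=lambda sim: log_distance(sim[1]))
    n_keep = max(1, round(tolerance * len(simulations)))
    return ranked[:n_keep]


if __name__ == "__main__":
    observed = [10.0, 0.5]
    sims = [
        ({"T": 1}, [10.0, 0.5]),
        ({"T": 2}, [100.0, 0.5]),
        ({"T": 3}, [11.0, 0.5]),
        ({"T": 4}, [1.0, 0.5]),
    ]
    for params, _stats in abc_reject(observed, sims, tolerance=0.5):
        print(params)
```

For scale, retaining 10,000 of 5,568,411 simulations corresponds to a tolerance of about 0.0018, which the text rounds to 0.2%.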

Simulation 2: Formosan phylogeny

For the previous simulations the Formosan mtDNA sequences (n=361) were combined; here they are split into eight groups based on language. We generated all possible trees so as to conduct an unbiased and exhaustive search for the best tree (i.e., branching order and branch lengths), then re-simulated that tree for parameter estimates (effective sizes, [...] possible solutions are generated for bifurcating rooted trees with labeled leaves given by the formula,64 for n≥2:

$$N(n) = \frac{(2n-3)!}{2^{\,n-2}\,(n-2)!}$$

For the 8 Formosan groups there are 135,135 unique trees. To simplify, we calculated 423 fixed topologies that encapsulate permutations of leaves in each topology. Every tree was reiterated 200 times for a total of 27,027,000 stochastic draws. For the first step, we generated 17,596,318 observations (not every tree converged to a TMRCA; however, every tree occurred at least 20% of the time) using unbiased, uniform priors, where only 6 coalescent times change (since the first divergence was fixed at 6000 BP) and each Formosan tribal size is set to 10,000 (same prior as for Simulation 1, here 80,000 divided by 8 tribes). We filtered by pairwise haplotype sharing (informative about the north-south cline), which has 64 values (private and shared haplotypes for 8 tribes) and thus offers high specificity. Using this, we retained 8 candidate shapes after removing those outside 90-110% of the true value, and these were: 143 (#44), 149 (#62), 272 (#899), 275 (#112), 296 (#719), 296 (#721), 349 (#39), and 351 (#447). The first number is the topology (1-423) and the hash is the specific permutation of leaves, e.g. topology 149 has 1261 permutations that are thus labeled #1 to #1261 (trees are not shown; information is available on request). Finally, to narrow to the best tree, we required that Saisiat show the highest pairwise difference (Figure S14).
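The tree counts quoted above can be verified with a few lines of Python. The number of rooted bifurcating trees with n labeled leaves equals the product of the odd numbers from 1 to 2n-3, a standard closed form; the function name is illustrative.

```python
def count_rooted_bifurcating_trees(n_leaves):
    """Number of rooted bifurcating trees with labeled leaves, for n_leaves >= 2.

    Computed as the product 1 * 3 * 5 * ... * (2n - 3).
    """
    if n_leaves < 2:
        raise ValueError("need at least 2 leaves")
    count = 1
    for odd in range(3, 2 * n_leaves - 2, 2):  # 3, 5, ..., 2n - 3
        count *= odd
    return count


if __name__ == "__main__":
    trees = count_rooted_bifurcating_trees(8)
    print(trees)        # unique trees for 8 groups
    print(trees * 200)  # total draws at 200 iterations per tree
```

The rapid growth of this count with n is why an exhaustive search is feasible for 8 groups but quickly becomes impractical for larger numbers of populations.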

The choice of priors (Table S5):

1. T1-T7, based on the Neolithic settlement of Taiwan, are at most 6 ka.42

2. SAI-PAI Ne, Formosan population of about 500,000 divided by 6 (estimate of reproductive individuals) then averaged over 8 tribes.

3. m, based on inter-tribal marriages being <5% in public health surveys.65

4. μ, based on credible intervals of the whole mtDNA genome rate calibrated to archaic humans and ancient DNA under a relaxed clock.23

We determined the best tree to be the one in which the north diverges first, i.e. 149 (#62), as opposed to the central diverging first, 296 (#721), because the observed data show that the north has the highest nucleotide diversity. Re-simulations of 149 (#62) now show improved fit to nucleotide diversity (Figure S15). Figure S16 further confirms the authenticity of 149 (#62) using BEAST to show that Bayesian posteriors match the cline in nucleotide diversity. For step 2, tree 149 (#62) was simulated for 5,793,246 observations then log transformed for ABC local linear regression14 using 4 summary statistic categories: number of haplotypes; haplotype diversity; mean number of pairwise differences; and number of polymorphic positions. We retained the top 1000 simulations (tolerance of about 0.02%) where the log Euclidean distance was <1 and where a high correlation with 1000 pseudo-observed values was obtained. Table S5 shows all parameters are estimated reliably (R2 >0.1). There is a high overlap between true values and estimates (average coverage=97% and average factor 2=89%). The average bias is 11% (divergence times vs. effective sizes is 5% vs. 13%) and average RMSE is 42%. Figure S17 and Table S6 demonstrate that posteriors better fit the observed values than the priors. For example, the average bias and RMSE for S are 16-20 times better, and for Pi are 20-30 times better. We also noticed that the pooled mean effective tribal sizes by region (north is 2226, central is 3179, and south is 4610) are consistent with the 2004 census in that more aborigines reside in south Taiwan (Spearman correlation to the 2004 census is marginally significant, P=0.08).
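The validation measures reported for the pseudo-observed data sets (bias, RMSE, factor 2, coverage) can be sketched as follows. Factor 2 and coverage follow the descriptions given in the text; the relative forms of bias and RMSE are an assumption made here for illustration, since the exact formulas are not stated. True values are assumed to be positive.

```python
from math import sqrt


def validation_metrics(true_values, estimates):
    """Relative bias, relative RMSE, and factor 2 for paired true/estimated values."""
    n = len(true_values)
    rel_errors = [(est - true) / true for true, est in zip(true_values, estimates)]
    bias = sum(rel_errors) / n
    rmse = sqrt(sum(err * err for err in rel_errors) / n)
    # Factor 2: share of estimates within 50-200% of the true value.
    factor2 = sum(
        0.5 * true <= est <= 2.0 * true for true, est in zip(true_values, estimates)
    ) / n
    return {"bias": bias, "rmse": rmse, "factor2": factor2}


def coverage(true_values, intervals):
    """Share of true values that fall inside their (lower, upper) interval."""
    hits = sum(lo <= true <= hi for true, (lo, hi) in zip(true_values, intervals))
    return hits / len(true_values)


if __name__ == "__main__":
    truth = [100.0, 100.0, 100.0, 100.0]
    estimates = [100.0, 150.0, 40.0, 210.0]
    print(validation_metrics(truth, estimates))
    print(coverage([1.0, 2.0], [(0.0, 2.0), (3.0, 4.0)]))
```

In practice each pseudo-observed data set contributes one (true, estimate, interval) triple per parameter, and the measures are averaged per parameter class, as in the divergence-time versus effective-size breakdown above.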


References

1. Gray, R.D., Drummond, A.J., and Greenhill, S.J. (2009). Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479-483.

2. Soares, P., Rito, T., Trejaut, J., Mormina, M., Hill, C., Tinkler-Hundal, E., Braid, M., Clarke, D.J., Loo, J.H., Thomson, N., et al. (2011). Ancient voyaging and Polynesian origins. American journal of human genetics 88, 239-247.

3. Mirabal, S., Cadenas, A.M., Garcia-Bertrand, R., and Herrera, R.J. (2013). Ascertaining the role of Taiwan as a source for the Austronesian expansion. American journal of physical anthropology 150, 551-564.

4. Li, H., Wen, B., Chen, S.J., Su, B., Pramoonjago, P., Liu, Y., Pan, S., Qin, Z., Liu, W., Cheng, X., et al. (2008). Paternal genetic affinity between Western Austronesians and Daic populations. BMC evolutionary biology 8, 146.

5. Sagart, L. (2008). The expansion of Setaria farmers in East Asia: A linguistic and archaeological model. In Past Human Migrations in East Asia: Matching Archaeology, Linguistics and Genetics, A. Sanchez-Mazas, R. Blench, M. Ross, I. Peiros, and M. Lin, eds. (Routledge).

6. Blust, R. (1999). Selected papers from the Eighth International Conference on Austronesian Linguistics. In Symposium series of the Institute of Linguistics, Academia Sinica, Taipei, Taiwan, E. Zeitoun and P. Li, eds., pp 31-94.

7. Bandelt, H.J., Forster, P., and Rohl, A. (1999). Median-joining networks for inferring intraspecific phylogenies. Molecular biology and evolution 16, 37-48.

8. Drummond, A.J., Suchard, M.A., Xie, D., and Rambaut, A. (2012). Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular biology and evolution 29, 1969-1973.

9. Minin, V.N., Bloomquist, E.W., and Suchard, M.A. (2008). Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics. Molecular biology and evolution 25, 1459-1471.

10. Darriba, D., Taboada, G.L., Doallo, R., and Posada, D. (2012). jModelTest 2: more models, new heuristics and parallel computing. Nature methods 9, 772.

11. Soares, P., Ermini, L., Thomson, N., Mormina, M., Rito, T., Rohl, A., Salas, A., Oppenheimer, S., Macaulay, V., and Richards, M.B. (2009). Correcting for purifying selection: an improved human mitochondrial molecular clock. American journal of human genetics 84, 740-759.

12. Jombart, T., Devillard, S., and Balloux, F. (2010). Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC genetics 11, 94.

13. Excoffier, L., and Foll, M. (2011). fastsimcoal: a continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios. Bioinformatics 27, 1332-1334.

14. Beaumont, M.A., Zhang, W., and Balding, D.J. (2002). Approximate Bayesian computation in population genetics. Genetics 162, 2025-2035.

15. Gunnarsdottir, E.D., Li, M., Bauchet, M., Finstermeier, K., and Stoneking, M. (2011). High-throughput sequencing of complete human mtDNA genomes from the Philippines. Genome research 21, 1-11.

16. Delfin, F., Min-Shan Ko, A., Li, M., Gunnarsdottir, E.D., Tabbada, K.A., Salvador, J.M., Calacal, G.C., Sagum, M.S., Datar, F.A., Padilla, S.G., et al. (2013). Complete mtDNA genomes of Filipino ethnolinguistic groups: a melting pot of recent and ancient lineages in the Asia-Pacific region. European journal of human genetics : EJHG.

17. Jinam, T.A., Hong, L.C., Phipps, M.E., Stoneking, M., Ameen, M., Edo, J., and Saitou, N. (2012). Evolutionary history of continental southeast Asians: "early train" hypothesis based on genetic analysis of mitochondrial and autosomal DNA data. Molecular biology and evolution 29, 3513-3527.

18. Gunnarsdottir, E.D., Nandineni, M.R., Li, M., Myles, S., Gil, D., Pakendorf, B., and Stoneking, M. (2011). Larger mitochondrial DNA than Y-chromosome differences between matrilocal and patrilocal groups from Sumatra. Nature communications 2, 228.

19. Soares, P., Trejaut, J.A., Loo, J.H., Hill, C., Mormina, M., Lee, C.L., Chen, Y.M., Hudjashov, G., Forster, P., Macaulay, V., et al. (2008). Climate change and postglacial human dispersals in southeast Asia. Molecular biology and evolution 25, 1209-1218.

20. Friedlaender, J.S., Friedlaender, F.R., Hodgson, J.A., Stoltz, M., Koki, G., Horvat, G., Zhadanov, S., Schurr, T.G., and Merriwether, D.A. (2007). Melanesian mtDNA complexity. PloS one 2, e248.

21. Peng, M.S., Palanichamy, M.G., Yao, Y.G., Mitra, B., Cheng, Y.T., Zhao, M., Liu, J., Wang, H.W., Pan, H., Wang, W.Z., et al. (2011). Inland post-glacial dispersal in East Asia revealed by mitochondrial haplogroup M9a'b. BMC biology 9, 2.

22. Shapiro, B., Ho, S.Y., Drummond, A.J., Suchard, M.A., Pybus, O.G., and Rambaut, A. (2011). A Bayesian phylogenetic method to estimate unknown sequence ages. Molecular biology and evolution 28, 879-887.

23. Fu, Q., Mittnik, A., Johnson, P.L., Bos, K., Lari, M., Bollongino, R., Sun, C., Giemsch, L., Schmitz, R., Burger, J., et al. (2013). A revised timescale for human evolution based on ancient mitochondrial genomes. Current biology : CB 23, 553-559.

24. Excoffier, L. (2004). Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Molecular ecology 13, 853-864.

25. Kong, Q.P., Sun, C., Wang, H.W., Zhao, M., Wang, W.Z., Zhong, L., Hao, X.D., Pan, H., Wang, S.Y., Cheng, Y.T., et al. (2011). Large-scale mtDNA screening reveals a surprising matrilineal complexity in east Asia and its implications to the peopling of the region. Molecular biology and evolution 28, 513-522.

26. Hill, C., Soares, P., Mormina, M., Macaulay, V., Clarke, D., Blumbach, P.B., Vizuete-Forster, M., Vizuete-Forster, P., Bulbeck, D., Oppenheimer, S., et al. (2007). A mitochondrial stratigraphy for island southeast Asia. American journal of human genetics 80, 29-43.

27. Vilar, M.G., Chan, C.W., Santos, D.R., Lynch, D., Spathis, R., Garruto, R.M., and Lum, J.K. (2013). The origins and genetic distinctiveness of the Chamorros of the Marianas Islands: an mtDNA perspective. American journal of human biology : the official journal of the Human Biology Council 25, 116-122.

28. Tofanelli, S., Bertoncini, S., Castri, L., Luiselli, D., Calafell, F., Donati, G., and Paoli, G. (2009). On the origins and admixture of Malagasy: new evidence from high-resolution analyses of paternal and maternal lineages. Molecular biology and evolution 26, 2109-2124.

29. Ricaut, F.X., Razafindrazaka, H., Cox, M.P., Dugoujon, J.M., Guitard, E., Sambo, C., Mormina, M., Mirazon-Lahr, M., Ludes, B., and Crubezy, E. (2009). A new deep branch of eurasian mtDNA macrohaplogroup M reveals additional complexity regarding the settlement of Madagascar. BMC genomics 10, 605.


30. Delfin, F., Myles, S., Choi, Y., Hughes, D., Illek, R., van Oven, M., Pakendorf, B., Kayser, M., and Stoneking, M. (2012). Bridging near and remote Oceania: mtDNA and NRY variation in the Solomon Islands. Molecular biology and evolution 29, 545-564.

31. Tabbada, K.A., Trejaut, J., Loo, J.H., Chen, Y.M., Lin, M., Mirazon-Lahr, M., Kivisild, T., and De Ungria, M.C. (2010). Philippine mitochondrial DNA diversity: a populated viaduct between Taiwan and Indonesia? Molecular biology and evolution 27, 21-31.

32. Trejaut, J.A., Kivisild, T., Loo, J.H., Lee, C.L., He, C.L., Hsu, C.J., Lee, Z.Y., and Lin, M. (2005). Traces of archaic mitochondrial lineages persist in Austronesian-speaking Formosan populations. PLoS biology 3, e247.

33. Peng, M.S., Quang, H.H., Dang, K.P., Trieu, A.V., Wang, H.W., Yao, Y.G., Kong, Q.P., and Zhang, Y.P. (2010). Tracing the Austronesian footprint in Mainland Southeast Asia: a perspective from mitochondrial DNA. Molecular biology and evolution 27, 2417-2430.

34. Bellwood, P., Chambers, G., Ross, M., and Hung, H.-C. (2011). Are 'Cultures' Inherited? Multidisciplinary Perspectives on the Origins and Migrations of Austronesian-Speaking Peoples Prior to 1000 BC. In Investigating Archaeological Cultures: Material Culture, Variability, and Transmission, B. Roberts and M. Vander Linden, eds. (Springer), pp 321-353.

35. Qi, X., Cui, C., Peng, Y., Zhang, X., Yang, Z., Zhong, H., Zhang, H., Xiang, K., Cao, X., Wang, Y., et al. (2013). Genetic Evidence of Paleolithic Colonization and Neolithic Expansion of Modern Humans on the Tibetan Plateau. Molecular biology and evolution.

36. Bellwood, P. (2011). The Checkered Prehistory of Rice Movement Southwards as a Domesticated Cereal—from the Yangzi to the Equator. Rice 4, 93-103.

37. Tsang, C.H. (2005). Recent discoveries at the Tapenkeng culture sites in Taiwan: Implications for the problem of Austronesian origins. In The Peopling of East Asia: Putting Together Archaeology, Linguistics and Genetics, L. Sagart, R. Blench, and A. Sanchez-Mazas, eds. (Routledge/Curzon), pp 63-73.

38. Yang, X., Wan, Z., Perry, L., Lu, H., Wang, Q., Zhao, C., Li, J., Xie, F., Yu, J., Cui, T., et al. (2012). Early millet use in northern China. Proceedings of the National Academy of Sciences of the United States of America 109, 3726-3730.

39. Molina, J., Sikora, M., Garud, N., Flowers, J.M., Rubinstein, S., Reynolds, A., Huang, P., Jackson, S., Schaal, B.A., Bustamante, C.D., et al. (2011). Molecular evidence for a single evolutionary origin of domesticated rice. Proceedings of the National Academy of Sciences of the United States of America 108, 8351-8356.

40. Diamond, J., and Bellwood, P. (2003). Farmers and their languages: the first expansions. Science 300, 597-603.

41. Rolett, B.V., Zheng, Z., and Yue, Y. (2011). Holocene sea-level change and the emergence of Neolithic seafaring in the Fuzhou Basin (Fujian, China). Quaternary science reviews 30, 788-797.

42. Bellwood, P. (1978). Man’s Conquest of the Pacific: The Prehistory of Southeast Asia and Oceania. (Oxford Univ. Press, New York).

43. Bellwood, P. (1997). Prehistory of the Indo-Malaysian Archipelago. (Honolulu: University of Hawai'i Press).

44. Li, P.J.K. (2008). Time perspective of Formosan Aborigines. In Past Human Migrations in East Asia: Matching Archaeology, Linguistics and Genetics, A. Sanchez-Mazas, R. Blench, M. Ross, I. Peiros, and M. Lin, eds. (Routledge), pp 211-218.

45. Ross, M. (2009). Proto Austronesian verbal morphology: A reappraisal. In Austronesian historical linguistics and culture history: a festschrift for Robert Blust, A. Adelaar and A. Pawley, eds. (Canberra: Pacific Linguistics), pp 295-326.

46. Rohland, N., Siedel, H., and Hofreiter, M. (2010). A rapid column-based ancient DNA extraction method for increased sample throughput. Molecular ecology resources 10, 677-683.

47. Meyer, M., Kircher, M., Gansauge, M.T., Li, H., Racimo, F., Mallick, S., Schraiber, J.G., Jay, F., Prufer, K., de Filippo, C., et al. (2012). A High-Coverage Genome Sequence from an Archaic Denisovan Individual. Science.

48. Meyer, M., and Kircher, M. (2010). Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harbor protocols 2010, pdb prot5448.

49. Dabney, J., and Meyer, M. (2012). Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. BioTechniques 52, 87-94.

50. Kircher, M., Sawyer, S., and Meyer, M. (2012). Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic acids research 40, e3.

51. Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.J., Chen, Z., et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376-380.

52. Fu, Q., Meyer, M., Gao, X., Stenzel, U., Burbano, H.A., Kelso, J., and Paabo, S. (2013). DNA analysis of an early modern human from Tianyuan Cave, China. Proceedings of the National Academy of Sciences of the United States of America 110, 2223-2227.

53. Reich, D., Patterson, N., Kircher, M., Delfin, F., Nandineni, M.R., Pugach, I., Ko, A.M., Ko, Y.C., Jinam, T.A., Phipps, M.E., et al. (2011). Denisova admixture and the first modern human dispersals into Southeast Asia and Oceania. American journal of human genetics 89, 516-528.

54. Andrews, R.M., Kubacka, I., Chinnery, P.F., Lightowlers, R.N., Turnbull, D.M., and Howell, N. (1999). Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nature genetics 23, 147.

55. Briggs, A.W., Good, J.M., Green, R.E., Krause, J., Maricic, T., Stenzel, U., Lalueza-Fox, C., Rudan, P., Brajkovic, D., Kucan, Z., et al. (2009). Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science 325, 318-321.

56. Reich, D., Green, R.E., Kircher, M., Krause, J., Patterson, N., Durand, E.Y., Viola, B., Briggs, A.W., Stenzel, U., Johnson, P.L., et al. (2010). Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053-1060.

57. Green, R.E., Malaspinas, A.-S., Krause, J., Briggs, A.W., Johnson, P.L.F., Uhler, C., Meyer, M., Good, J.M., Maricic, T., Stenzel, U., et al. (2008). A Complete Neandertal Mitochondrial Genome Sequence Determined by High-Throughput Sequencing. Cell 134, 416-426.

58. Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research 32, 1792-1797.

59. Kloss-Brandstatter, A., Pacher, D., Schonherr, S., Weissensteiner, H., Binna, R., Specht, G., and Kronenberg, F. (2011). HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Human mutation 32, 25-32.

60. van Oven, M., and Kayser, M. (2009). Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Human mutation 30, E386-394.

61. Wang, J.Y. (1967). Population Change of Formosan Aborigines. In Academia Sinica Monographs. (Nankang, Taipei: Institute of Ethnology), p 36.

62. Gutenkunst, R.N., Hernandez, R.D., Williamson, S.H., and Bustamante, C.D. (2009). Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS genetics 5, e1000695.

63. Neuenschwander, S., Largiader, C.R., Ray, N., Currat, M., Vonlanthen, P., and Excoffier, L. (2008). Colonization history of the Swiss Rhine basin by the bullhead (Cottus gobio): inference under a Bayesian spatially explicit framework. Molecular ecology 17, 757-772.

64. Felsenstein, J. (1978). The Number of Evolutionary Trees. Systematic Zoology 27, 27-33.

65. Lin, C.F., Wang, J.D., Chen, P.H., Chang, S.J., Yang, Y.H., and Ko, Y.C. (2006). Predictors of betel quid chewing behavior and cessation patterns in Taiwan aborigines. BMC public health 6, 271.

Acknowledgements

We thank all those who donated samples, and gratefully acknowledge Dr. Yang Shui Sheng (Magistrate of Lienchiang County), Prof. Tsang Cheng-Hwa, the Liangdao Archeological Team, and Ayinuer Aximu-Petri for lab assistance. This research was supported by the Max Planck Society and Lienchiang County.

Figure 1. The Liangdao Man Skeleton

[Figure: photographs of the skeleton, labeled "Liangdao Man No. 1 (M01)" and "Excavated from test pit TP1".]

Side and top profile of the ancient specimen found in a supine flexed position without coffin.

Figure 2. Liangdao Man’s mtDNA haplogroup

[Figure: mtDNA tree connecting the rCRS (haplogroup H) with haplogroups R, N, L3, M, M9, E, E1, and E2, with the defining nucleotide positions marked on the branches.]

There are 29/29 complete matches from rCRS to haplogroup E (based on Phylotree v15), and 2/4 defining positions towards E1 (13254C, 14577C), and 2 additional positions that are not haplogroup defining (315.1C, 16519C). The Liangdao Man (black star) is haplogroup E (and ancestral to E1).

Figure 3. Haplogroup E among Austronesian speakers related to the Liangdao Man

[Figure: phylogeny of haplogroup E haplotypes and subclades (E1, E1a, E1a1, E1a1a, E1a1a1, E1a1a2, E1a2, E1b, E1b1, E2, E2a, E2b, E2b1), together with M9 and M9a, with polymorphic positions marked on the branches. Legend: Aboriginal Formosan, Philippines, Indonesia, Liangdao Man, Melanesia, Malaysia, Southern China.]

Shown are 67 haplotypes obtained from 104 whole mtDNA genome sequences belonging to haplogroup E, collected from this study and from published data from the Philippines, Malaysia, Indonesia and Melanesia, and two M9a sequences from Han in southern China.19 The Liangdao Man sequence (black star) is intermediate between E and E1. The Formosan sequences are most similar to the Liangdao Man sequence, where the closest sequences (red stars) differ by only four nucleotide changes; these are found in 2 Ami individuals (with differences at np 4248-6340-6620-10834) under E1a, and 1 Atayal individual (with differences at np 4248-6620-10834-14766) under E1a1. The polymorphic positions are indicated on branches; recurrent mutations are underlined.

Figure 4. Into and Out of Taiwan

[Figure: four panels (A-D). Recoverable labels: Frequency (%); color legend for private haplotypes; nucleotide diversity axis from 0.0018 to 0.0024; Divergence time (ka), 0 to 10; DAPC1 and DAPC2; North, Central, South. Groups shown: Saisiat, Atayal, Tsou, Bunun, Puyuma, Rukai, Paiwan, Ami, Makatao, Tao, Hakka, Minnan, Han Chinese, Malayo-Polynesian.]

(A) Map of Formosan highlander groups and DAPC plot; individuals are dots, groups are circles, color corresponds to sampled locations in the map with two Han groups and two lowland groups in gray. (B) Stepping-stone haplotype sharing across 12 groups; Han/lowland Taiwanese groups are in gray, the frequencies of haplotypes shared with other groups are colored according to the map. (C) Formosan nucleotide diversity declines from north (N) to south (S). (D) Formosan population relationships based on mtDNA sequences; dotted lines are non-Formosan groups.

Figure 5. Suggested migration route for Early Austronesian into and out of Taiwan, and the worldwide distribution of haplogroup E1 and Liangdao Man’s lineage across the Pacific

[Figure: two map panels (A, B). Recoverable labels: Fuzhou; Liangdao Man; oldest haplogroup E; foxtail millet domestication; rice domestication; rice and foxtail millet cultivation; route steps 1-4; North, Central, and South Formosan (Ami, Atayal, Bunun, Paiwan, Puyuma, Saisiat, Tsou); Philippines; Guam; Malagasy; Cham; Malaysia; Borneo and Sulawesi; Bali; Sumatra; Java; Flores; East Timor; Nusa Tenggara; Moluccas; PNG coast; Admiralty Islands; Bismarck Archipelago; Trobriand Islands.]

(A) Geographic regions in China of early foxtail millet domestication38 (shaded) delimited by Nanzhuangtou, Cishan, and Yuezhuang,33 and of rice domestication39 (shaded) in the Yangtze River Valley.34 (1) Early Austronesian in the Fuzhou region, (2) Entry into north Taiwan, (3) Rapid north-south dispersal along the western coast, and crop cultivation at Nanguanli,37 (4) One Austronesian language subgroup from Taiwan was ancestral to the Proto-Malayo-Polynesian language subgroup in the Philippines. (B) Frequency of haplogroup E1 across Taiwan, Madagascar, Island Southeast Asia,27 and Near
