CHAPTER 2. Ancient Nuclear Plastid DNA in the Yew Family
2.4 Discussion
2.4.3 Nupts Are Molecular Footprints for Studying Plastomic Evolution
Although mutation rates are relatively low in plant organellar genomes, norgs can serve as “molecular fossils” for genomic rearrangements (Leister, 2005). Similarly, the Taxaceae nupts identified in this study retain the ancestral plastomic organization. In other words, nupts are footprints that are valuable in reconstructing the evolutionary history of plastomic organization and rearrangements.
Dating nupts is critical for elucidating their evolution. For example, the estimated ages of the Cep-2, Cep-5, and Cep-6 nupts are 15.3, 54.1, and 70.8 MY, respectively. Remarkably, these ages conflict with the scenario of plastomic rearrangements, under which the transfer of Cep-2 predated those of both Cep-5 and Cep-6 (Figure 4). Two plastomic forms derived from trnQ-IR-mediated homologous recombination coexist in an individual of C. oliveri (Yi et al., 2013). This trnQ-IR is also present in the plastome of C. wilsoniana, as previously mentioned. We suspect that in C. wilsoniana, the younger Cep-2 nupt might have originated from a transferred fragment of the trnQ-IR-mediated isomeric plastome.
Most importantly, nupts can also help in probing RNA-editing sites and improving gene annotations. Figure 12 clearly reveals that the previously annotated rps8 of T.
mairei (vouchers NN014, WC052, and SNJ046) is truncated. Our newly predicted
initial codon, “ACG”, is located 48 bp upstream of the previously predicted site. This “ACG” initial codon was predicted to be corrected to “AUG” via C-to-U RNA editing because the corresponding sequences of the Tax-4 nupt and other conifers retain a normal initial codon of “ATG” (Figure 12). These data also imply that in T. mairei, the transfer of the Tax-4 nupt predates the T-to-C mutation at the second codon position in the initial codon of rps8.
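The predicted correction of the initial codon can be illustrated with a minimal Python sketch. It only demonstrates the arithmetic of a C-to-U edit at the second codon position; the function name edit_c_to_u and the 0-based position argument are illustrative assumptions, not part of the annotation pipeline used in this study.

```python
def edit_c_to_u(codon_dna: str, position: int) -> str:
    """Return the mRNA codon after a C-to-U edit at a 0-based position.

    Hypothetical helper for illustration only.
    """
    rna = codon_dna.upper().replace("T", "U")  # transcribe DNA letters to RNA
    if rna[position] != "C":
        raise ValueError("C-to-U editing requires a C at the edited position")
    return rna[:position] + "U" + rna[position + 1:]


genomic_start = "ACG"  # initial codon predicted in the T. mairei plastome
nupt_start = "ATG"     # initial codon retained by the Tax-4 nupt and other conifers

edited = edit_c_to_u(genomic_start, 1)
print(edited)                                   # AUG
print(edited == nupt_start.replace("T", "U"))  # True
```

The edited transcript codon matches the codon that the nupt retains at the DNA level, which is the comparison the text relies on.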
CHAPTER 3
Birth of Four Chimeric Plastid Gene Clusters in Sciadopitys verticillata
3.1 Introduction
Due to the loss of many genes in early endosymbiosis, plastomes are much reduced compared to their cyanobacterial counterparts (Ku et al., 2015). To date, plastomes have invariably retained a small handful of prokaryotic features, including the organization of genes into polycistronic transcription units resembling bacterial operons (Sugiura, 1992;
Wicke et al., 2011). A hallmark of seed plant plastomes is the presence of two 20- to 30-Kb IRs (hereafter referred to as “typical IRs,” including IRA and IRB), which typically contain four ribosomal RNA genes. However, a few exceptions have been reported.
For example, conifers—the largest gymnosperm group comprising cupressophytes and Pinaceae—have lost a typical IR copy from their plastomes (Raubeson and Jansen, 1992). Recent studies have further suggested that cupressophytes and Pinaceae might have lost different IR copies, with the former losing IRA and the latter losing IRB (Wu, Wang et al., 2011; Wu and Chaw, 2014).
Conifer plastomes are also characterized by extensive genomic rearrangements.
The plastome of Cryptomeria japonica—the first completed plastome of cupressophytes (Hirao et al., 2008)—experienced at least 12 inversions after its split from the basal gymnosperm clade, cycads, whose plastomes have remained virtually unchanged for 280 million years (Wu and Chaw, 2015). The co-existence of four different plastome forms among Pinaceae genera is associated with intra-plastomic recombination mediated by three specific types of short IRs (Wu, Lin et al., 2011). Furthermore,
Cephalotaxus oliveri (Cephalotaxaceae; Yi et al., 2013) and four Juniperus species
(Cupressaceae; Guo et al., 2014) harbor isomeric plastomes that differ from each other by an inversion possibly triggered by a trnQ-containing IR (“trnQ-IR”). Although conifer plastomes are highly rearranged (Wu and Chaw, 2014), disruptions in their operons are rare. Until recently, only one case had been reported in the plastome of Taxus mairei, in which the S10 operon (trnI-rpoA region) was disrupted into two separate
segments by a fragment of approximately 15 Kb (Hsu et al., 2014). However, the impact of such operon disruptions on plastid evolution remains poorly understood.
The 25 published cupressophyte plastomes available on GenBank (Dec 2015) represent four of the five cupressophyte families. However, no complete plastome is available for Sciadopityaceae. As part of our continuing efforts to decipher the diversity and evolution of conifer plastomes, we have completed and elucidated the plastome sequence of Sciadopitys. We found that the plastome of Sciadopitys is characterized by several unusual features. For the first time, this study reports the unusual shuffling of operons that results in the re-organization of plastid genes into new chimeric gene clusters.
3.2 Materials and Methods
3.2.1 DNA Extraction
Approximately 2 grams of fresh leaves were collected from an individual of
Sciadopitys verticillata (voucher Chaw 1496) growing in the Floriculture Experiment
Center, Taipei, Taiwan. The voucher specimen was deposited in the Herbarium of Biodiversity Research Center, Academia Sinica, Taipei (HAST). Total DNA was extracted from the leaves with 2X CTAB buffer (Stewart and Via, 1993). The extracted DNA was accepted only if it met the following quality thresholds: DNA concentration >300 ng/μl, 260/280 = 1.8–2.0, and 260/230 > 1.7.
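The quality thresholds above can be expressed as a simple check. The following Python sketch is only an illustration of how the three criteria combine; the DnaSample class, its field names, and the example readings are hypothetical and do not come from the measurements made in this study.

```python
from dataclasses import dataclass


@dataclass
class DnaSample:
    """Spectrophotometer readings for one DNA extract (hypothetical structure)."""
    concentration_ng_per_ul: float
    a260_a280: float
    a260_a230: float


def passes_qc(sample: DnaSample) -> bool:
    """Apply the three thresholds stated in the text; all must hold."""
    return (
        sample.concentration_ng_per_ul > 300
        and 1.8 <= sample.a260_a280 <= 2.0
        and sample.a260_a230 > 1.7
    )


# Made-up readings, for illustration only.
good = DnaSample(concentration_ng_per_ul=420.0, a260_a280=1.85, a260_a230=2.1)
low_purity = DnaSample(concentration_ng_per_ul=420.0, a260_a280=1.6, a260_a230=2.1)

print(passes_qc(good))        # True
print(passes_qc(low_purity))  # False
```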
3.2.2 Sequencing, Plastome Assembly, and Genome Annotation
Sequencing was conducted on an Illumina MiSeq Sequencing System (Illumina, San Diego, CA) at Yourgene Bioscience (New Taipei City, Taiwan), yielding approximately 4 Gb of 300-bp paired-end reads. De novo assembly of the Sciadopitys plastome was performed using CLC Genomics Workbench 4.9 (CLC Bio, Aarhus, Denmark).
Plastid genes were predicted using DOGMA (Wyman et al., 2004) and tRNAscan-SE 1.21 (Schattner et al., 2005) with the default criterion that genuine tRNA genes should have Cove scores ≥ 20. Boundaries of predicted genes were manually adjusted by aligning them with their orthologs from other gymnosperms. Sequences were aligned using MUSCLE (Edgar, 2004) implemented in MEGA 5.0 (Tamura et al., 2011).
3.2.3 Estimates of Dispersed Repeats and Plastomic Inversions
Repeat sequences were identified by comparing the plastome against itself using NCBI Blastn with the default settings, followed by manual removal of overlapping or conjoined pairs. To assess the possible scenarios of plastomic inversions in Sciadopitys, the plastome of Cycas taitungensis (NC_009618) with its IRA removed was used for comparison. We identified the syntenic blocks of genes between Sciadopitys and Cycas using Mauve 2.3.1 (Darling et al., 2004). The resulting matrix of syntenic blocks was used to estimate the minimal number of inversion steps with MGR 2.0.1 (Bourque and Pevzner, 2002). The plastome map of Sciadopitys was drawn using Circos 0.67 (Krzywinski et al., 2009).
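To give a feel for what estimating inversion steps from syntenic blocks involves, the Python sketch below computes a simple breakpoint-based lower bound on the number of inversions separating a signed block order from the reference order 1..n. This is not the algorithm implemented in MGR; it rests only on the standard observation that a single inversion can remove at most two breakpoints. The function names and the toy block order are hypothetical, and the plastome is treated as linearized.

```python
import math


def count_breakpoints(signed_order):
    """Count breakpoints of a signed block order relative to 1..n.

    The order is framed by 0 and n+1; consecutive elements a, b form an
    adjacency (no breakpoint) exactly when b - a == 1, which also covers
    inverted runs such as -3, -2.
    """
    n = len(signed_order)
    extended = [0] + list(signed_order) + [n + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)


def min_inversions_lower_bound(signed_order):
    """Each inversion removes at most two breakpoints, so d >= ceil(b / 2)."""
    return math.ceil(count_breakpoints(signed_order) / 2)


# Toy example: blocks 2 and 3 inverted together relative to the reference.
toy_order = [1, -3, -2, 4, 5]
print(count_breakpoints(toy_order))           # 2
print(min_inversions_lower_bound(toy_order))  # 1
```

The bound is tight for this toy order, but in general the true minimal number of inversions can exceed it, which is why dedicated tools are used for real data.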
3.2.4 Detection of Isomeric Plastomes
Primer pairs listed in Table 5 were used to amplify DNA fragments specific to the two isomeric plastomes in Sciadopitys (i.e., rpl33 + rpoC2 and rpoC1 + rps18 for the presence of the A form; rpl33 + rps18 and rpoC1 + rpoC2 for the B form). PCR reactions were conducted with three different numbers of cycles. The conditions were 94℃ for 5 min, followed by 25, 30, or 35 cycles of 94℃ for 20 s, 55℃ for 20 s, and 72℃ for 2 min, and an extension of 72℃ for 10 min.
3.2.5 Detection of RNA Transcripts in Chimeric Gene Clusters
Total RNA was extracted from fresh leaves of Sciadopitys according to a modified RNA isolation protocol (Kolosova et al., 2004). We employed a RevertAid H Minus First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Waltham) to synthesize the first strand cDNA with four specific primers (SpsbNSrpoC1-3, SatpASrpl33-2, SrpoC2SpsbB-1, and Srps18 in Table 5). PCR reactions were conducted with the synthesized cDNA and four pairs of specific primers (atpF-1 + psbT for a 358-bp fragment; psbB-2 + atpA-2 for a 565-bp fragment; rpl33 + rpoC2 for a 687-bp fragment;
rpoC1 + rps18 for a 939-bp fragment). The PCR conditions were 94℃ for 5 min, followed by 30 cycles at 94℃ for 20 s, 60℃ for 20 s, and 72℃ for 1 min, and an extension at 72℃ for 10 min.
3.3 Results and Discussion
3.3.1 Loss of IRA from S. verticillata Plastome
The plastome of Sciadopitys verticillata (AP017299) is illustrated as a circular molecule with a size of 138,309 bp (Figure 14). It is the largest among the known plastomes of Cupressales (including Sciadopityaceae, Taxaceae s. l., and Cupressaceae s. l.). Flanking and adjacent genes of the typical IRs are informative markers for inferring the intact (or retained) IR copy in conifer plastomes. For example, the boundary of IRA or IRB is adjacent to the psbA or S10 operon (i.e., trnI-rpoA region), respectively (Wu, Wang et al., 2011). In the Sciadopitys plastome, the retained typical IR copy, which encompasses the region from trnN-GUU to ycf2, is adjacent to the S10 operon (Figure 14), indicating that it should be IRB. In other words, the lost IR copy is IRA. This observation reinforces the hypothesis that cupressophytes have lost IRA rather than IRB (Wu, Wang et al., 2011; Wu and Chaw, 2014).
The plastome of Sciadopitys contains a total of 121 genes, 83 of which are protein-coding genes and the rest are structural RNA genes (Table 6). Sixteen genes contain introns, but the intron of rpoC1 has been lost. Remarkably, each of the three genes, rrn5, trnI-CAU, and trnQ-UUG, has two copies. The duplicated rrn5 is located in the region between psbN and psbT (Figure 14). Duplicated rrn5 has previously been reported in the plastomes of Agathis dammara and Wollemia nobilis (Araucariaceae), where it is located between psbB and clpP (Yap et al., 2015). Among the elucidated cupressophyte plastomes available in GenBank, only these three taxa contain two copies of plastid rrn5. However, since the locations of the extra rrn5 differ between these two cupressophyte families, it is most parsimonious that the duplications of rrn5 occurred independently. Our data also show that accD was lost from the plastome of Sciadopitys. Li et al. (2016) previously reported that the loss of accD occurred after the split of Sciadopityaceae from other cupressophytes and that accD has been functionally transferred from the plastid to the nucleus.
3.3.2 Pseudogenization of Four tRNA Genes after Tandem Duplications
Notably, four pseudo-tRNA genes (ΨtrnV-GAC, ΨtrnQ-UUG, and two copies of ΨtrnP-GGG) were detected in the Sciadopitys plastome (Figure 14). Both ΨtrnV-GAC and ΨtrnQ-UUG are close to their functional paralogs, implying that pseudogenization of these two genes might have occurred after tandem duplications. The plastid trnP-GGG of angiosperms likely has been lost for 150 MY (Chaw et al., 2004). In contrast, this tRNA gene is retained and commonly located in the region between trnL and rpl32 in Cycas, Ginkgo, Gnetum, Pinus (Wu et al., 2007), and other cupressophyte families, such as Araucariaceae (Yap et al., 2015), Podocarpaceae (Vieira Ldo et al., 2014; Wu and Chaw, 2014), and Taxaceae s. l. (Yi et al., 2013; Hsu et al., 2014). In the Sciadopitys plastome, two ΨtrnP-GGG copies are separated by a distance of approximately 20 Kb; one is adjacent to trnL-UAG and the other is located near rpl32 (Figure 14). Therefore, the two ΨtrnP-GGG copies might have resulted from a tandem duplication followed by a subsequent 20-Kb plastomic inversion.
3.3.3 Evolution of Plastid trnI-CAU Genes in S. verticillata
Sciadopitys has two copies of plastid trnI-CAU: one is located between trnC-GCA and psbA, while the other is between ycf2 and rpl23 (Figure 14 and Figure 15). In Cryptomeria, one of the two plastid trnI-CAU copies was considered residual from the lost typical IR (Hirao et al., 2008). Indeed, the majority of cupressophyte plastomes have two trnI-CAU copies with sequence identity higher than 85% (Table 7), suggesting their homologous origin.
In the Sciadopitys plastome, both trnI-CAU copies are capable of folding into cloverleaf structures, but they differ in prediction scores. The copy located between ycf2 and rpl23 has a score of 78.1 bits (Figure 15A), much higher than that of the other copy (score = 48.5 bits; Figure 15B). Ten nucleotide substitutions (highlighted in gray in Figure 15) were detected between the two trnI-CAU copies, including three mismatches and four U•G abnormal pairings in the stems of the low-scoring trnI-CAU (Figure 15B). Although trnI-CAU is essential for plastid biology (Alkatib et al., 2012), the significance of the presence of two copies of trnI-CAU in the Sciadopitys plastome remains to be investigated. Interestingly, some elucidated cupressophyte plastomes, such as those of Cephalotaxus, Nageia, and Podocarpus, contain only one copy of trnI-CAU (Table 7). Therefore, whether the low-scoring trnI-CAU of Sciadopitys is functionally redundant and subjected to relaxed structural constraint is worthy of further investigation.
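The distinction drawn above between mismatches and U•G pairings in tRNA stems can be made concrete with a small Python sketch. It classifies each position of a stem as a Watson-Crick pair, a G•U pair, or a mismatch. The helper names and the example stem sequences are hypothetical; they are not the actual stems of the Sciadopitys trnI-CAU copies, and this is not the scoring model used by tRNAscan-SE.

```python
from collections import Counter

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}


def classify_pair(five_prime: str, three_prime: str) -> str:
    """Classify one stem position by its two opposing bases."""
    pair = (
        five_prime.upper().replace("T", "U"),
        three_prime.upper().replace("T", "U"),
    )
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in GU_PAIRS:
        return "G-U pair"
    return "mismatch"


def summarize_stem(strand_5p: str, strand_3p_reversed: str) -> Counter:
    """Tally pair classes along a stem.

    strand_3p_reversed is the opposing strand written so that index i
    faces index i of strand_5p.
    """
    if len(strand_5p) != len(strand_3p_reversed):
        raise ValueError("both stem strands must have the same length")
    return Counter(
        classify_pair(a, b) for a, b in zip(strand_5p, strand_3p_reversed)
    )


# Hypothetical 5-bp stem: three canonical pairs, one G-U pair, one mismatch.
summary = summarize_stem("GCAUC", "CGUGA")
print(summary["Watson-Crick"], summary["G-U pair"], summary["mismatch"])  # 3 1 1
```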
3.3.4 Presence of Two Isomeric Plastomes in S. verticillata
Recent studies of conifer plastomes revealed that dispersed short IRs can trigger plastomic rearrangements to generate isomeric forms. In Pinaceae, a shift between different plastomic forms is often associated with homologous recombination (HR) mediated by the short IRs of approximately 949 bp (Tsumura et al., 2000; Wu, Lin et al., 2011). The short IRs that contain trnQ-UUG (trnQ-IR) can also promote the formation of isomeric plastomes in Cephalotaxus (Yi et al., 2013) and Juniperus (Guo et al., 2014).
Thirty-seven pairs of dispersed repeats were detected in the Sciadopitys plastome.
Among them, the longest IR pair is 370 bp and contains the sequences of 3’rpoC1 and
5’rpoC2 (Figure 14). In the Sciadopitys plastome, only this 370-bp IR pair is longer
than the 250-bp “trnQ-IR” of Juniperus (Guo et al., 2014). Hence, if the 370-bp IR were able to mediate HR in Sciadopitys, we would expect the presence of two plastomic forms, as depicted in Figure 16. We designate the plastomic form illustrated in Figure 14 as the A form and the other as the B form. We verified the presence of both the A and B forms by amplifying four specific DNA fragments across the 370-bp IR with 35 PCR cycles (Figure 16). However, the amounts of the PCR amplicons differ between the two forms. With 25 PCR cycles, only the two specific amplicons of the A form are evident, whereas those of the B form are undetectable (Figure 16). These results suggest that the A form is the predominant form in Sciadopitys plastome populations, in agreement with our assembly results.
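The relationship between the two forms and their diagnostic primer pairs can be sketched in Python. The sketch assumes a toy linear layout in which the genes flanking the two IR copies lie in opposite orientations and a single placeholder block ("intervening") stands for everything between them; the IR copies themselves, which sit at the two junctions, are not modeled. The layout, the block name, and the helper functions are illustrative assumptions rather than the actual gene map in Figure 14.

```python
def invert(blocks, start, end):
    """Invert blocks[start:end]: reverse their order and flip each strand."""
    segment = [(name, -strand) for name, strand in reversed(blocks[start:end])]
    return blocks[:start] + segment + blocks[end:]


def junctions(blocks):
    """Return the unordered pairs of neighboring block names."""
    return {frozenset((a[0], b[0])) for a, b in zip(blocks, blocks[1:])}


# Toy A form: (name, strand). Orientations are assumed for illustration.
form_a = [
    ("rpl33", 1),
    ("rpoC2", 1),
    ("intervening", 1),
    ("rps18", -1),
    ("rpoC1", -1),
]

# Recombination between inverted repeats flips the segment between them.
form_b = invert(form_a, 1, 4)

a_specific = junctions(form_a) - junctions(form_b)
b_specific = junctions(form_b) - junctions(form_a)

print(sorted(sorted(pair) for pair in a_specific))
print(sorted(sorted(pair) for pair in b_specific))
```

Under these assumptions, the A-specific junctions are rpl33 + rpoC2 and rpoC1 + rps18, and the B-specific junctions are rpl33 + rps18 and rpoC1 + rpoC2, matching the primer combinations listed in Section 3.2.4.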
The plastomes of Cupressaceae and Taxaceae possess two copies of trnQ-IRs (Guo et al., 2014). Nonetheless, this IR is absent from the plastome of Sciadopitys. The plastomes of Araucariaceae (Yap et al., 2015) have an IR pair that is approximately 600 bp long and contains the gene rrn5. IRs of such length could potentially trigger HR. However, the presence of associated isomeric plastomes has not been experimentally demonstrated in Araucariaceae. Together with the unique 370-bp IR of Sciadopitys, these findings indicate that in cupressophytes, isomeric plastomes are widespread and associated with diverse short IRs.
3.3.5 Birth of Four Chimeric Gene Clusters
We identified a total of 16 syntenic blocks between the Cycas and Sciadopitys plastomes (Figure 14; Figure 17). In addition to the loss of IRA from the Sciadopitys plastome mentioned above, eight plastomic inversions distinguish Sciadopitys from Cycas (Figure 17). Since Cycas was proposed to retain the ancestral gene order of seed plant plastomes (Jansen and Ruhlman, 2012), these eight inversions should have occurred after cupressophytes split from cycads.
In Sciadopitys, plastomic inversions have also disrupted four typical operons that are generally conserved among seed plants. These disrupted operons are rps2–atpI–atpH–atpF–atpA (hereafter, rps2 operon), psbB–psbT–psbH–petB–petD (psbB operon), rpoB–rpoC1–rpoC2 (rpoB operon), and petL–petG–psaJ–rpl33–rps18 (petL operon) (Figure 18A & B). Recombination between the rps2 and psbB operons is associated with inversion 8 (Figure 17), creating the rps2–petD and psbB–atpA gene clusters (Figure 18A). On the other hand, inversion 4 (Figure 17) recombined the rpoB and petL operons, generating the petL–rpoC2 and rpoB–rps18 gene clusters (Figure 18B). Most genes in each of the four chimeric gene clusters have the same transcriptional direction (Figure 18A & B). Therefore, we postulated that genes in these chimeric gene clusters might be co-transcribed. To verify this proposition, we performed RT-PCR assays with specific primers designed from genes near the junctions between different operon-derived segments.
As shown in Figure 18C, our RT-PCR results indicate that (1) there was no DNA contamination in the assayed RNA because all of the negative controls failed to yield any signal; and (2) amplicons of the expected sizes were clearly detected in all experimental sets. These data suggest that shuffling between different operons could lead to the birth of new co-transcription units in plastids.
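The way two operons recombine into two chimeric clusters can be sketched as a tail swap between gene lists. The Python example below is a simplified illustration that ignores strand orientation and the inversions themselves. The breakpoint positions are assumptions inferred from the cluster names and RT-PCR primer pairs given in this chapter (for example, atpF-1 + psbT spanning one junction), and the function name swap_tails is hypothetical.

```python
def swap_tails(operon_a, break_a, operon_b, break_b):
    """Join the 5' part of each operon to the 3' part of the other.

    break_a and break_b are the numbers of genes kept from the 5' end
    of operon_a and operon_b, respectively.
    """
    cluster_1 = operon_a[:break_a] + operon_b[break_b:]
    cluster_2 = operon_b[:break_b] + operon_a[break_a:]
    return cluster_1, cluster_2


rps2_operon = ["rps2", "atpI", "atpH", "atpF", "atpA"]
psbB_operon = ["psbB", "psbT", "psbH", "petB", "petD"]
rpoB_operon = ["rpoB", "rpoC1", "rpoC2"]
petL_operon = ["petL", "petG", "psaJ", "rpl33", "rps18"]

# Assumed breakpoints: after atpF and after psbB (associated with inversion 8).
rps2_petD, psbB_atpA = swap_tails(rps2_operon, 4, psbB_operon, 1)

# Assumed breakpoints: after rpl33 and after rpoC1 (associated with inversion 4).
petL_rpoC2, rpoB_rps18 = swap_tails(petL_operon, 4, rpoB_operon, 2)

for cluster in (rps2_petD, psbB_atpA, petL_rpoC2, rpoB_rps18):
    print("–".join(cluster))
```

Each resulting cluster begins with the 5' segment of one operon, so it keeps that operon's upstream region while gaining genes from a different operon.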
Disruptions of conserved plastid operons have been reported in only a few taxa, such as Vigna (Perry et al., 2002), Trifolium (Cai et al., 2008), Trachelium (Haberle et al., 2008), some genera of Geraniaceae (Guisinger et al., 2011), Taxus (Hsu et al., 2014), and Sciadopitys (this study). Of note, these taxa also have highly rearranged plastomes. Except for Vigna and Sciadopitys, none of the above taxa has experienced recombination between operons. Instead, their disrupted operons are separated rather than combined. In the Vigna plastome, recombination between two homologous operons, S10A and S10B, has led to the re-organization of genes in the operons (Perry et al., 2002). Nonetheless, novel chimeric gene clusters created by shuffling between heterologous operons (Figure 18) are documented for the first time in the present study.
3.3.6 Evolutionary Effects of Novel Chimeric Gene Clusters
The chimeric gene clusters of Sciadopitys provide two novel insights into the evolution of plastomes. First, other than the gene cluster rpoB–rpoC1–rps18, the remaining three chimeric gene clusters do not alter their upstream regions, as the neighboring genes of their 5’ regions are the same as those of Cycas (Figure 18A & B). This finding suggests that the promoter sequences of these gene clusters have not been altered since the associated inversions took place. Figure 18D shows that the upstream sequence of rpoB harbors a YRTA motif of the nuclear-encoded RNA polymerase (NEP) promoters (Shiina et al., 2005). Furthermore, genes of different origins can be co-transcribed in the chimeric gene clusters (Figure 18C). Therefore, we cannot rule out the possibility that the pre-existing promoters are adopted for transcription of the genes in these chimeric gene clusters.
Second, shuffling between the rpoB and petL operons (Figure 18B) has relocated rpoC2 to join the segment of the 5’petL operon, whose transcription is associated with the plastid-encoded RNA polymerase (PEP) promoter (Finster et al., 2013). rpoC2 encodes one of the core subunits of PEP (Hu and Bogorad, 1990). If the chimeric gene cluster petL–petG–psaJ–rpl33–rpoC2 were exclusively transcribed by PEP, we would not expect any transcript of this gene cluster in Sciadopitys, because PEP itself requires the rpoC2 product. Nonetheless, its associated transcript was observed in Figure 18C. Three possibilities might account for the presence of this transcript. First, the isomeric plastome of the B form (Figure 16), which contains an intact rpoB operon, provides RPOC2 proteins. Second, an alternative promoter has evolved to perform transcription because many plastid genes are transcribed from both NEP and PEP promoters (Börner et al., 2015). Third, this transcript may correspond to a read-through transcript, which allows the expression of sequences downstream of 3’ untranslated regions (Quesada-Vargas et al., 2005).
Notably, functionally unrelated genes have been joined together in these chimeric gene clusters. For example, the four photosynthetic genes (psbT, psbH, petB, and petD) of the psbB operon have been relocated and joined with the segment of the 5’rps2 operon, whose promoter is of the NEP type (Kapoor and Sugiura, 1999). This suggests that the four photosynthetic genes are not transcribed exclusively by PEP, in disagreement with a division of labor in which PEP transcribes genes associated with photosynthesis and NEP transcribes housekeeping genes (Hajdukiewicz et al., 1997). In contrast, this finding agrees with the view of Liere and Börner (2007) that most plastid genes can be transcribed by either NEP or PEP.
CHAPTER 4 Conclusions
Plastomes of cupressophytes are highly variable in their size, genome organization, and gene content. These features provide a unique opportunity to study their evolution.
In this study, we sequenced three complete plastomes, those of A. formosana, T. mairei, and S. verticillata. By comparing the three newly sequenced plastomes with published cupressophyte plastomes, we obtained new insights into the plastomic evolution of cupressophytes.
In Chapter 2, we showed that plastomic rearrangement events provide useful information for amplifying nupts. Because it is difficult to avoid the amplification of isomeric plastomic or mitochondrial DNA, examining the origins of PCR amplicons was a prerequisite in this PCR-based study. In angiosperms such as Nicotiana, nupts were experimentally demonstrated to be eliminated quickly from the nuclear genome (Sheppard and Timmis, 2009). However, we show that the oldest conifer nupt has been retained for at least 70.8 MY (i.e., since the late Cretaceous period). With an increase of available plastomes in conifers, comparative genomic analyses are expected to reveal more plastomic rearrangements. Using our approach, we are beginning to understand the evolution of nupts in diverse conifer species without the need to sequence and assemble their huge nuclear genomes.
In Chapter 3, we showed that plastomic rearrangement events in Sciadopitys provide a unique opportunity to understand the evolutionary impact of plastomic rearrangements. The plastome of Sciadopitys is characterized by several unusual features, such as the loss of the typical IRA copy, the duplication and pseudogenization of four tRNAs, extensive genomic inversions, the presence of isomeric plastomes, and chimeric gene clusters derived from shuffling of remote operons. All these characteristics highlight the fact that the evolution of plastomes may be more complex than previously thought. The highly rearranged plastome of Sciadopitys advances our understanding of the dynamics, complexity, and evolution of plastomes in conifers.
CHAPTER 5 Future Prospects
In the first project, we proposed a PCR-based strategy to identify nupts. Although DNA transfer from plastomes to the nuclear genome is highly frequent, it is very rare to observe a functional organelle gene in the nuclear genome (Lloyd and Timmis, 2011).
Most nupts are quickly deleted, decay, or are otherwise scrapped during plant evolution (Lloyd and Timmis, 2011). A nupt must acquire additional genetic elements to become functional and be retained in its new environment. Functional nupts should have at least three features. First, their DNA sequences should contain an intact open reading frame that can be transcribed correctly. Second, they should acquire a specific nuclear promoter that can regulate nupt transcription during plant development. Third, they should obtain a plastid-targeting transit peptide so that their protein products are imported back into plastids. It would be of interest to study whether the newly identified nupts in gymnosperms have developed any novel functions.
Our study presented new insights into plastome rearrangements and intracellular gene transfer in non-model systems. With the completion of the Sciadopitys plastome, the plastome of at least one representative species of each gymnosperm family has been published. These plastomes provide an opportunity to systematically examine plastid DNA evolution and to model plastomic orientation changes in gymnosperms. With the advent of new sequencing and bioinformatic technologies, such large-scale systematic plastomic studies will be possible in the near future, enabling a new era of comparative genomics of organellar evolution.
FIGURES
Figure 1
The phylogenetic tree of endosymbiotic evolution. (Image adapted from Timmis et al.,