
3.1 HGT scan

To find HGT genes in eukaryotes, I scanned the whole-genome protein sequences of Aspergillus fumigatus with BLASTp. Each of the 9630 Aspergillus fumigatus protein sequences was used as a query against the NCBI nr database, and the similar sequences found were recorded in the BLAST output file, ranked by e-value. First, I ran a PHP pipeline that filtered the output automatically for HGT candidate genes: the e-value had to be less than 10^-5, and the best hit had to come from a distant species outside the genus Aspergillus. This yielded 57 candidate genes in Aspergillus fumigatus, which are listed in Table 1. Second, I constructed a phylogenetic tree for each candidate gene and analyzed the incongruence of the taxon distribution. Seven phylogenies showed patterns that resembled horizontal transfer events. Because not all genomes have been sequenced and annotated, missing gene data can produce false horizontal transfer patterns in a phylogeny. To exclude the influence of incomplete sequence data, I double-checked the candidates with BLASTn, using the HGT nucleotide sequences as queries against the sequenced genomes and all nucleotide sequences to check whether any other sequence was a closer hit than the HGT candidate genes. After the BLASTn check, one HGT gene remained, AFUA_5G10930, transferred from fungi to Stigmatella aurantiaca (Figure 1). In the phylogenetic tree of AFUA_5G10930, the Stigmatella aurantiaca ortholog, STAUR_2131, groups with the orthologs of Aspergillus clavatus, Aspergillus fumigatus, and Neosartorya fischeri in the same clade (Node A), which lies within a fungal clade. From this phylogeny, we could not infer precisely which species the HGT gene was transferred from. It is possible that the donor gene has not been sequenced yet.
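The automatic filtration step can be illustrated with a minimal Python sketch. It is not the PHP pipeline used in this work. It assumes the BLASTp hits have already been reduced to hypothetical (query, subject genus, e-value) tuples with self hits removed, and it takes the literal reading of the filter: a query is kept when its best hit below the e-value cutoff lies outside the genus Aspergillus. The function name and the tuple layout are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit layout: (query_id, subject_genus, e_value).
Hit = Tuple[str, str, float]


def hgt_candidates(hits: Iterable[Hit],
                   own_genus: str = "Aspergillus",
                   max_evalue: float = 1e-5) -> List[str]:
    """Return queries whose best hit (lowest e-value) is outside own_genus."""
    best: Dict[str, Hit] = {}
    for query, genus, evalue in hits:
        if evalue >= max_evalue:
            continue  # hit does not pass the e-value cutoff
        if query not in best or evalue < best[query][2]:
            best[query] = (query, genus, evalue)
    # Keep only queries whose best surviving hit is from another genus.
    return sorted(q for q, hit in best.items() if hit[1] != own_genus)


example_hits = [
    ("q1", "Aspergillus", 1e-50),
    ("q1", "Stigmatella", 1e-20),
    ("q2", "Stigmatella", 1e-30),
    ("q2", "Aspergillus", 1e-10),
    ("q3", "Bacillus", 1e-3),
]
print(hgt_candidates(example_hits))  # ['q2']
```

Candidates selected this way would still need the phylogenetic and BLASTn checks described above, since a best hit outside the genus alone does not establish a transfer.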

3.2 GC content

The Stigmatella aurantiaca genome sequence contains 67% G+C nucleotides. I compared the genes adjacent to the transferred gene, STAUR_2131, in the Stigmatella aurantiaca genome. The genes from STAUR_2124 to STAUR_2141 and the complete Stigmatella aurantiaca genome all have a GC content above 62%, except for the HGT gene STAUR_2131. The GC content of the transferred gene is lower than that of the other Stigmatella aurantiaca genes but close to that of the orthologous gene ACLA_013950.

Moreover, I compared the transferred gene STAUR_2131 and the foreign genes STAUR_2130, STAUR_2132, and STAUR_2133 with the closest fungal genomes, those of Aspergillus clavatus, Aspergillus fumigatus, and Neosartorya fischeri (Figure 3). In contrast to the Stigmatella aurantiaca genes and genome, the fungi have GC contents below 62%. In Figure 3, the GC content values of the HGT genes lie between the values of Stigmatella aurantiaca and the fungi. Following the phylogeny of the HGT orthologs, I compared the GC content at codon positions 1 + 2 and at position 3 (the wobble site) for the orthologs of the HGT gene, the neighbor genes, and the ribosomal protein genes. The GC frequencies at positions 1 + 2 show no obvious differences among the species in the phylogeny, but within the HGT gene clade (Node A) the GC frequencies at position 3 are much higher than those at positions 1 + 2. In summary, the GC content at positions 1 + 2 of STAUR_2131 is lower than that of the neighbor genes, but the GC content at position 3 is very high. Mutations at wobble sites, which are often synonymous, are more common than mutations at positions 1 + 2.
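The split into positions 1 + 2 and position 3 can be sketched as follows. This is a minimal illustration, assuming an in-frame coding sequence given as a plain string; the function name is hypothetical, and ambiguous bases such as N are simply skipped.

```python
from typing import Tuple


def gc_by_codon_position(cds: str) -> Tuple[float, float]:
    """Return (GC fraction at codon positions 1 + 2, GC fraction at position 3)."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    gc12 = n12 = gc3 = n3 = 0
    for i in range(usable):
        base = seq[i]
        if base not in "ACGT":
            continue  # skip ambiguous bases
        is_gc = base in "GC"
        if i % 3 == 2:  # third (wobble) position of the codon
            n3 += 1
            gc3 += is_gc
        else:  # first or second position
            n12 += 1
            gc12 += is_gc
    return (gc12 / n12 if n12 else 0.0, gc3 / n3 if n3 else 0.0)


# Codons ATG and GCC: positions 1 + 2 are A, T, G, C; wobble sites are G, C.
print(gc_by_codon_position("ATGGCC"))  # (0.5, 1.0)
```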

When a gene is transferred into another genome, its codon usage and GC content gradually converge on those of the recipient genome. The lower GC content at positions 1 + 2 makes the overall GC content of STAUR_2131 lower than that of its neighbor genes in Stigmatella aurantiaca, so the GC content of STAUR_2131 lies between that of the fungi and that of Stigmatella aurantiaca. Figure 4 shows that the GC content levels at positions 1 + 2 and at the wobble sites in Stigmatella aurantiaca are similar to the levels of the species in the Aspergillus clavatus clade. The synonymous distances between the HGT gene of Stigmatella aurantiaca and the orthologs of the Aspergillus clavatus clade are smaller than those to the orthologs of the Penicillium marneffei clade (Figure 5). The orthologs that share the same pattern of GC content levels have smaller synonymous distances to the transferred gene of Stigmatella aurantiaca, whereas the orthologs of Penicillium marneffei and Talaromyces stipitatus, which have a lower GC content at the wobble sites, have larger distances to the transferred gene. I then scanned an extended sequence fragment around the transferred gene with a 150-nucleotide window (Figure 6). The GC content of the wobble sites in the protein-coding regions is higher than the GC content along the genome sequence.
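A window scan of this kind can be sketched in a few lines. The 150-nucleotide window comes from the text; the step size is not stated there, so the `step` parameter and its default of 1 are assumptions, as is the function name.

```python
from typing import List, Tuple


def sliding_gc(seq: str, window: int = 150, step: int = 1) -> List[Tuple[int, float]]:
    """Return (window start, GC fraction) for each full window along seq."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(base in "GC" for base in chunk)
        result.append((start, gc / window))
    return result


# Toy example with a short window so the values are easy to follow.
print(sliding_gc("GGCCAATT", window=4, step=2))
# [(0, 1.0), (2, 0.5), (4, 0.0)]
```

Sequences shorter than the window produce an empty list, so the caller does not need a separate length check.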

In addition, the GC content of the intergenic regions is lower than the average GC content of the protein-coding regions. To examine whether positive selection has raised the GC content of the protein-coding regions above that of the intergenic regions to fit the Stigmatella aurantiaca genome, I scanned the GC content of the intergenic regions by dividing the sequences, in order, into three groups: site A, site B, and site C. The GC contents of the three site groups are similar to the total GC content of the intergenic region between STAUR_2132 and STAUR_2133. This coherence among the three sites suggests that the protein-coding regions are under positive selection, so that the wobble sites in the protein-coding regions tend to mutate to G or C to fit the translational characteristics of Stigmatella aurantiaca.
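The three-group comparison can be illustrated with a short sketch. The text does not spell out how the groups were formed; the sketch assumes that successive nucleotides of the intergenic sequence are assigned cyclically to site A, site B, and site C, mimicking codon positions. That assignment and the function name are assumptions.

```python
from typing import Tuple


def gc_by_site_group(seq: str) -> Tuple[float, float, float]:
    """Return GC fractions for sites A, B, C (nucleotides 1, 2, 3 of each triplet)."""
    seq = seq.upper()
    fractions = []
    for offset in range(3):
        group = seq[offset::3]  # every third nucleotide, starting at offset
        gc = sum(base in "GC" for base in group)
        fractions.append(gc / len(group) if group else 0.0)
    return (fractions[0], fractions[1], fractions[2])


# Site A collects G, G, G; site B collects A, A, A; site C collects T, T, T.
print(gc_by_site_group("GATGATGAT"))  # (1.0, 0.0, 0.0)
```

Under this reading, similar values across the three groups mean that an intergenic sequence shows no position-specific GC enrichment of the kind seen at wobble sites in coding regions.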

3.3 Codon usage bias

To infer the origin of the HGT gene, I compared the codon usage of the HGT gene with that of its orthologs and of its neighbor genes. In the first row of Figure 8, the leucine codon usage bias of the HGT gene differs from that of the neighbor genes, the ribosomal protein genes, and the genome of Stigmatella aurantiaca. The Pro and Phe codon usage of STAUR_2131 differs from the codon usage of the Stigmatella aurantiaca genome and from that of the fungal orthologs and genomes (Figures 8 and 9). The Leu, Val, Ile, and Ser codon usage bias of the HGT gene is the same as that of the orthologous gene of Aspergillus clavatus and similar to that of Aspergillus fumigatus, but it is not consistent with the usage of either fungal genome (Figures 10-13). The differences in codon usage bias from the Aspergillus clavatus genome indicate that the transferred gene did not originally belong to Aspergillus clavatus.
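A per-amino-acid codon usage comparison can be sketched as below. This is an illustrative sketch, not the procedure used for Figures 8-13: it assumes an in-frame coding sequence, uses the standard genetic code (the sense-codon assignments are the same in the bacterial code), and reports each synonymous codon as a fraction of all codons for the chosen amino acid. The function and variable names are hypothetical.

```python
from collections import Counter
from typing import Dict

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def codon_usage(cds: str, amino_acid: str) -> Dict[str, float]:
    """Return the fraction of each synonymous codon used for one amino acid."""
    seq = cds.upper()
    counts: Counter = Counter()
    for i in range(0, len(seq) - 2, 3):  # step through complete codons
        codon = seq[i:i + 3]
        if CODON_TABLE.get(codon) == amino_acid:
            counts[codon] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


# Four leucine codons: CTG twice, CTC once, TTA once.
print(codon_usage("CTGCTGCTCTTA", "L"))
# {'CTG': 0.5, 'CTC': 0.25, 'TTA': 0.25}
```

Running the same function on the HGT gene, its orthologs, and concatenated genome coding sequences would give the per-codon fractions to compare side by side.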