

3. Related works

3.3 Tools and web servers for analyzing NGS small RNA sequencing data

miRDeep

miRDeep [8] is the first stand-alone package designed to identify novel miRNAs from data generated by next-generation sequencing technology. miRDeep first aligns the sequencing reads to the reference genome.

Reads that map to multiple genomic loci or to rRNA, tRNA, scRNA, snRNA or snoRNA are then removed. The remaining reads are used to find potential miRNA precursors, and a probabilistic scoring system is applied to each potential precursor. miRDeep can be downloaded at http://www.mdc-berlin.de/rajewsky/miRDeep and runs on a standard Linux machine.
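The read-filtering step described above can be illustrated with a minimal sketch. The `AlignedRead` record, its field names and the `filter_reads` function are hypothetical and are not part of miRDeep; the sketch only assumes that each read carries its list of genomic loci and an optional annotation class.

```python
from collections import namedtuple

# Hypothetical record for one aligned read; the field names are illustrative
# and do not reflect miRDeep's own data model.
AlignedRead = namedtuple("AlignedRead", ["read_id", "loci", "annotation"])

# RNA classes named in the text whose reads are discarded.
EXCLUDED_CLASSES = {"rRNA", "tRNA", "scRNA", "snRNA", "snoRNA"}


def filter_reads(reads):
    """Keep reads with exactly one genomic locus and no excluded annotation."""
    kept = []
    for read in reads:
        if len(read.loci) != 1:
            continue  # unmapped, or mapped to multiple genomic loci
        if read.annotation in EXCLUDED_CLASSES:
            continue  # maps to a known non-miRNA small RNA class
        kept.append(read)
    return kept


reads = [
    AlignedRead("r1", [("chr1", 1000, "+")], None),
    AlignedRead("r2", [("chr1", 5000, "+"), ("chr7", 42, "-")], None),
    AlignedRead("r3", [("chr2", 300, "-")], "tRNA"),
]
print([read.read_id for read in filter_reads(reads)])
```

Only the reads that survive this filter would be passed on to precursor detection and scoring, which are not sketched here.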

deepBase

deepBase [149] is a database that collects 185 NGS small RNA sequencing datasets from seven organisms (Homo sapiens, Mus musculus, Gallus gallus, Ciona intestinalis, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana). The annotated small RNA types are nasRNA (ncRNA-associated small RNA), pasRNA (promoter-associated small RNA), easRNA (exon-associated small RNA), rasRNA (repeat-associated small RNA), miRNA and snoRNA.

deepBase also provides an interactive web interface (http://deepbase.sysu.edu.cn) that lets researchers quickly view the annotated small RNA sequencing data.


Geoseq

Geoseq [150] (http://geoseq.mssm.edu) collects deep-sequencing data from public repositories such as GEO (Gene Expression Omnibus) and SRA (Sequence Read Archive) at NCBI and preprocesses these data. Geoseq handles sequencing data differently from previous studies: it maps the reference sequence against the sequencing data instead of mapping the reads to reference genomes or sequences. Researchers can analyze their own sequencing data against the preprocessed data in Geoseq to identify differential isoform expression in mRNA-seq datasets and to identify known and novel miRNAs in miRNA datasets.

miRanalyzer

miRanalyzer [11] (http://web.bioinformatics.cicbiogune.es/microRNA/) is the first web server tool for analyzing next-generation sequencing data of small RNAs. Before uploading their sequencing data, researchers need to merge identical reads into unique reads and count their copy numbers (expression levels).

After running a dataset on the miRanalyzer web server, researchers obtain results such as the expression levels of all known miRNAs in miRBase, a list of predicted novel miRNAs and all sequencing reads that map to transcribed sequences (mRNA, ncRNA and rasiRNA). miRanalyzer also provides target gene lists for all detected miRNAs, using two miRNA target site prediction tools (miRanda and TargetScan).
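The read-merging step required before upload can be sketched in a few lines. The function name `collapse_reads` and the tab-separated "sequence, copy number" output are assumptions made for illustration; the exact input layout expected by miRanalyzer should be taken from its documentation.

```python
from collections import Counter


def collapse_reads(reads):
    """Merge identical read sequences and count the copies of each."""
    counts = Counter(reads)
    # Sort by descending copy number, then by sequence, for a stable listing.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "AACCCGTAGATCCGAACTTGTG",
    "TGAGGTAGTAGGTTGTATAGTT",
]
for sequence, copies in collapse_reads(reads):
    print(f"{sequence}\t{copies}")
```

Collapsing reads in this way shrinks the upload considerably, because highly expressed miRNAs contribute a single line regardless of how many times they were sequenced.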


SeqBuster

SeqBuster [12] (http://estivill_lav.crg.es/seqbuster) is a web-based toolkit for processing and analyzing high-throughput small RNA sequencing datasets. It also offers a stand-alone version to overcome the storage capacity limitations of the web-based tool. It provides raw data preprocessing, miRNA profiling, analysis of miRNA variability (isomiRs), discovery of differentially expressed miRNAs and miRNA target site prediction. For the discovery of differentially expressed miRNAs, the expression level of each sequence n is normalized as n = (freq n / sum [freq all seqs]) × scale-value. SeqBuster is the first tool to offer isomiR analysis. For each detected miRNA, it reports 5’ end and 3’ end trimming, 5’ end and 3’ end addition and nt-substitution.
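The normalization formula can be written out as a short function. The name `normalize_counts`, the dictionary layout and the default scale value of one million are assumptions for illustration only and are not taken from SeqBuster itself.

```python
def normalize_counts(freqs, scale_value=1_000_000):
    """Divide each sequence frequency by the total, then apply the scale value."""
    total = sum(freqs.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {seq: freq / total * scale_value for seq, freq in freqs.items()}


# Toy frequencies for three sequences in one sample (made-up numbers).
sample = {"seqA": 50, "seqB": 30, "seqC": 20}
normalized = normalize_counts(sample, scale_value=1000)
for seq, value in normalized.items():
    print(seq, round(value, 3))
```

Because every sample is divided by its own total, normalized values from libraries of different sequencing depth become comparable before differential expression is assessed.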

mirTools

mirTools [14] (http://centre.bioinformatics.zj.cn/mirtools/) is a web server that allows researchers to perform comprehensive analysis by uploading their high-throughput small RNA sequencing data. In mirTools, researchers can process their raw sequencing data, explore the length distribution of reads, classify reads into categories such as known miRNAs, snoRNA, rasiRNA and coding sequences, identify novel miRNAs, discover differentially expressed miRNAs between samples and predict the target genes of miRNAs using miRanda and RNAhybrid. mirTools also provides functional analysis of miRNA target genes by examining them in Gene Ontology terms and pathways.
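As an illustration of one of these analyses, the length distribution of reads can be computed as below. This is a generic sketch, not mirTools code; the function name `length_distribution` and the input of (sequence, copy number) pairs are assumptions.

```python
from collections import Counter


def length_distribution(unique_reads):
    """Count reads per length, weighting each unique sequence by its copies."""
    distribution = Counter()
    for sequence, copies in unique_reads:
        distribution[len(sequence)] += copies
    return dict(sorted(distribution.items()))


# Toy input: (sequence, copy number) pairs with made-up counts.
unique_reads = [
    ("TGAGGTAGTAGGTTGTATAGTT", 3),
    ("AACCCGTAGATCCGAACTTGT", 2),
    ("TAGCTTATCAGACTGATGTTGA", 1),
]
print(length_distribution(unique_reads))
```

In small RNA libraries a peak around 21 to 23 nt is commonly read as a sign that miRNAs dominate the sample.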


DSAP

DSAP [13] (http://dsap.cgu.edu.tw) is a web server designed for analyzing small RNA datasets produced by next-generation sequencing technology. Researchers need to prepare their datasets in a tab-delimited format that contains the unique reads and their expression levels. The system flow of DSAP differs from that of other web tools for analyzing small RNA sequencing data.

Other web tools first align reads to reference genomes and then compare the chromosomal locations of the reads with the locations of known miRNAs or ncRNAs. DSAP instead aligns reads directly to miRNA precursors from miRBase and ncRNAs from Rfam. After the reads are mapped, the cross-species distribution of the detected miRNAs is reported. DSAP also supports comparisons between two or three samples.
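The difference in system flow can be made concrete with a small sketch that parses a tab-delimited input and assigns reads directly to precursor sequences. Exact substring matching stands in for a real aligner here, and the function names, the column order and the toy precursor sequences are all assumptions; none of this reproduces DSAP's actual implementation.

```python
def parse_unique_reads(text):
    """Parse tab-delimited lines of 'sequence<TAB>copy number'."""
    reads = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sequence, copies = line.split("\t")
        reads.append((sequence, int(copies)))
    return reads


def map_reads_to_precursors(unique_reads, precursors):
    """Sum read copies per precursor, using exact substring matches only."""
    counts = {name: 0 for name in precursors}
    for sequence, copies in unique_reads:
        for name, precursor in precursors.items():
            if sequence in precursor:
                counts[name] += copies
    return counts


# Toy precursors with made-up names and flanking bases.
precursors = {
    "toy-pre-1": "GGGTGAGGTAGTAGGTTGTATAGTTTTAGG",
    "toy-pre-2": "CCAACCCGTAGATCCGAACTTGTGGG",
}
dataset = (
    "TGAGGTAGTAGGTTGTATAGTT\t120\n"
    "AACCCGTAGATCCGAACTTGTG\t15\n"
    "ACGTACGTACGTACGTACGT\t4\n"
)
print(map_reads_to_precursors(parse_unique_reads(dataset), precursors))
```

No genome alignment or coordinate comparison appears in this flow, which is the design choice the text attributes to DSAP.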

miRExpress

miRExpress [9] is the first stand-alone package designed for monitoring miRNA expression and identifying novel miRNAs in small RNA datasets generated by next-generation sequencing technology. miRExpress bundles the known miRNAs from miRBase, so users do not need to handle the miRNA information themselves. Researchers can use miRExpress to preprocess their raw sequencing data, observe the length distribution of the processed reads, monitor miRNA expression with user-defined parameters and identify novel miRNAs by cross-species miRNA comparison. miRExpress also reports the alignments between detected miRNAs and reads, from which users can find nucleotide modifications. miRExpress can be downloaded at http://mirexpress.mbc.nctu.edu.tw/ and runs on 32-bit or 64-bit x86 Linux systems.


miRNAkey

miRNAkey [10] is a package designed to analyze high-throughput miRNA sequencing data. The system flow of miRNAkey contains several steps. The first step trims the 3’ adaptor sequence from the 3’ end of the reads. The second maps the reads to known miRNAs. The third counts the expression level of the mapped miRNAs and converts it into the normalized RPKM expression index (reads per kilobase per million mapped reads). The fourth identifies differentially expressed miRNAs by chi-squared analysis. The final step produces additional information about the input data, such as multiple-mapping levels and post-clipping read lengths. An important improvement is that miRNAkey implements the SEQ-EM algorithm to resolve the problem of reads that align to multiple detected miRNAs. miRNAkey is freely available for download at http://ibis.tau.ac.il/miRNAkey.
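Three of the steps above (adaptor trimming, RPKM conversion and the chi-squared test) can be sketched as follows. The function names are hypothetical, the trimming handles only a complete, exact adaptor match, the RPKM length is assumed to be the miRNA length in nucleotides, and the test is a plain Pearson chi-squared on a 2x2 table without continuity correction. These are assumptions for illustration, not miRNAkey's implementation, and the SEQ-EM algorithm is not covered.

```python
import math


def trim_adaptor(read, adaptor):
    """Clip the read at the first exact occurrence of the 3' adaptor."""
    index = read.find(adaptor)
    return read if index == -1 else read[:index]


def rpkm(read_count, mirna_length, total_mapped_reads):
    """Reads per kilobase per million mapped reads."""
    return read_count * 1e9 / (mirna_length * total_mapped_reads)


def chi_squared_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 table.

    Rows are the two samples; columns are reads of the miRNA of interest
    and all other mapped reads. Returns (statistic, p_value).
    """
    a, b = count_a, total_a - count_a
    c, d = count_b, total_b - count_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("degenerate table: a row or column sums to zero")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-squared distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Toy values: a 22-nt read followed by a made-up adaptor, and made-up counts.
print(trim_adaptor("TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG", "TCGTATGCCG"))
print(rpkm(read_count=220, mirna_length=22, total_mapped_reads=1_000_000))
print(chi_squared_2x2(count_a=30, total_a=100, count_b=10, total_b=100))
```

A real pipeline would also need to handle partial adaptors at the read end and sequencing errors within the adaptor, which this sketch leaves out.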
