
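5.1 Conclusion

This system establishes a new description language for RNA secondary structure, uses existing secondary structure prediction tools to supply energy information, and applies genetic programming to search for the common structural motif shared by a group of RNAs with the same function. It requires no sequence alignment information and searches directly on secondary structure, which reduces the search space.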

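The description language is represented as a graph. When the two strands of a stem are far apart, the system may still find the stem, but its ability to do so is not yet strong, and it still favors stems whose strands are close together. Strengthening this ability tends to interfere with the search for ordinary stems, so it requires a trade-off or adjustment in the parameter settings.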

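We use genetic programming to search for the common structural motif of RNAs. The experimental results show that genetic programming can avoid interference from a large amount of noise and still find reasonably good results without taking much time.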

5.2 Future Research Directions

In the course of this research, we identified several directions in which the work could be extended.

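5.2.1 Handling Families Whose Common Structural Motif Has Many Stems

In our experiments on the RNaseP(high) data, whose common structural motif contains many stems, we were able to find substructures of the true common structural motif. In a simple test, we used FGGP to find a predicted common structural motif, removed that motif from every sequence, and then called FGGP again. Repeating this procedure gradually raises the Matthews correlation coefficient. However, repeated FGGP runs also gradually add noise, and the predictive ability eventually deteriorates.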
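For reference, the Matthews correlation coefficient mentioned above can be computed from confusion-matrix counts as in the minimal Python sketch below. The formula is the standard one. The helper `confusion_counts`, and the choice to count base pairs against a set of candidate pairs, are assumptions made for illustration and may differ from how the experiments in this work count true and false positives.

```python
import math
from typing import Set, Tuple

Pair = Tuple[int, int]  # a base pair as (i, j) positions; illustrative only


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def confusion_counts(predicted: Set[Pair], actual: Set[Pair],
                     universe: Set[Pair]) -> Tuple[int, int, int, int]:
    """Count (tp, tn, fp, fn) for predicted vs. actual base pairs.

    `universe` is the assumed set of all candidate pairs under consideration.
    """
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(universe - (predicted | actual))
    return tp, tn, fp, fn
```

A value of 1 indicates perfect agreement with the true structure, 0 is no better than chance, and -1 is complete disagreement.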

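We could therefore study a mechanism for deciding how many repetitions are best, how to remove superfluous noise before each FGGP run, and how to handle the sequences once the predicted common structural motif has been removed. The repeated calls could also be replaced by recursive calls. Much remains to be explored here.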
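One possible shape for such a mechanism is sketched below in Python. It is not the FGGP implementation. The callables `find_motif` and `remove_motif`, the score they return, and the thresholds `max_rounds` and `min_score` are all hypothetical placeholders. The loop stops as soon as the best motif found in a round scores below a cutoff, on the assumption that low-scoring motifs are mostly noise.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

M = TypeVar("M")  # motif type; left abstract because it is hypothetical


def iterate_motif_search(
    seqs: List[str],
    find_motif: Callable[[List[str]], Optional[Tuple[M, float]]],
    remove_motif: Callable[[str, M], str],
    max_rounds: int = 5,
    min_score: float = 0.5,
) -> Tuple[List[M], List[str]]:
    """Repeatedly find a motif, remove it from every sequence, and search again.

    Stops when the finder returns nothing, when the motif score drops
    below `min_score`, or after `max_rounds` rounds.
    """
    motifs: List[M] = []
    current = list(seqs)
    for _ in range(max_rounds):
        result = find_motif(current)
        if result is None:
            break  # nothing left to find
        motif, score = result
        if score < min_score:
            break  # assumed to be noise rather than a real shared motif
        motifs.append(motif)
        current = [remove_motif(s, motif) for s in current]
    return motifs, current
```

A fixed score cutoff is only one choice. The same loop could instead stop when a validation measure such as the Matthews correlation coefficient stops improving, and a recursive variant would pass `current` back into the function rather than looping.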

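5.2.2 Clustering Multiple Families

We tested FGGP on data mixed from two families. When the structural motifs of the two families differ greatly, FGGP may extract most of the sequences of one family, which in effect clusters the data. When the motifs do not differ much, the clustering effect is less apparent, although adjusting certain parameters can strengthen it somewhat. Clustering is therefore a direction worth pursuing further.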
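The clustering effect described above amounts to splitting the mixed input by whether each sequence carries the predicted motif. A minimal Python sketch follows. The predicate `contains_motif` is a hypothetical stand-in for whatever test decides that a sequence can form the predicted common structural motif, and it is not part of FGGP.

```python
from typing import Callable, List, Tuple


def split_by_motif(
    seqs: List[str],
    contains_motif: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Partition sequences into those with the motif and those without."""
    with_motif: List[str] = []
    without_motif: List[str] = []
    for s in seqs:
        # Input order is preserved within each group.
        (with_motif if contains_motif(s) else without_motif).append(s)
    return with_motif, without_motif
```

If the two families differ strongly, the first group would be expected to hold mostly one family. The second group could then be searched again for its own motif.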

