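Every step of the method proposed in this study can be adjusted. In Section 3.4 we divided the gene expression data into four expression levels, but this is not a fixed rule. The choice of levels affects the accuracy of the experiment, so future work could select the levels by other means (such as self-organizing maps, SOM) and examine how different levelings affect the results. In Section 3.5 we used global alignment to align the transformed gene data. Future studies could instead use local alignment or semi-global alignment to capture partial expression patterns, which would allow more flexible relationships between genes. The scoring matrix is usually chosen by rule of thumb, and the matrix in Section 3.5 is no exception. As the number of known regulatory modules grows, however, other learning methods (such as Genetic Programming) could be used to revise the scoring matrix, which we expect would substantially improve prediction accuracy.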
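To make the discretization and alignment choices discussed above concrete, the following is a minimal sketch, not the implementation used in Sections 3.4 and 3.5. It assumes equal-width binning into four levels, a hypothetical scoring function `score` (match = 2, mismatch = negative level difference), and a linear gap penalty `GAP`; none of these values come from this study. `global_score` computes a standard Needleman-Wunsch score and `local_score` a standard Smith-Waterman score.

```python
GAP = -2  # hypothetical linear gap penalty (assumption, not from this study)


def discretize(values, levels=4):
    """Map expression values to integer levels 0..levels-1 (equal-width bins, an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / levels
    # The maximum value would fall into bin `levels`, so clamp it to the top bin.
    return [min(int((v - lo) / width), levels - 1) for v in values]


def score(a, b):
    """Hypothetical scoring rule: reward equal levels, penalize by level distance."""
    return 2 if a == b else -abs(a - b)


def global_score(s, t):
    """Needleman-Wunsch: best score for aligning all of s with all of t."""
    prev = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * GAP]
        for j in range(1, len(t) + 1):
            cur.append(max(prev[j - 1] + score(s[i - 1], t[j - 1]),
                           prev[j] + GAP,
                           cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def local_score(s, t):
    """Smith-Waterman: best score over all pairs of contiguous sub-sequences."""
    best = 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0]
        for j in range(1, len(t) + 1):
            cell = max(0,
                       prev[j - 1] + score(s[i - 1], t[j - 1]),
                       prev[j] + GAP,
                       cur[j - 1] + GAP)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


if __name__ == "__main__":
    s = [3, 0, 1, 2, 3]  # toy level strings sharing only the middle segment 0, 1, 2
    t = [0, 0, 1, 2, 0]
    print(global_score(s, t), local_score(s, t))
```

On two profiles that share only a middle segment, the local score rewards the shared segment while the global score is pulled down by the mismatched ends. This is the kind of flexibility that local or semi-global alignment would add.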
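As noted in Section 2.3, many biological problems cannot be solved by computational or statistical methods alone; biological information must be incorporated to obtain better results. For example, TNP uses the regulatory factor binding site sequences compiled in SCPD to scan the upstream regions of genes as a preprocessing step, which greatly reduces its false positives and thereby improves prediction accuracy. Incorporating appropriate biological information is therefore another direction for future work.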
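The kind of binding-site pre-filtering described above can be sketched as follows. This is an illustration under stated assumptions, not TNP's actual procedure: the helper names (`has_site`, `filter_candidates`), the IUPAC consensus representation, and the toy sequences are all hypothetical, and real binding-site entries from SCPD would replace the example consensus string.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_site(upstream, consensus):
    """True if the consensus occurs on either strand of the upstream region."""
    pattern = re.compile("".join(IUPAC[c] for c in consensus.upper()))
    seq = upstream.upper()
    return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))


def filter_candidates(candidates, upstream_by_gene, consensus):
    """Keep only candidate target genes whose upstream region contains the site."""
    return [g for g in candidates
            if has_site(upstream_by_gene.get(g, ""), consensus)]


if __name__ == "__main__":
    # Toy upstream sequences (hypothetical, for illustration only).
    upstream = {
        "geneA": "ccgTGACTCAtt",    # site on the forward strand
        "geneB": "aaaaTGAGTCAaaaa",  # site on the reverse strand
        "geneC": "ACGTACGTACGT",    # no site
    }
    print(filter_candidates(["geneA", "geneB", "geneC"], upstream, "TGACTCA"))
```

Applying such a filter before the alignment step removes candidate pairs that lack any supporting binding site, which is one plausible way the false-positive reduction mentioned above could be obtained.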
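Combining multiple experiments (or methods) to reconstruct regulatory networks is an approach that many researchers have adopted. Examples include Hsu's combination of TNP with PROSPECT (Fujibuchi et al. 2001) and the combination of this study with TNP. Combining suitable experimental methods can also improve the results. How to combine the strengths of previous work with this study is therefore another feasible goal for future work.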
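One simple way to combine the outputs of several methods is voting over predicted regulator-target pairs. The sketch below is only an assumption about how such a combination might look; it is not how TNP, PROSPECT, or this study were actually combined, and the function name `combine_predictions`, the `min_votes` parameter, and the placeholder identifiers are hypothetical.

```python
from collections import Counter


def combine_predictions(prediction_sets, min_votes=2):
    """Keep regulator-target pairs predicted by at least `min_votes` methods."""
    votes = Counter(pair for preds in prediction_sets for pair in set(preds))
    return {pair for pair, n in votes.items() if n >= min_votes}


if __name__ == "__main__":
    # Placeholder predictions from two hypothetical methods.
    method_a = {("TF1", "g1"), ("TF1", "g2")}
    method_b = {("TF1", "g1"), ("TF2", "g3")}
    print(combine_predictions([method_a, method_b]))
```

With `min_votes` equal to the number of methods this reduces to an intersection, which favors precision; with `min_votes=1` it becomes a union, which favors recall.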
Appendix 1: Distribution of string alignment scores for the 26 transcriptional regulatory modules.
[Figure: histograms of alignment scores, one panel per module. Recoverable panel titles: GCN4, MATalpha2, PUT3, Repressor of CAR1, SWI5, TBP. x-axis: score (-50 to 50); y-axis: count.]
References
Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D and Levine AJ. (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA, 96(12):6745-6750.
Butte AJ, Kohane IS (2000) Mutual Information Relevance Networks: Functional Genomic Clustering Using Pairwise Entropy Measurements. PSB 2000.
Cunningham MJ, Liang S, Fuhrman S, Seilhamer JJ and Somogyi R (2000) Gene Expression Microarray Data Analysis for Toxicology Profiling. Annals of the New York Academy of Sciences, 919: 52-67
Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW (1998) A Genome-Wide Transcriptional Analysis of the Mitotic Cell Cycle. Molecular Cell, 2:65-73.
D'haeseleer P, Wen X, Fuhrman S, Somogyi R (1999) Linear modeling of mRNA expression levels during CNS development and injury. PSB 1999.
Eisen MB, Spellman PT, Brown PO and Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA, 95(25): 14863-14868.
Fuhrman S, Cunningham MJ, Wen X, Zweiger G, Seilhamer JJ, and Somogyi R. (2000) The Application of Shannon Entropy in the Identification of Putative Drug Targets. Biosystems, 55:5-14.
Fujibuchi W, Anderson JS, and Landsman D (2001) PROSPECT Improves Cis-Acting Regulatory Element Prediction by Integrating Expression Profile Data with Consensus Pattern Searches. Nucleic Acids Res., 29(19): 3988-3996.
Hsu YZ and Hu YJ. (2004) Combining correlations between gene expression profiles with binding sites to reconstruct transcription networks.
Ji L and Tan KL (2005) Identifying time-lagged gene clusters using gene expression data. Bioinformatics, 21(4):509-516.
Kuruvilla FG, Park PJ and Schreiber SL (2002) Vector algebra in the analysis of genome-wide expression data. Genome Biology, 3(3).
Kwon AT, Hoos HH and Ng R (2003) Inference of transcriptional regulation relationships from gene expression data. Bioinformatics, 19:905-912.
Lewis D and Gale W (1994) A sequential algorithm for training text classifiers. SIGIR 1994, 3-12.
Lee PH and Lee D (2005) Modularized learning of genetic interaction networks from biological annotations and mRNA expression data. Bioinformatics, 21(11):2739-2747.
Liang S, Fuhrman S, Somogyi R (1998) REVEAL, a general reverse engineering algorithm for inference of genetic network architectures. PSB 1998.
Liu TF, Sung WK, Mittal A (2004) Learning Multi-Time Delay Gene Network Using Bayesian Network Framework. Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004).
MacQueen JB (1967) Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1:281-297.
Murphy K and Mian S (1999) Modelling Gene Expression Data using Dynamic Bayesian Networks.
Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B (1998) Comprehensive Identification of Cell Cycle–regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Molecular Biology of the Cell, 9:3273-3297.
Schmitt WA, Raab RM, and Stephanopoulos G (2004) Elucidation of Gene Interaction Networks Through Time-Lagged Correlation Analysis of Transcriptional Data. Genome Research, 14:1654-1663.
Segal E, Barash Y, Simon T, Friedman N, Koller D (2001) From Promoter Sequence to Expression: A Probabilistic Framework.
Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, and Church GM (1999). Systematic Determination of Genetic Network Architecture. Nature Genet., 22:281-285.
Tamayo P, Slonim D, Mesirov J, Zhu O, Kitareewan S, Dmitrovsky E, and Golub TR (1999) Interpreting Patterns of Gene Expression with Self-Organizing Maps: Methods and Application to Hematopoietic Differentiation. Proc. Natl. Acad. Sci. USA, 96:2907-2912.
van Someren EP, Wessels LFA and Reinders MJT (2000) Linear Modeling of Genetic Networks from Experimental Data.
Wingender E, Dietze P, Karas H and Knuppel R (1996) TRANSFAC: A Database on Transcription Factors and Their DNA Binding Sites. Nucleic Acids Res., 24:238-241.
Yu H, Luscombe NM, Qian J and Gerstein M (2003) Genomic analysis of gene expression relationships in transcriptional regulatory networks. Trends in Genetics, 19(8):422-427.
Zou M and Conzen SD (2005) A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics, 21:71-79.
Zhu J and Zhang MQ (1999) SCPD: a promoter database of the yeast Saccharomyces cerevisiae. Bioinformatics, 15:607-611.