We developed a method to identify cell cycle TFs in yeast by integrating ChIP-chip data [11] and cell cycle gene expression data [19]. We identified 15 cell cycle TFs, 12 of which are known cell cycle TFs. The remaining three TFs (Hap4, Reb1 and Tye7) are putative novel cell cycle TFs. These predictions are supported by physical or genetic interaction data and by previous studies. In addition, for seven of the 15 identified cell cycle TFs, our method can assign the specific cell cycle phase in which the TF functions. On average, 86% of our predictions have literature support (57% with experimental evidence and 29% with computational evidence). Our method also yields a high-confidence TF-gene regulatory matrix as a byproduct. Each TF-gene regulatory relationship in this matrix is supported by both the ChIP-chip data and the gene expression data. Moreover, we compared the performance of our method with that of five existing methods and showed that our method is better at retrieving the known cell cycle TFs. Finally, applying our method to different cell cycle gene expression datasets identified similar sets of TFs, suggesting that our method is robust.
Figure 1: Flowchart of the procedure of our method
ChIP-chip data with p-value cutoff = 0.001 give the TF-promoter binding matrix B. The Relative R2 method, applied to B and the cell cycle gene expression data, gives the TF-gene regulatory matrix C. Check if a significant portion of a TF's regulatory targets are cell cycle-regulated genes: if no, the TF is classified as a non-cell cycle TF; if yes, it is classified as a cell cycle TF. Then check if a significant portion of a TF's regulatory targets are X phase cell cycle-regulated genes: if yes, the TF is classified as an X phase cell cycle TF; if no, it is classified as a non-X phase cell cycle TF.
Figure 2: Interactions between a novel cell cycle TF and the other identified cell cycle TFs
The physical or genetic interactions between a novel cell cycle TF ((a) Reb1, (b) Tye7, and (c) Hap4) and the other identified cell cycle TFs are shown. Each oval indicates an identified cell cycle TF. A TF name is colored purple if it is a known cell cycle TF [18] but black otherwise.
Two ovals are connected by an undirected red line if the two TFs have physical interactions according to the current protein-protein interaction data [34]. Two ovals are connected by a directed blue line if the two TFs have genetic interactions according to ChIP-chip and/or mutant data [25]. For example, Reb1 → Swi5 means that either TF Reb1 binds to the promoter of gene SWI5 or the disruption of TF Reb1 results in a significant change in the expression of gene SWI5.
Figure 3: The results of using different cell cycle gene expression datasets
Our method identified 15 and 18 cell cycle TFs using Pramila et al.'s alpha30 and alpha38 datasets [19], respectively. Both datasets have a sampling interval of 5 minutes and a total of 25 data points for each gene in the yeast genome. Of the 15 cell cycle TFs identified using the alpha30 dataset, 10 were also identified using the alpha38 dataset. This suggests that our method is robust to the choice of cell cycle gene expression dataset.
Figure labels (Alpha30 dataset, Alpha38 dataset): Ace2, Mcm1, Fkh1, Swi4, Fkh2, Swi5, Hir3, Swi6, Mbp1, Yox1, Fhl1, Cin5; Ino2, Rap1, Met32, Ume6, Yap1; Abf1, Hap4, Stb1, Reb1, Tye7.
Known (novel) cell cycle TFs are colored red (black).
Table 1: The 15 identified cell cycle TFs
The twelve known cell cycle TFs (according to the MIPS database [18]) are bold-faced and colored blue. The 15 identified TFs are ordered by the confidence of being cell cycle TFs (according to the hypergeometric p-value calculated using Equation (9)). For seven of the 15 identified cell cycle TFs, the cell cycle phase in which the TF functions is shown. “E” means that the prediction is supported by experimental evidence, “C” means that the prediction is supported by previous computational studies, and “N” stands for our novel prediction.
TF name | Hypergeometric p-value | M/G1 | G1 | S | S/G2 | G2/M
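The enrichment check behind the hypergeometric p-value can be illustrated with a minimal sketch. Equation (9) is not reproduced in this section, so the code below assumes the standard one-sided hypergeometric test; the function name and the symbols N, K, n and k are hypothetical, and the numbers in the usage line are toy values, not results from this study.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """Upper-tail hypergeometric p-value P(X >= k).

    Assumed meaning of the (hypothetical) symbols:
      N: number of genes in the genome
      K: number of cell cycle-regulated genes in the genome
      n: number of regulatory targets of the TF
      k: number of those targets that are cell cycle-regulated
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probabilities of observing k, k+1, ..., upper cell cycle targets.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy usage: 10 genes, 4 of them cell cycle-regulated, a TF with 3 targets,
# 2 of which are cell cycle-regulated.
p = hypergeom_pvalue(10, 4, 3, 2)
```

A small p-value means that the TF's targets contain more cell cycle-regulated genes than expected by chance; ranking TFs by this value would give an ordering of the kind described in the caption.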
Table 2: Known cell cycle genes and proteins that have genetic or physical interactions with the three novel cell cycle TFs (Reb1, Tye7, and Hap4)
Known cell cycle genes which are
Table 3: Performance comparison of six cell cycle TF identification methods to retrieve the known cell cycle TFs annotated in the MIPS database
Performance comparison was based on the Jaccard similarity score [21], which measures the overlap between a method's output and the list of known cell cycle TFs. Specifically, the Jaccard similarity score is defined as TP/(TP+FP+FN), where TP stands for true positives, FP for false positives, and FN for false negatives. The higher the Jaccard similarity score, the better a method retrieves the known cell cycle TFs.
TP FP FN Jaccard similarity score
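As a minimal sketch of the score defined in the caption, the following computes TP, FP, FN and TP/(TP+FP+FN) from two sets. The function name and the placeholder TF labels are hypothetical and only illustrate the arithmetic; they are not data from this study.

```python
def jaccard_score(predicted, known):
    """Return (TP, FP, FN, Jaccard similarity score) for two collections of TF names."""
    predicted, known = set(predicted), set(known)
    tp = len(predicted & known)   # predicted and known
    fp = len(predicted - known)   # predicted but not known
    fn = len(known - predicted)   # known but not predicted
    denom = tp + fp + fn
    score = tp / denom if denom else 0.0
    return tp, fp, fn, score

# Toy usage with placeholder names: 3 shared, 1 extra prediction, 2 missed.
result = jaccard_score({"TF_A", "TF_B", "TF_C", "TF_D"},
                       {"TF_B", "TF_C", "TF_D", "TF_E", "TF_F"})
```

Here the score is 3/(3+1+2) = 0.5. Because FP and FN both appear in the denominator, a method is penalized equally for predicting unknown TFs and for missing known ones.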
Table 5: List of the 203 TFs from Harbison et al. [11]
YGR067c YHP1 YJL206c YKL222c OAF3 YLR278c YML081w
YNR063w YOX1 YPR022c YPR196w YRR1 ZAP1 ZMS1
Table 6: Cell cycle TFs identified by the six methods
The numbers of true positives, false positives, and false negatives are given as (TP, FP, FN). The known cell cycle TFs (according to the MIPS database) are colored red.
Our method
References
1. Alter O, Brown PO, Botstein D: Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA 2000, 97:10101-10106.
2. Andersson CR, Hvidsten TR, Isaksson A, Gustafsson MG, Komorowski J: Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors. BMC Syst Biol 2007, 1:45.
3. Bähler J: Cell-cycle control of gene expression in budding and fission yeast. Annu Rev Genet 2005, 39:69-94.
4. Breeden LL: Periodic transcription: a cycle within a cycle. Curr Biol 2003, 13(1):R31-38.
5. Cheng C, Li LM: Systematic identification of cell cycle regulated transcription factors from microarray time series data. BMC Genomics 2008, 9:116.
6. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW: A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell 1998, 2(1):65-73.
7. Cokus S, Rose S, Haynor D, Grønbech-Jensen N, Pellegrini M: Modelling the network of cell cycle transcription factors in the yeast Saccharomyces cerevisiae. BMC Bioinformatics 2006, 7:381.
8. de Lichtenberg U, Jensen LJ, Fausbøll A, Jensen TS, Bork P, Brunak S: Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics 2005, 21(7):1164-1171.
9. Dohrmann PR, Butler G, Tamai K, Dorland S, Greene JR, Thiele DJ, Stillman DJ: Parallel pathways of gene regulation: homologous regulators SWI5 and ACE2 differentially control transcription of HO and chitinase. Genes Dev 1992, 6(1):93-104.
10. Fellenberg K, Hauser NC, Brors B, Neutzner A, Hoheisel JD, Vingron M: Correspondence analysis applied to microarray data. Proc Natl Acad Sci USA 2001, 98:10781-10786.
11. Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, Yoo J, Jennings EG, Zeitlinger J, Pokholok DK, Kellis M, Rolfe PA, Takusagawa KT, Lander ES, Gifford DK, Fraenkel E, Young RA: Transcriptional regulatory code of a eukaryotic genome. Nature 2004, 431:99-104.
12. Heyer LJ, Kruglyak S, Yooseph S: Exploring expression data: identification and analysis of coexpressed genes. Genome Res 1999, 9:1106-1115.
13. Johansson D, Lindgren P, Berglund A: A multivariate approach applied to microarray data for identification of genes with cell cycle-coupled transcription. Bioinformatics 2003, 19(4):467-473.
14. Klevecz RR: Dynamic architecture of the yeast cell cycle uncovered by wavelet decomposition of expression microarray data. Funct Integr Genomics 2000, 1:186-192.
15. Laabs TL, Markwardt DD, Slattery MG, Newcomb LL, Stillman DJ, Heideman W: ACE2 is required for daughter cell-specific G1 delay in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 2003, 100(18):10275-10280.
16. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne J, Volkert TL, Fraenkel E, Gifford DK, Young RA: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 2002, 298:799-804.
17. Mendenhall W, Sincich T: Statistics for Engineering and the Sciences, 4th edition. Englewood Cliffs: Prentice-Hall; 1995.
18. Mewes HW, Frishman D, Guldener U, Mannhaupt G, Mayer K, Mokrejs M, Morgenstern B, Munsterkotter M, Rudd S, Weil B: MIPS: a database for genomes and protein sequences. Nucleic Acids Res 2002, 30:31-34.
19. Pramila T, Wu W, Miles S, Noble WS, Breeden LL: The Forkhead transcription factor Hcm1 regulates chromosome segregation genes and fills the S-phase gap in the transcriptional circuitry of the cell cycle. Genes Dev 2006, 20(16):2266-2278.
20. Rowicka M, Kudlicki A, Tu BP, Otwinowski Z: High-resolution timing of cell cycle-regulated gene expression. Proc Natl Acad Sci USA 2007, 104(43):16892-16897.
21. Shakhnovich BE, Reddy TE, Galinsky K, Mellor J, Delisi C: Comparisons of predicted genetic modules: identification of co-expressed genes through module gene flow. Genome Inform Ser Workshop Genome Inform 2004, 15:221-228.
22. Simon I, Barnett J, Hannett N, Harbison C, Rinaldi N, Volkert T, Wyrick J, Zeitlinger J, Gifford DK, Jaakkola TS, Young RA: Serial regulation of transcriptional regulators in the yeast cell cycle. Cell 2001, 106:697-708.
23. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998, 9:3273-3297.
24. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nature Genet 1999, 22:281-285.
25. Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sá-Correia I: The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucl Acids Res 2006, 34:D446-D451.
26. Tsai HK, Lu HH, Li WH: Statistical methods for identifying yeast cell cycle transcription factors. Proc Natl Acad Sci USA 2005, 102:13532-13537.
27. Wang H, Li WH: Increasing microRNA target prediction confidence by the relative R-squared method. J Theor Biol 2009, 259:793-798.
28. Whitfield ML, Sherlock G, Saldanha AJ, Murray JI, Ball CA, Alexander KE, Matese JC, Perou CM, Hurt MM, Brown PO, Botstein D: Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell 2002, 13(6):1977-2000.
29. Wu WS, Chen BS: Identifying stress transcription factors using gene expression and TF-gene association data. Bioinformatics and Biology Insights 2007, 1:9-17.
30. Wu WS, Li WH, Chen BS: Computational reconstruction of transcriptional regulatory modules of the yeast cell cycle. BMC Bioinformatics 2006, 7:421.
31. Wu WS, Li WH, Chen BS: Identifying regulatory targets of cell cycle transcription factors using gene expression and ChIP-chip data. BMC Bioinformatics 2007, 8:188.
32. Wu WS, Li WH: Identifying gene regulatory modules of heat shock response in yeast. BMC Genomics 2008, 9:439.
33. Wu WS, Li WH: Systematic identification of yeast cell cycle transcription factors using multiple data sources. BMC Bioinformatics 2008, 9:522.
34. Wu X, Zhu L, Guo J, Fu C, Zhou H, Dong D, Li Z, Zhang DY, Lin K: SPIDer: Saccharomyces protein-protein interaction database. BMC Bioinformatics 2006, 7:S16.
35. Yang YL, Suen J, Brynildsen MP, Galbraith SJ, Liao JC: Inferring yeast cell cycle regulators and interactions using transcription factor activities. BMC Genomics 2005, 6(1):90.
36. Zhao LP, Prentice R, Breeden L: Statistical modeling of large microarray data sets to identify stimulus response profiles. Proc Natl Acad Sci USA 2001, 98:5631-5636.