This work presents an integrated system that comprehensively annotates the regulatory features in selected promoter regions of human, mouse, rat, dog, and chimpanzee genes. For user-supplied sequences, the system combines several promoter identification programs to predict putative transcription start sites and then annotates the regulatory features of the sequence. Cross-species analysis of homologous gene promoter sequences is provided to reveal conserved regions and the regulatory features conserved within them. Conservation across homologous promoters strengthens the evidence that a regulatory feature affects gene transcription. The regulatory features and conserved promoter regions are also presented graphically.
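As an illustration of the cross-species step, the following minimal sketch marks conserved regions in a set of promoter sequences that have already been aligned. It is not the system's actual method: the function name `conserved_windows`, the window size, and the identity threshold are assumptions chosen for the example, and a column counts as conserved only when every species carries the same non-gap base.

```python
from typing import List, Tuple


def conserved_windows(
    aligned: List[str], window: int = 10, min_identity: float = 0.8
) -> List[Tuple[int, int]]:
    """Return merged [start, end) column ranges of a pre-aligned set of
    sequences in which the fraction of identical columns per sliding
    window reaches min_identity. All parameters are illustrative."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")

    # A column is identical when all species share the same non-gap base.
    identical = [
        len(set(col)) == 1 and col[0] != "-"
        for col in zip(*(seq.upper() for seq in aligned))
    ]

    regions: List[Tuple[int, int]] = []
    for start in range(0, length - window + 1):
        end = start + window
        fraction = sum(identical[start:end]) / window
        if fraction < min_identity:
            continue
        if regions and start <= regions[-1][1]:
            # Overlapping or adjacent to the previous region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Toy alignment (hypothetical data): the first 10 columns are identical
# across the three sequences, the last 5 differ.
example = [
    "ACGTACGTACTTTTT",
    "ACGTACGTACGGGGG",
    "ACGTACGTACCCCCC",
]
print(conserved_windows(example, window=5, min_identity=1.0))  # prints [(0, 10)]
```

Regulatory features whose coordinates fall inside the returned ranges could then be flagged as conserved. In practice the alignment would come from a multiple alignment program, and the threshold would need tuning for the evolutionary distance between the species compared.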