Methyl-Typing: An improved and visualized COBRA software for epigenomic studies
Cheng-Hong Yang a,b, Li-Yeh Chuang c,*, Yu-Huei Cheng a, De-Leung Gu f, Chung-Ho Chen d,e, Hsueh-Wei Chang f,g,h,*
a Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
b Department of Network Systems, Toko University, Chiayi, Taiwan
c Department of Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
d Department of Dentistry, College of Dental Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
e Department of Oral and Maxillofacial Surgery, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
f Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan
g Graduate Institute of Natural Products, Kaohsiung Medical University, Kaohsiung, Taiwan
h Center of Excellence for Environmental Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
Article info
Article history:
Received 14 October 2009; revised 14 December 2009; accepted 16 December 2009; available online 22 December 2009
Edited by Paul Bertone
Keywords: Methylation; COBRA; Restriction enzyme; Software; Promoter
Abstract
Combined bisulfite restriction analysis (COBRA) is one of the most commonly used methods for quantifying DNA methylation. However, it relies on relatively few restriction enzymes. Here, we present Methyl-Typing, a web-based software tool that provides restriction enzyme mining data for methylcytosine-containing sequences following bisulfite conversion. Gene names, accession numbers, sequences, PCR primers, and file uploads are accepted as input. Promoter sequences and restriction enzymes for CpG- and GpC-containing recognition sites are retrieved. Four representative enzymes were successfully tested by COBRA in experimental work. The Methyl-Typing tool therefore provides comprehensive COBRA restriction enzyme mining. It is freely available at http://bio.kuas.edu.tw/methyl-typing.
© 2009 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
1. Introduction
Currently, combined bisulfite restriction analysis (COBRA) [1] is one of the most commonly used methylation methods in laboratories [2]. In principle, any technique that uses restriction enzymes to distinguish between methylated and unmethylated sequences after bisulfite conversion is regarded as a COBRA method. However, the traditional COBRA approach relies on only a few restriction enzymes, such as BstUI (5′-CG↓CG-3′) [2] and TaqαI (5′-T↓CGA-3′) [1]. Other restriction enzymes available for COBRA are less frequently mentioned, such as HinP1I (5′-G↓CGC-3′), HpyCH4IV (5′-A↓CGT-3′), and AciI (5′-G↓CGG-3′). This may in part be because restriction enzyme mining tools for possible methylation sequences are poorly developed. Moreover, the traditional COBRA approach is not inherently specific to CpG islands in the promoter region; it depends on the user-defined sequence. Therefore, integrating COBRA restriction enzyme mining, CpG island searching, and promoter prediction remains challenging.
To circumvent this constraint, we have developed a novel visualization software tool, named Methyl-Typing, which provides a comprehensive set of restriction enzymes for methylcytosine-containing sequences after bisulfite conversion, i.e., unmethylated cytosine is converted to uracil (read as thymine after PCR amplification) while 5-methylcytosine remains unchanged. Moreover, the insulator CCCTC-binding factor (CTCF)-binding site database (CTCFBSDB) [3] is implemented in Methyl-Typing. Chromatin insulators, such as CTCF, can block the activity of a downstream enhancer and are neutralized by methylation [4], thereby contributing to gene regulation. Overall, Methyl-Typing is a fast and efficient tool for providing all possible methylation sites of restriction enzymes.
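The conversion rule described above can be illustrated with a minimal Python sketch. It is not the Methyl-Typing implementation; the function name `bisulfite_convert` is hypothetical, and the sketch assumes that only the top strand is modelled, that every CpG cytosine is either fully methylated or fully unmethylated, and that all non-CpG cytosines are unmethylated.

```python
import re


def bisulfite_convert(seq: str, methylated_cpg: bool) -> str:
    """Return the top-strand sequence expected after bisulfite treatment and PCR.

    Assumption: CpG cytosines are uniformly methylated or unmethylated,
    and cytosines outside CpG context are always unmethylated.
    """
    seq = seq.upper()
    if methylated_cpg:
        # 5-methylcytosine (C followed by G) is protected;
        # every other C is converted and read as T after PCR.
        return re.sub(r"C(?!G)", "T", seq)
    # No cytosine is protected, so every C is read as T.
    return seq.replace("C", "T")


example = "ATCGACCGTTCA"
print(bisulfite_convert(example, methylated_cpg=True))   # ATCGATCGTTTA
print(bisulfite_convert(example, methylated_cpg=False))  # ATTGATTGTTTA
```

In this toy sequence the TaqαI site (TCGA) survives only in the methylated version, which is the difference a COBRA digest is meant to detect.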
0014-5793/$36.00 © 2009 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.febslet.2009.12.026
Abbreviations: COBRA, combined bisulfite restriction analysis; CTCF, CCCTC-binding factor; CTCFBSDB, CCCTC-binding factor-binding site database; RFLP, restriction fragment length polymorphism
* Corresponding authors. Address: Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan (H.-W. Chang). Fax: +886 7 312 5339.
E-mail addresses: chuang@isu.edu.tw (L.-Y. Chuang), changhw@kmu.edu.tw (H.-W. Chang).
FEBS Letters 584 (2010) 739–744
[...] suggest that the restriction enzymes provided by Methyl-Typing are informative for determining methylation status and are applicable to epigenomic studies. However, the possible role of methylation in oral oncogenesis needs further investigation because of our limited sample size.
4. Discussion
Several approaches are available to predict the location of CpG islands [8,12], provide promoter sequences [6], and determine DNA methylation status [2]. For example, some CpG island search tools have been developed, such as CpG Island Searcher [8] and CpG analyzer [12], which are a web server and a Windows-based program, respectively, but these tools do not provide mining functions for COBRA restriction enzymes or promoter identification. Moreover, many software tools have been developed only to analyze bisulfite sequencing data. For example, BiQ Analyzer [13] and the BDPC web server [2] provide visualization and quality control for DNA methylation data from bisulfite sequencing. MethTools [14] and CyMATE [15] are web-based methylation analyzers that take web input and return output by email. CyMATE and CpG PatternFinder [16] require aligned sequences as input data. QUMA provides quantification for methylation analysis by bisulfite sequencing [17]. However, bisulfite sequencing-based methods are limited by runs of identical consecutive nucleotides, such as poly-T stretches. The sequences downstream of a poly-T region are frequently frame-shifted or appear as multiple nucleotides at the same locus (data not shown), leading to misinterpretation by computational analysis. This potential poly-T problem can be detected in silico in Methyl-Typing (Fig. 2G).
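One simple way to flag such regions in silico is to scan the bisulfite-converted sequence for long runs of T. The sketch below is illustrative only and is not taken from Methyl-Typing; the function name `find_poly_t` and the default threshold of 8 consecutive Ts are assumptions chosen for the example.

```python
import re


def find_poly_t(seq: str, min_run: int = 8) -> list[tuple[int, int]]:
    """Return (start, length) for every run of at least min_run consecutive Ts.

    Assumption: min_run is a user-chosen threshold; the value at which
    sequencing reads actually become unreliable is not specified here.
    """
    pattern = re.compile(r"T{%d,}" % min_run)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# A C/T-rich stretch becomes a homopolymer once unmethylated C is read as T.
genomic = "GACCTCTTCCTAG"
converted = genomic.replace("C", "T")  # fully unmethylated case
print(converted)               # GATTTTTTTTTAG
print(find_poly_t(converted))  # [(2, 9)]
print(find_poly_t(genomic))    # []
```

The example shows why the check is run on the converted sequence rather than the genomic one: the run of nine Ts exists only after conversion.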
Another disadvantage of those software tools is that they do not integrate CpG island search, promoter sequence retrieval, and methylation analysis. In contrast, our proposed Methyl-Typing tool provides such an integrated system. Gene name, accession number, sequence, and file inputs, as well as primer input for generating a sequence by ePCR, are all acceptable for Methyl-Typing analysis. Recently, the MethMarker tool [18] was developed to implement the COBRA assay and five other widely used experimental techniques, providing optimization of gene-specific DNA methylation assays. However, only 35 different COBRA enzymes are included in MethMarker, whereas all restriction enzymes in REBASE version 806 (3961 in total) are available in Methyl-Typing.
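The primer-based input can be pictured as a simple in silico PCR step that extracts the region bounded by a primer pair. The following sketch is a simplified illustration under stated assumptions, not the ePCR routine used by Methyl-Typing: the function names are hypothetical, primers must match the template exactly, and only the first matching product on the given strand is returned.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence made of A, C, G, T, or N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the amplicon bounded by the two primers, or None if no product.

    Assumption: exact matching only; mismatches, degenerate primers, and
    multiple products are not handled in this sketch.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    rc = reverse_complement(reverse)
    end = template.find(rc, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rc)]


template = "GGATTAGCGTTTACGATTGGAATTCCC"
print(in_silico_pcr(template, "ATTAGC", "AATTCC"))  # ATTAGCGTTTACGATTGGAATT
```

The returned amplicon is the sequence that would then be searched for restriction sites.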
While the majority (about 90%) of methylated cytosine residues in mammals are found at CpG dinucleotides [19,20], GpC can still be methylated, although slowly and not fully [21]. Methylation in non-CpG sequences has been found in plants [15], fish [22], and mammals [23], as well as in human breast cancer [24]. A high percentage of non-CpG DNA methylation in mammals was also found in embryonic stem (ES) cells, in contrast to somatic cells [25]. This is consistent with our results, in which some oral cancer samples were methylated at both CpG (Fig. 3A–C) and GpC sites (Fig. 3D). Furthermore, DNA methylation of human repetitive sequences, including AluI (5′-AG↓CT-3′), was recently reported [26]. Therefore, non-CpG methylation may play an important role in gene expression [26] and oncogenesis [27].
While most methylation software tools focus nearly exclusively on CpG sites, few tools are designed for DNA with non-CpG methylation sites. In contrast, Methyl-Typing can provide all the restriction enzymes whose recognition sequences cover all possible CpG sites, e.g., TaqαI (5′-T↓CGA-3′), HinP1I (5′-G↓CGC-3′), HpyCH4IV (5′-A↓CGT-3′), and AciI (5′-G↓CGG-3′), and all possible GpC sites, e.g., HaeIII (5′-GG↓CC-3′), Cac8I (5′-GCN↓NGC-3′), and AluI (5′-AG↓CT-3′). Therefore, Methyl-Typing provides a more comprehensive source of restriction enzymes for COBRA assays of both CpG and GpC methylation than traditional COBRA.
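The idea of restriction enzyme mining can be sketched as a comparison of recognition sites between the two bisulfite-converted versions of a sequence. The code below is a minimal illustration, not the Methyl-Typing algorithm: it uses only the seven recognition sequences quoted above instead of the full REBASE list, ignores cut positions, and the function names are hypothetical. Both input sequences are assumed to have been converted beforehand.

```python
import re

# Recognition sequences as quoted in the text (cut positions omitted).
ENZYMES = {
    "TaqαI": "TCGA", "HinP1I": "GCGC", "HpyCH4IV": "ACGT", "AciI": "GCGG",
    "HaeIII": "GGCC", "Cac8I": "GCNNGC", "AluI": "AGCT",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def site_regex(site: str):
    """Compile a pattern matching the site on either strand (N = any base)."""
    rc = "".join(COMPLEMENT[b] for b in reversed(site))
    alternatives = {site, rc}  # palindromic sites collapse to one entry
    body = "|".join(s.replace("N", "[ACGT]") for s in sorted(alternatives))
    # A lookahead is used so that overlapping sites are all reported.
    return re.compile(f"(?=({body}))")


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of the site in seq."""
    return [m.start() for m in site_regex(site).finditer(seq.upper())]


def informative_enzymes(methylated: str, unmethylated: str) -> dict:
    """Map enzyme name to (sites in methylated, sites in unmethylated)
    for every enzyme whose site positions differ between the two."""
    result = {}
    for name, site in ENZYMES.items():
        m, u = find_sites(methylated, site), find_sites(unmethylated, site)
        if m != u:
            result[name] = (m, u)
    return result


# Converted versions of the toy sequence TTCGAGACGTCCA, derived by hand under
# the assumption that only CpG cytosines are methylated.
print(informative_enzymes("TTCGAGACGTTTA", "TTTGAGATGTTTA"))
# {'TaqαI': ([1], []), 'HpyCH4IV': ([6], [])}
```

Enzymes that cut only one of the two versions are the candidates for a COBRA digest; extending the dictionary to a larger enzyme list would not change the logic.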
In summary, we describe improved COBRA software that provides restriction enzyme mining and visualization of methylcytosine-containing sequences after bisulfite conversion. Methyl-Typing broadens the scope of traditional COBRA methylation typing and provides visualization platforms across different computing environments.
Acknowledgements
This work was partly supported by the National Science Council in Taiwan under grants NSC98-2221-E-151-040, 98-2622-E-151-024-CC3, NSC97-2311-B-037-003-MY3, and NSC97-2622-E-151-008-CC2, and by grants KMU-EM-97-1.1b and KMU-EM-98-1.4.
References
[1] Xiong, Z. and Laird, P.W. (1997) COBRA: a sensitive and quantitative DNA methylation assay. Nucleic Acids Res. 25, 2532–2534.
[2] Rohde, C., Zhang, Y., Jurkowski, T.P., Stamerjohanns, H., Reinhardt, R. and Jeltsch, A. (2008) Bisulfite sequencing Data Presentation and Compilation (BDPC) web server – a useful tool for DNA methylation analysis. Nucleic Acids Res. 36, e34.
[3] Bao, L., Zhou, M. and Cui, Y. (2008) CTCFBSDB: a CTCF-binding site database for characterization of vertebrate genomic insulators. Nucleic Acids Res. 36, D83–D87.
[4] Gaszner, M. and Felsenfeld, G. (2006) Insulators: exploiting transcriptional and epigenetic mechanisms. Nat. Rev. Genet. 7, 703–713.
[5] Karolchik, D. et al. (2008) The UCSC Genome Browser Database: 2008 update. Nucleic Acids Res. 36, D773–D779.
[6] Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. (2008) DBTSS: database of transcription start sites, progress report 2008. Nucleic Acids Res. 36, D97–D101.
[7] Roberts, R.J., Vincze, T., Posfai, J. and Macelis, D. (2007) REBASE – enzymes and genes for DNA restriction and modification. Nucleic Acids Res. 35, D269–D270.
[8] Takai, D. and Jones, P.A. (2003) The CpG island searcher: a new WWW resource. In Silico Biol. 3, 235–240.
[9] Chang, H.W. et al. (2008) High-throughput gender identification of Accipitridae eagles with real-time PCR using TaqMan probes. Theriogenology 70, 83–90.
[10] Chang, H.W., Ali, S.Z., Cho, S.K., Kurman, R.J. and Shih Ie, M. (2002) Detection of allelic imbalance in ascitic supernatant by digital single nucleotide polymorphism analysis. Clin. Cancer Res. 8, 2580–2585.
[11] Frommer, M., McDonald, L.E., Millar, D.S., Collis, C.M., Watt, F., Grigg, G.W., Molloy, P.L. and Paul, C.L. (1992) A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc. Natl. Acad. Sci. U.S.A. 89, 1827–1831.
[12] Xu, Y.H., Manoharan, H.T. and Pitot, H.C. (2005) CpG analyzer, a Windows-based utility program for investigation of DNA methylation. Biotechniques 39. 656, 658, 660 passim.
[13] Bock, C., Reither, S., Mikeska, T., Paulsen, M., Walter, J. and Lengauer, T. (2005) BiQ analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing. Bioinformatics 21, 4067–4068.
[14] Grunau, C., Schattevoy, R., Mache, N. and Rosenthal, A. (2000) MethTools – a toolbox to visualize and analyze DNA methylation data. Nucleic Acids Res. 28, 1053–1058.
[15] Hetzl, J., Foerster, A.M., Raidl, G. and Mittelsten Scheid, O. (2007) CyMATE: a new tool for methylation analysis of plant genomic DNA after bisulphite sequencing. Plant J. 51, 526–536.
[16] Xu, Y.H., Manoharan, H.T. and Pitot, H.C. (2007) CpG PatternFinder: a Windows-based utility program for easy and rapid identification of the CpG methylation status of DNA. Biotechniques 43. 334, 336–340, 342.
[17] Kumaki, Y., Oda, M. and Okano, M. (2008) QUMA: quantification tool for methylation analysis. Nucleic Acids Res. 36, W170–W175.
[18] Schuffler, P., Mikeska, T., Waha, A., Lengauer, T. and Bock, C. (2009) MethMarker: user-friendly design and optimization of gene-specific DNA methylation assays. Genome Biol. 10, R105.
[19] Grippo, P., Iaccarino, M., Parisi, E. and Scarano, E. (1968) Methylation of DNA in developing sea urchin embryos. J. Mol. Biol. 36, 195–208.
[20] Doskocil, J. and Sorm, F. (1962) Distribution of 5-methylcytosine in pyrimidine sequences of deoxyribonucleic acids. Biochim. Biophys. Acta 55, 953–959.
[21] Simon, D., Grunert, F., von Acken, U., Doring, H.P. and Kroger, H. (1978) DNA-methylase from regenerating rat liver: purification and characterisation. Nucleic Acids Res. 5, 2153–2167.
[22] Pontecorvo, G., De Felice, B. and Carfagna, M. (2000) Novel methylation at GpC dinucleotide in the fish Sparus aurata genome. Mol. Biol. Rep. 27, 225–230.
[23] Kouidou, S., Malousi, A. and Maglaveras, N. (2006) Methylation and repeats in silent and nonsense mutations of p53. Mutat. Res. 599, 167–177.
[24] Rodenhiser, D., Chakraborty, P., Andrews, J., Ainsworth, P., Mancini, D., Lopes, E. and Singh, S. (1996) Heterogenous point mutations in the BRCA1 breast cancer susceptibility gene occur in high frequency at the site of homonucleotide tracts, short repeats and methylatable CpG/CpNpG motifs. Oncogene 12, 2623–2629.
[25] Ramsahoye, B.H., Biniszkiewicz, D., Lyko, F., Clark, V., Bird, A.P. and Jaenisch, R. (2000) Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc. Natl. Acad. Sci. U.S.A. 97, 5237–5242.
[26] Horard, B. et al. (2009) Global analysis of DNA methylation and transcription of human repetitive sequences. Epigenetics 4, 339–350.
[27] Lee, H.S. et al. (2009) Prognostic implications of and relationship between CpG island hypermethylation and repetitive DNA hypomethylation in hepatocellular carcinoma. Clin. Cancer Res. 15, 812–820.