Molecular Biology, Pathobiology, and Genetics |
1 Department of Otolaryngology-Head and Neck Surgery, The Johns Hopkins School of Medicine and 2 Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, Maryland; Departments of 3 Gynecologic Oncology and 4 Pathology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; 5 Department of Pathology, Portuguese Oncology Institute, University of Porto, Porto, Portugal; 6 OncoMethylome Sciences S.A, CHU Niveau +4, Tour 4 dePharmacie (bâtiment 36), Liege, Belgium; 7 Laboratory of Molecular Medicine and Biotechnology, University Campus Bio-Medico School of Medicine, Rome, Italy; and 8 Bioinformatics and Computational Genomics (Biobix), Faculty of Agricultural and Applied Biological Sciences, University of Ghent, Ghent, Belgium
Requests for reprints: David Sidransky, Division of Head and Neck Cancer Research, The Johns Hopkins School of Medicine, 1550 Orleans Street, 5 North 03, Baltimore, MD 21231. Phone: 410-502-5155; Fax: 410-614-1411; E-mail: dsidrans{at}jhmi.edu and Wim Van Criekinge, OncoMethylome Sciences S.A, CHU Niveau +4, Tour 4 dePharmacie (bâtiment 36), Avenue de l'Hospital 14000, Sart-Tilman, Liege, Belgium. Phone: 32-0-436-698-60; Fax: 32-0-436-698-61; E-mail: Wim.vancriekinge{at}OncoMethylome.com.
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|
Epigenetic modifications are defined as all meiotically and mitotically heritable changes in gene expression that are not coded in the DNA sequence itself. Methylation of the C5 positions of cytosine residues in DNA has long been recognized as an epigenetic silencing mechanism of fundamental importance (2, 3). DNA methylation alters chromosome structure, inhibits the binding of proteins, such as CTCF, and defines regions of transcriptional regulation (4). DNA methylation can also promote the binding of proteins, such as MECP2, MBD1, MBD2, MBD3, and MBD4, which induce histone modification (5).
CpG dinucleotides are found at increased frequency in the promoter region of many genes, and methylation in the promoter region is frequently associated with "gene silencing"; i.e., the gene is not expressed in the presence of methylation but is expressed in its absence (6). Both global hypomethylation and gene-specific promoter hypermethylation are associated with malignancy (7, 8). Several studies have shown that these epigenetic changes are an early event in carcinogenesis and are present in the precursor lesions of a variety of cancers including lung (9), head and neck (10), and colon (11).
Challenges in analyzing CpG Island (CGI) methylation include distinguishing islands from repetitive DNA sequences, which are usually heavily methylated, and identifying those that regulate gene expression. In an effort to identify important tumor suppressor genes (TSG) silenced by promoter methylation, genome-wide screening techniques to detect differences in DNA methylation were developed. Many of these studies documented that when CGI methylation in promoter regions is appropriately validated, expression of downstream genes is almost always found to be severely repressed or absent (12, 13).
In this study, we used advanced bioinformatics tools and robust data sets from cancer cell lines treated with demethylating agents to identify novel cancer-specific methylated genes. We then used bisulfite DNA sequencing, methylation-specific PCR (MSP), and quantitative MSP (QMSP) to confirm cancer-specific methylation in a large number of novel genes. Our results confirm computational prediction of methylated CpG sites in cancer through extensive experimentation. Moreover, this approach has greatly expanded our knowledge of methylated promoters in cancer cell lines and primary tumors, has led to the discovery of a substantial portion of "the cancer methylome", and sets the stage for rapid and full elucidation of methylated gene targets and pathways in human cancer.
| Materials and Methods |
|---|
|
|
|---|
5-aza-2'-deoxycytidine treatment of cells. We seeded all cell lines (1 × 10^6) in their respective culture medium and maintained them for 24 h before treating them with 5 μmol/L 5-aza-2'-deoxycytidine (5-aza-dC; Sigma) for 3 d. We renewed medium containing 5-aza-dC every 24 h during the treatment. We handled control cells the same way, without adding 5-aza-dC. Stock solutions of 5-aza-dC were prepared in phosphate-buffered saline (PBS; pH 7.5). We prepared total RNA using the RNeasy Mini kit (Qiagen).
Biotinylated RNA probe preparation and hybridization. Several versions of Affymetrix arrays were used for gene expression profiling per the manufacturer's instructions. Hu95A.V2 arrays containing 12,500 human genes were used for the 2 lung squamous cancer cell lines. HGU 133 plus 2 arrays with >55,000 probes for analysis of >47,000 human transcripts were used for profiling the 4 cervical cancer cell lines. For the remaining 14 cell lines, we used GeneChip Human Genome U133A Arrays containing >22,000 probesets for analysis of >18,400 transcripts, which include ~14,500 well-characterized human genes. Probe preparation and hybridization were performed following the manufacturer's instructions. Digitized image data were processed using the GeneChip software (version 3.1) available from Affymetrix.
Analysis of expression data. We computed gene expression summary values for Affymetrix GeneChip data using the Bioconductor package (which uses background adjustment, quantile normalization, and summarization; ref. 14). Raw data quality was assessed using intensity plots and RNA degradation plots (data not shown). In a second stage, the retained data sets for each cell line of each cancer type were normalized using the MAS5 algorithm (Affymetrix software). We also normalized among the cell lines of each cancer type and among cell lines of all cancer types analyzed (data not shown).
We performed at least two replicates for each cell line. The expression calls "P" (present), "M" (marginal), and "A" (absent) were determined according to the Affymetrix Array Suite software package. P in the 5-aza-dC treatment data sets was assigned a score of 1 (P-score), and A in the nontreatment data sets was assigned a score of 1 (A-score). For each probe/gene, the expression score was calculated as the sum of the P-score and A-score. Only genes represented by probes with at least one reactivation event (A before treatment to P after treatment) were selected. We then used the previously published algorithm to select candidate genes (12), modified by further selection of promoters with structural and sequence similarities to genes empirically found to be methylated. Brief descriptions of this approach are given below.
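The scoring rule described above can be sketched as follows. This is only a minimal illustration, assuming that the detection calls for one probe are available as lists of "P", "M", and "A" strings and that untreated and treated data sets are paired in the same order; the function names and example calls are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Sequence


def expression_score(untreated_calls: Sequence[str], treated_calls: Sequence[str]) -> int:
    """Sum of the A-score and the P-score for one probe.

    A-score: one point per 'A' (absent) call in untreated data sets.
    P-score: one point per 'P' (present) call in 5-aza-dC-treated data sets.
    """
    a_score = sum(1 for call in untreated_calls if call == "A")
    p_score = sum(1 for call in treated_calls if call == "P")
    return a_score + p_score


def has_reactivation_event(untreated_calls: Sequence[str], treated_calls: Sequence[str]) -> bool:
    """True if at least one paired data set goes from 'A' (untreated) to 'P' (treated).

    Pairing by position is an assumption made for this sketch.
    """
    return any(u == "A" and t == "P" for u, t in zip(untreated_calls, treated_calls))


# Hypothetical calls for one probe across three paired data sets.
untreated = ["A", "A", "P"]
treated = ["P", "M", "P"]
score = expression_score(untreated, treated)            # 2 'A' calls + 2 'P' calls
reactivated = has_reactivation_event(untreated, treated)  # first pair is A -> P
```

In this sketch a probe is kept only when `has_reactivation_event` is true; the score itself could then be used to rank the retained probes.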
BROAD analysis: genome-wide promoter alignment. The Database of Transcription Start Sites (DBTSS)9 maps each sequence onto the human draft genome sequence to identify its transcriptional start site (TSS), which provides more detailed information on the distribution patterns of TSSs and adjacent regulatory regions. From ~14,500 well-characterized human genes present in the Affymetrix GeneChip Human Genome U133A Arrays, we extracted 8,793 sequences from the DBTSS (version 3.0 based on human assembly build 31; refs. 15, 16). The remaining genes (14,500 – 8,793 = 5,707) on the Affymetrix array contained no reported TSS according to DBTSS. Subsequently, Newcpgreport (17) was used to identify CGIs [a CGI is defined as a region of at least 200 bp with a GC content larger than 50% and a CpGobserved/CpGexpected (O/E) ratio >0.60; ref. 18]. These conditions are slightly less stringent than those proposed by Jones et al. (19). We considered these criteria justified because we worked with experimentally established and verified gene promoter regions (regions that are closely associated with gene expression) instead of applying the criteria to a genome-wide scan. This resulted in a sequence set of 4,728 genes that was complemented with a set of 56 reported/known cancer-specifically methylated genes chosen from published articles or our data10 (Supplementary Table S2). Of the 4,728 sequences used for clustal alignment, 245 were found to show a given minimal homology to the 56 known genes methylated in cancer but not in normal tissues. We then excluded 132 genes that did not pass the reactivation filter or were already reported to be cancer-specifically methylated, leaving 113 genes (245–132), which we validated by laboratory experimentation.
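The three CGI criteria quoted above (length of at least 200 bp, GC content above 50%, O/E ratio above 0.60) can be checked for a single candidate region as in the sketch below. It is a whole-sequence simplification and not a reimplementation of Newcpgreport, which scans sequences with its own windowing; the O/E ratio is computed here with the commonly used formula (number of CpG × length)/(number of C × number of G), and all function names are hypothetical.

```python
def cpg_island_stats(seq: str):
    """Return (length, GC fraction, CpG observed/expected ratio) for one sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact here
    gc_fraction = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0
    oe_ratio = cpg / expected if expected else 0.0
    return n, gc_fraction, oe_ratio


def meets_cgi_criteria(seq: str, min_len: int = 200, min_gc: float = 0.50,
                       min_oe: float = 0.60) -> bool:
    """Apply the length, GC-content, and O/E thresholds stated in the text."""
    n, gc_fraction, oe_ratio = cpg_island_stats(seq)
    return n >= min_len and gc_fraction > min_gc and oe_ratio > min_oe


# Toy sequences (not real promoters) that exercise each criterion.
cpg_rich = "CG" * 100             # 200 bp, GC = 100%, O/E = 2.0
too_short = "CG" * 50             # only 100 bp
cpg_depleted = "G" * 100 + "C" * 100  # GC-rich but contains no CpG
```

With these toy inputs only `cpg_rich` satisfies all three thresholds; the other two fail on length and on O/E ratio, respectively.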
DEEP analysis: specific binding patterns. Apart from a broad promoter alignment, we sought to determine if there were shorter patterns lost in global alignment (BROAD) associated with known cancer-specific methylation. Therefore, the second (DEEP) part of the computational promoter analysis focused on identification of a discriminating sequence feature between two different functional classes (A and B) of CGI-containing promoters. Class A lists genes that are only methylated in cancer and not in normal, whereas class B enumerates genes that are at least partially methylated in normal (predominantly imprinted genes) tissues (Supplementary Table S3). For each of these genes, we extracted a symmetrical region of 1 kb around the predicted TSS using the DBTSS database (15, 16), and the same definition for CGI was used as for the BROAD analysis. No significant differences in either starting position, GC content, length, or O/E ratio were found for CGIs of genes belonging to class A and class B.
We looked exhaustively for patterns using the Teiresias algorithm (20, 21) with a minimum of 7 nonwild card nucleotides (L) and a maximal length between two nonwild cards of 9 nucleotides (W) that are present in at least 25% of the sequences for each class (A and B). In the next step, we applied different machine learning techniques (22) to extract those patterns for which the frequencies allowed for a discrimination between classes A and B. The following seven motifs (GGGC*GC*C, GCC*GCAC, CTGGG*GA, CCC**GCGCC, AGCTG**CT, A*GGC*GGG, and A*CGC*GCC) were found to be overrepresented in class A versus class B. Using this set of 7 motifs, we identified 261 genes from 8,793 genes extracted from DBTSS. Finally, we ruled out 191 genes (70 remaining) that did not pass the reactivation filter or were already reported cancer-specific methylated genes.
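Scanning a promoter sequence for the seven motifs listed above can be sketched with regular expressions. The sketch assumes that each "*" stands for exactly one wild-card nucleotide and that only the given strand is searched; neither assumption is stated explicitly in the text, and the helper names are hypothetical. It does not reproduce the Teiresias pattern discovery or the machine learning step, only the final motif lookup.

```python
import re

# The seven motifs reported as overrepresented in class A promoters.
MOTIFS = [
    "GGGC*GC*C", "GCC*GCAC", "CTGGG*GA", "CCC**GCGCC",
    "AGCTG**CT", "A*GGC*GGG", "A*CGC*GCC",
]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a motif into a regex, treating '*' as one arbitrary nucleotide (assumption)."""
    return re.compile("".join("[ACGT]" if ch == "*" else ch for ch in motif))


def motifs_present(seq: str, motifs=MOTIFS):
    """Return the motifs that occur at least once in the sequence (forward strand only)."""
    seq = seq.upper()
    return [m for m in motifs if motif_to_regex(m).search(seq)]


# Toy sequence containing one instance of the first motif (GGGC-A-GC-T-C).
hits = motifs_present("TTGGGCAGCTCAA")
```

Each motif contains at least 7 non-wild-card positions, consistent with the L = 7 setting described in the text.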
A total of 10 genes passed both the BROAD and DEEP sequence filters. Excluding the 25 known cancer-specific methylated genes, a total of 175 genes that passed both the sequence and reactivation filters were tested by laboratory experimentation. The 25 previously reported methylated genes are listed in Supplementary Table S4.
Tissue samples and DNA extraction. We evaluated tissue samples from 13 different types of primary cancers (a total of 300 human samples). Tissue samples from 106 age-matched individuals without a history of malignancy were used as controls.
Tissue samples were microdissected to isolate >70% epithelial cells in both neoplastic and nonneoplastic tissues. DNA was prepared as described previously (23).
Bisulfite genomic sequence analysis, conventional MSP, and QMSP. Bisulfite sequence analysis was performed to determine the methylation status in cell lines and a limited number of tissues including primary tumors and age-matched normal controls from the same organ. Genomic DNA was bisulfite modified as described previously (24), and the 5' region that included at least a portion of the CGI within 1 kb of the proposed TSS was amplified using the primer sets listed in Supplementary Table S5. PCR products were gel purified using the QIAquick Gel Extraction kit (Qiagen) according to the manufacturer's instructions. Each amplified DNA sample was sequenced by the Applied Biosystems 3700 DNA analyzer using nested, forward, or reverse primers and BD terminator dye (Applied Biosystems). When necessary, MSP primers were designed to amplify methylated or unmethylated DNA.
Bisulfite-modified DNA was used as template for fluorescence-based QMSP, as previously described (24, 25). Primers and probes were designed to specifically amplify the promoters of the eight genes of interest and the promoter of a reference gene, actin B (ACTB). Primer and probe sequences and annealing temperatures are provided in Supplementary Table S6. The relative level of methylated DNA for each gene in each sample was determined as a ratio of MSP for the amplified gene to ACTB and then multiplied by 1,000 for easier tabulation (average value of triplicates of gene of interest/average value of triplicates of ACTB × 1,000). The samples were categorized as unmethylated or methylated based on detection of methylation above a threshold set for each gene. This threshold was determined by analyzing the levels and distribution of methylation (if any) in normal (nonneoplastic) age-matched tissues and by maximizing the sensitivity and specificity.
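The QMSP ratio described above (mean of the gene triplicates divided by the mean of the ACTB triplicates, multiplied by 1,000) can be sketched as follows. The input values and the threshold are hypothetical; in the study the threshold was set per gene from normal tissues, and that optimization is not reproduced here.

```python
from statistics import mean
from typing import Sequence


def relative_methylation(gene_triplicates: Sequence[float],
                         actb_triplicates: Sequence[float]) -> float:
    """(mean gene quantity / mean ACTB quantity) x 1,000, as described in the text."""
    actb_mean = mean(actb_triplicates)
    if actb_mean == 0:
        raise ValueError("ACTB reference quantity is zero; sample cannot be normalized")
    return mean(gene_triplicates) / actb_mean * 1000


def is_methylated(value: float, threshold: float) -> bool:
    """Call a sample methylated when its value exceeds the gene-specific threshold."""
    return value > threshold


# Hypothetical quantities for one gene in one sample.
value = relative_methylation([2.0, 4.0, 6.0], [100.0, 100.0, 100.0])
call = is_methylated(value, threshold=10.0)  # threshold chosen only for illustration
```

Normalizing to ACTB corrects for the amount of bisulfite-converted input DNA, so values are comparable across samples.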
Reverse transcription-PCR and real-time reverse transcription-PCR. Reverse transcription-PCR (RT-PCR) was performed as described previously (26). One microliter of each cDNA was used for real-time RT-PCR using QuantiFast SYBR Green PCR kit (Promega). Amplifications were carried out in 384-well plates in a 7900 Sequence Detector System (Perkin-Elmer Applied Biosystems). Expression of genes relative to glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was calculated based on the threshold cycle (Ct) as $2^{-\Delta(\Delta C_t)}$, where $\Delta C_t = C_{t,\mathrm{GENE}} - C_{t,\mathrm{GAPDH}}$ and $\Delta(\Delta C_t) = \Delta C_{t,\mathrm{M}} - \Delta C_{t,\mathrm{Aza}}$ (M, mock treatment; Aza, 5-Aza-dC treatment). Detailed PCR conditions and primer sequences are available upon request.
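The relative expression calculation can be written out as a short sketch. It follows the definitions as stated in the text, with ΔCt = Ct(gene) − Ct(GAPDH) and Δ(ΔCt) = ΔCt(mock) − ΔCt(Aza), and returns 2 raised to the power −Δ(ΔCt). The Ct values below are hypothetical and the function names are not from the source.

```python
def delta_ct(ct_gene: float, ct_gapdh: float) -> float:
    """Ct of the gene of interest normalized to the GAPDH reference."""
    return ct_gene - ct_gapdh


def relative_expression(ct_gene_mock: float, ct_gapdh_mock: float,
                        ct_gene_aza: float, ct_gapdh_aza: float) -> float:
    """2^-(ddCt), with ddCt = dCt(mock) - dCt(Aza), as defined in the text."""
    dd_ct = delta_ct(ct_gene_mock, ct_gapdh_mock) - delta_ct(ct_gene_aza, ct_gapdh_aza)
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the gene is detected 4 cycles earlier after 5-aza-dC treatment.
ratio = relative_expression(ct_gene_mock=30.0, ct_gapdh_mock=18.0,
                            ct_gene_aza=26.0, ct_gapdh_aza=18.0)
fold_induction = 1.0 / ratio  # reciprocal, if a treated-over-mock fold change is wanted
```

With these definitions, a gene reactivated by treatment has a positive Δ(ΔCt), so the returned value is below 1 and its reciprocal gives the fold increase after treatment.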
| Results |
|---|
|
|
|---|
50% methylated CpG sites in the CGI in contrast to 10% to 20% in previous algorithm (12, 27).
Promoter hypermethylation in normal and primary tumor tissues. To determine whether the genes methylated in cancer cell lines were cancer specific, we investigated promoter methylation in a limited number (n = 10–15 for tumors; n = 2–12 for normals) of various primary tumors and age-matched normal tissues by bisulfite sequence analysis, COBRA, and/or MSP (Supplementary Table S8). Of the 82 genes that showed methylation in cell lines, promoter methylation was detected in 53 (65%) genes in primary tumor tissues. After testing corresponding age-matched normal tissues, 28 of these genes were identified as methylated in a cancer-specific manner. Thus, 28 of 175 (16%) new cancer-specific methylated genes were identified through our combination of a computational approach and empirical studies. We used age-matched normal tissue as a control. If methylation was more frequent in cancer and was absent or present at a lower level or frequency in normal tissue at an optimal cutoff, we considered it cancer-specific methylation. A summary of our analysis of all 200 genes is detailed in Table 1.
New targets of aberrant methylation in major types of cancer by QMSP. We noted that some of the cancer-specific methylated genes were reactivated and methylated in more than one type of cell line. To determine the frequency of methylation in a larger set of samples and in multiple cancer types, we selected 8 of the most frequently cancer-specific methylated genes from our list of 28 newly identified genes and developed a QMSP assay. We found cancer-specific methylation at various frequencies for each gene in multiple types of cancer (Table 3). A high frequency of cancer-specific methylation for at least one gene was identified in every cancer type, supporting the notion that methylated genes are likely to play a role across multiple cancer types.
| Discussion |
|---|
|
|
|---|
We found that 47% (82 of 175) of the genes tested in cell lines were methylated by bisulfite sequencing and/or MSP, and 65% (53 of 82) of these genes were methylated in primary tumors. Our results are consistent with previous studies (12, 26, 27), where the frequency of methylation of any particular gene in primary tumors is generally less than that observed in cell lines. The discrepancy between the computationally and pharmacologically predicted (175) and experimentally (82) identified methylated genes in cell lines may be partially due to the analysis of limited regions (~200–300 bp for most of the genes) by bisulfite sequencing or MSP.
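The validation funnel reported here and in the Results can be checked with a few lines of arithmetic. The counts are taken from the text; the helper name is hypothetical.

```python
def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


tested = 175               # candidate genes tested in cell lines
methylated_in_lines = 82   # methylated by bisulfite sequencing and/or MSP
methylated_in_tumors = 53  # of those, methylated in primary tumors
cancer_specific = 28       # of those, methylated in a cancer-specific manner

cell_line_rate = percent(methylated_in_lines, tested)             # 47
tumor_rate = percent(methylated_in_tumors, methylated_in_lines)   # 65
overall_yield = percent(cancer_specific, tested)                  # 16
```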
To compare the overall pattern of methylated CGIs among tumors, we tested 300 primary tumors of 13 different types with 8 frequently cancer-specific methylated genes identified from our approach. Pancreas, gastric, thyroid, and ovary cancers displayed relatively low levels of methylation. Colon, prostate, esophagus, and kidney tumors, however, displayed a much higher frequency of methylation overall. Some tumors within a type displayed high inherent levels of methylation, whereas others within the same tumor type displayed low levels (data not shown). The data are not consistent with chance variation from tumor to tumor because in the absence of heterogeneity, the variance of the methylation frequency would not be expected to be greater than the mean. Therefore, aberrant methylation of CGIs can be quantitatively different in individual tumors within a tumor type and more pronounced in particular tumor types.
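The variance-versus-mean argument above can be illustrated with an index of dispersion (variance divided by mean). This is only one plausible reading of the argument, assuming a Poisson-like reference in which the variance does not exceed the mean when tumors are homogeneous; the text does not name the statistic used, the underlying data are not shown, and the counts below are invented for illustration.

```python
from statistics import mean, variance
from typing import Sequence


def dispersion_index(counts: Sequence[int]) -> float:
    """Sample variance divided by the mean; values well above 1 suggest heterogeneity."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(counts) / m


# Hypothetical numbers of methylated genes (out of 8 tested) in eight tumors of one type.
heterogeneous = [0, 0, 1, 7, 8, 0, 6, 1]
homogeneous = [3, 3, 3, 3]

d_het = dispersion_index(heterogeneous)  # variance far exceeds the mean
d_hom = dispersion_index(homogeneous)    # no variation at all
```

A mixture of tumors with very high and very low methylation, as in the first list, produces an index well above 1, which is the pattern the paragraph describes.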
We found cancer-specific and tissue-specific methylation events in different tissue types. For example, PAK3 cancer-specific methylation was found in esophagus, lung, cervix, head and neck, and bladder cancers with high frequency. PAK3 was also occasionally methylated in other normal tissues. PAK3 is located on the X chromosome; thus, it is likely that there will always be a methylated signal in samples from female patients. However, we consider PAK3 methylation to be cancer specific because we also found a high frequency of methylation in samples from male cancer patients. Like PAK3, some other genes showed either cancer-specific or tissue-specific methylation in multiple organs (Table 3). Although there have been reports of MCAM overexpression in melanoma, we found a high frequency of MCAM promoter methylation in prostate cancer. Oncostatin M receptor (OSMR) showed cancer-specific methylation only in colon cancer and was previously shown to have a major functional role in breast and other cancers (30, 31). Liang et al. (32) reported loss of expression of SSBP2 in 50% of myeloid leukemia cell lines and concluded that loss of SSBP2 expression may underlie the impaired differentiation seen in human myeloid leukemia. However, before this report, there was no reported mechanism for loss of expression of this DNA-binding protein. β4GalT-1 is constitutively expressed in all tissues, with the exception of the brain (33), as a Golgi-resident protein. We found a high frequency of cancer-specific methylation of β4GalT-1 in esophagus, lung, colon, and prostate. NISCH [imidazoline receptor antisera selected (IRAS)] was first isolated as an imidazoline-1 receptor candidate cloned by an IRAS cDNA approach (34) and was independently shown to be an interacting partner for insulin receptor substrate 4 (35). IRAS was recently reported to protect transfected PC12 cells from apoptosis (36, 37), whereas its mouse homologue, Nischarin, which lacks the NH2-terminal PX domain, was identified as a cytosolic-interacting protein for α5 integrin and shown to inhibit cell migration by inhibiting the ability of PAK1 to phosphorylate substrates (37, 38). We found a high frequency of cancer-specific methylation of this gene in lung, head and neck, and gastric cancer. KIF1A is a member of the KIF1/Unc104 family, and targeted deletion of the KIF1A gene in mice causes accumulation of clear small vesicles in the cell body of neurons as well as marked neuronal death (39). We report for the first time a high frequency of cancer-specific methylation of KIF1A in the majority of human tumors.
The frequency of methylation within a tumor type of the individual CGIs affected in at least three different tumor types is shown (Table 3). Some targets were methylated at a high frequency in one tumor type but infrequently in others (e.g., OSMR; Table 3), whereas other targets (e.g., KIF1A) were methylated at relatively high frequencies in the majority of tumor types. Thus, whereas some CGI targets are shared by multiple tumor types, others are methylated in a tumor-type–specific manner. It has been documented that virtually all biochemical, biological, and clinical attributes are heterogeneous within human cancers of the same histologic subtypes (40). Our data suggest that differences in the methylated genes in various tumors could account for a major part of this heterogeneity.
Like any global genomic and epigenomic approach, our study has limitations. First, we were not able to test all the known and newly discovered methylated genes in all the 13 types of cancer included in this study. Second, although mosaic methylation occurred in most of the cases, focal methylation for some genes was also reported, and methylation in 5' untranslated regions would not be detectable by the methods we used. Future studies using a combination of different technologies will be able to address these issues.
The results of this study inform future cancer methylome discovery efforts in several important ways:
Adding these data to previous reports, perhaps up to one third (~300 genes total) of the cancer methylome has now been discovered, compared with the identification of perhaps 200 mutated genes over the past 2 decades and recent genome-wide mutation analysis in primary tumors (41). An emerging picture of genetic and epigenetic changes and their relationship is unraveling the biological networks responsible for human cancer. The genetic and epigenetic alterations in different cancer types are diverse (42, 43), and we and others previously found unique inverse relationships between genetic/epigenetic changes (27, 44, 45). However, 26 genes identified in the most recent mutation screen by Vogelstein and colleagues were also methylated in our study (41, 46). Ultimately, the epigenome of all cancer tissues will be mapped out even as we now approach a total molecular signature of cancer. According to Dr. Peter Jones (as reviewed in ref. 47), each differentiated cell has a different epigenome. Our comprehensive analysis contributes greatly to the emerging epigenomic map of DNA methylation in the human genome. Additional studies using similar and complementary genomic strategies should yield further insights into the dynamics and hierarchy of epigenetic regulation during tumorigenesis. These data define the epigenetic landscape of major human cancer types, provide new targets for diagnostic and therapeutic intervention, and open fertile avenues for basic research in tumor biology.
| Acknowledgments |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
| Footnotes |
|---|
M.O. Hoque, M.S. Kim, K.L. Ostrow, J. Liu, and G.B.A. Wisman contributed equally to this study.
9 http://elmo.ims.u-tokyo.ac.jp/dbtss/
Received 10/17/07. Revised 1/16/08. Accepted 2/7/08.
| References |
|---|
|
|
|---|
α5 subunit is important for its interaction with nischarin. Biochem J 2004;377:449–57.