
Cancer Research 66, 10621-10629, November 1, 2006. doi: 10.1158/0008-5472.CAN-06-1687
© 2006 American Association for Cancer Research
Epidemiology and Prevention
Examination of a CpG Island Methylator Phenotype and Implications of Methylation Profiles in Solid Tumors
Carmen J. Marsit1,
E. Andres Houseman2,
Brock C. Christensen1,
Karen Eddy1,
Raphael Bueno4,
David J. Sugarbaker4,
Heather H. Nelson3,
Margaret R. Karagas5 and
Karl T. Kelsey1
Departments of 1 Genetics and Complex Diseases, 2 Biostatistics, and 3 Environmental Health, Harvard School of Public Health; 4 International Mesothelioma Program, Division of Thoracic Surgery, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts; and 5 Department of Community and Family Medicine, Section of Biostatistics and Epidemiology, Dartmouth Medical School, Lebanon, New Hampshire
Requests for reprints: Karl T. Kelsey, Department of Genetics and Complex Diseases, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115. Phone: 617-432-3313; Fax: 617-432-0107; E-mail: kelsey{at}hsph.harvard.edu.
Abstract
The CpG island methylator phenotype (CIMP), thoroughly described in colorectal cancer and to a lesser extent in other solid tumors, is important in understanding epigenetics in carcinogenesis and may be clinically useful for classification of neoplastic disease. Therefore, we investigated whether this putative phenotype exists in exposure-related solid tumors, where somatic gene alterations and enhanced clonal growth are selected for by carcinogens, and examined the ability of methylation profiles to classify malignant disease. We studied promoter hypermethylation of 16 tumor suppressor genes and 3 MINT loci (acknowledged classifiers of CIMP) in 344 bladder cancers, 346 head and neck squamous cell carcinomas (HNSCC), 146 non-small-cell lung cancers (NSCLC), and 71 malignant pleural mesotheliomas (MPM). We employed rigorous statistical methods to examine the distribution of promoter methylation and the usefulness of these profiles for disease classification. In bladder cancer, HNSCC, and NSCLC, there was a significant correlation (P < 0.0001) between methylation of the three MINT loci and methylation index, although the distribution of methylated loci varied significantly across these diseases. Although there was a significant (P < 0.001) association between gene methylation profile and disease, rates of misclassification of each disease by their methylation profile ranged from 28% to 32%, depending on the classification scheme used. These data suggest that a form of CIMP exists in these solid tumors, although its etiology remains elusive. Whereas the gene profiles of hypermethylation among examined loci could not unequivocally distinguish disease type, the existence of CIMP and the relative preponderance of hypermethylation in these cancers suggest that methylation analysis may be clinically useful as a targeted screening tool. (Cancer Res 2006; 66(21): 10621-9)
Introduction
Recent data addressing the contribution of epigenetic alterations to human cancer suggest that tumors arise from polyclonal fields of epigenetically altered stem/progenitor cells, and that at least some of the fundamental heterogeneity of tumor cells may be due to epigenetic variation in these progenitors (1). It is now well accepted that both genetic and epigenetic alterations drive multistage carcinogenesis and, consequently, it is crucial to understand the underlying etiology of these alterations to better treat, detect, and prevent sporadic cancer. The most commonly studied of these epigenetic alterations is promoter hypermethylation, the methylation of cytosines within CpG dinucleotides in the context of promoter region CpG islands associated with gene silencing.
In colorectal cancer, a CpG island methylator phenotype (CIMP) has been described and characterized by (i) a high degree of methylation of CpG island regions in both gene promoter and noncoding region contexts and (ii) an association with microsatellite instability due to the methylation-silencing of the mismatch repair gene MLH1 (2). A CIMP also has been described in brain tumors (3, 4), gastric cancer (5), liver cancer (6), T-cell acute lymphoblastic leukemia (7), pancreatic tumors (8), and ovarian carcinoma (9), among others (10). However, the methylated genes used to define this phenotype have been inconsistent and the number of cases of cancer studied is often small. Further, some reports have questioned the existence of this phenotype (11). Issa (10) has suggested examination of promoter methylation of MINT1, MINT2, MINT31, CDKN2A, and MLH1 in colon cancer to define the phenotype, but it remains unclear if this same set of loci should be applied to all tumor types to detect the presence or absence of this phenotype. Additionally, delineation of the existence or absence of this phenotype in tumors from other tissues (and any correlation of the phenotype with clinical features of the disease) will require large, population-based samples that are devoid of bias arising from nonrandom selection.
Specific gene promoter hypermethylation has been described in lung, oral, and bladder cancer and malignant mesothelioma, but little has been done to examine if a CIMP exists in these tumors. Understanding the profile of promoter hypermethylation of a number of common genes and loci across these diseases allows for a direct comparison of CIMP in these diseases. Further, this approach could determine if such tissue-specific profiles may be clinically useful as biomarkers of these diseases. Together, these tumors contribute significantly to cancer mortality around the world, and early detection is critical to successful treatment and positive outcomes. As technology allows for identification of malignant cells in body fluids such as peripheral blood, using biomarkers such as promoter methylation profiles may be valuable in pinpointing the tissue of origin of the tumor.
We have combined data from studies of non-small-cell lung cancer (NSCLC), head and neck squamous cell carcinoma (HNSCC), bladder cancer, and malignant mesothelioma to examine the profile of promoter hypermethylation of 19 loci, examining the distribution of promoter methylation across diseases and determining both the evidence for CIMP and the potential diagnostic use of these profiles. We have used a number of statistical approaches to examine the distribution of hypermethylation within and across these diseases, beginning first with an examination of CIMP using the previously defined markers of the phenotype in colon cancer. We then further examined the interdependence of promoter hypermethylation between gene loci. Finding marked correlation among loci and highly variable prevalence of promoter hypermethylation between different loci, we employed novel statistical approaches to examine whether the distribution of promoter hypermethylation can be described by clustering across diseases, by examining latent classes within diseases, and finally by using an item response theory approach to model the latent propensity for hypermethylation within each of these diseases.
Materials and Methods
Sample acquisition. Study design and patient accrual for the NSCLC (12), bladder cancer (13), HNSCC (14), and mesothelioma (15) case series have previously been described. In each study, subjects provided written informed consent according to institutional review board protocols. Archived pathology specimens were requested on all cases enrolled in the parent studies. In this study, samples were obtained and DNA was successfully extracted and modified from 346 HNSCC, 344 bladder cancer, 146 NSCLC, and 71 malignant mesothelioma cases.
DNA extraction and sodium bisulfite modification. All tumor samples used were histologically confirmed by an independent pathologic review. The proportion of malignant cells in the available sections was also observed and the specimens containing a majority of tumor tissue were used in these analyses. For formalin-fixed paraffin-embedded tumors, three 20-µm sections were cut from each tumor sample and transferred into microcentrifuge tubes. The paraffin was dissolved with Histochoice Clearing Agent (Sigma-Aldrich, St. Louis, MO), and then washed twice with 100% ethanol and once with PBS. The samples were then incubated in SDS-lysis solution [50 mmol/L Tris-HCl (pH 8.1), 10 mmol/L EDTA, 1% SDS] with proteinase K (Qiagen, Valencia, CA) overnight at 55°C. De-crosslinking was done by adding NaCl (final concentration, 0.7 mol/L) and incubating at 65°C for 4 hours. DNA was recovered using the Wizard DNA clean-up kit (Promega, Madison, WI) according to the protocols of the manufacturer. Fresh tumor samples were extracted essentially identically, except the paraffin removal steps were not used. Sodium bisulfite modification of the DNA was done with the EZ DNA Methylation Kit (Zymo Research, Orange, CA) following the protocol of the manufacturer, with the addition of a 5-minute initial incubation at 95°C before addition of the denaturation reagent. The de-crosslinking incubation as well as the 95°C incubation ensures more complete melting of the DNA and thus more complete sodium bisulfite conversion, particularly for the formalin-fixed specimens.
Methylation-specific PCR. We have specifically chosen to use traditional methylation-specific PCR for the analysis of promoter hypermethylation in these studies, as we have done matched analysis between fresh-frozen and formalin-fixed paraffin-embedded samples and found the greatest concordance (>95%) for methylation detection using this method. We have also previously examined potential biases in the sensitivity of using this assay against the relative quantitative TaqMan-based methods (13) and found no evidence for potential bias based on tumor quantity in the samples analyzed. Finally, this method allows for detection of a large number of genes from the limited DNA samples available on many of these tumors, whereas the quantitative assays require larger DNA quantities for the multiple amplifications of specific genes and reference genes.
Sodium bisulfite-modified DNA was used as the template for methylation-specific PCR as previously described (16) using primers specific for the methylated promoters of CDKN2A (16), RASSF1A (17), APC (18), PYCARD (19), LAMC2 (20), SFRP1, SFRP2, SFRP4, SFRP5 (21), GSTP1, MGMT, DAPK, RARB, CDH1 (22), CDH13 (23), MLH1, MINT1, MINT31 (24), and MINT32 (25). All methylation-specific PCRs were optimized to detect >5% methylated substrate in each sample. To control for the presence of modified DNA, primers specific to a modified region of the ACTB gene containing no CpG sites were used (26). Modified circulating blood lymphocyte DNA (obtained from a control subject) and the same lymphocyte DNA completely methylated using SssI DNA methylase and modified by treatment with sodium bisulfite were used as the negative and positive controls, respectively, in each run.
Statistical analysis. Prevalence of the promoter hypermethylation of each of the examined loci was tabulated by disease and a χ² test was done to examine the difference in the prevalence of methylation across the disease classes. Because there were numerous zero counts in the cross-classification of disease by hypermethylation, we also did a permutation test of association using the χ² statistic.
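The permutation test described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names, the number of permutations, and the toy data in the usage note are assumptions. Tumor-type labels are shuffled relative to the methylation outcomes, the χ² statistic is recomputed for each shuffle, and the P value is the proportion of shuffles whose statistic is at least as large as the observed one.

```python
import random

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # cells with an empty margin contribute nothing
                stat += (observed - expected) ** 2 / expected
    return stat

def cross_tab(labels, outcomes, label_levels, outcome_levels):
    """Count how often each (label, outcome) pair occurs."""
    table = [[0] * len(outcome_levels) for _ in label_levels]
    for label, outcome in zip(labels, outcomes):
        table[label_levels.index(label)][outcome_levels.index(outcome)] += 1
    return table

def permutation_p_value(labels, outcomes, n_perm=2000, seed=0):
    """Return (observed chi-square, permutation P value) for label/outcome association."""
    rng = random.Random(seed)
    label_levels = sorted(set(labels))
    outcome_levels = sorted(set(outcomes))
    observed = chi_square(cross_tab(labels, outcomes, label_levels, outcome_levels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between label and outcome
        stat = chi_square(cross_tab(shuffled, outcomes, label_levels, outcome_levels))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

For example, with hypothetical labels `["bladder"] * 10 + ["NSCLC"] * 10` and outcomes `[0] * 10 + [1] * 10`, the observed statistic is 20 and the permutation P value is very small, because almost no shuffle reproduces a perfectly separated table.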
Methylation index (27) was calculated by taking the sum of gene promoter hypermethylation events in each tumor divided by the number of genes examined. The distribution of the methylation index was plotted and the difference in the distribution across disease classes was examined using a quasi-likelihood approach, with a log link function and binomial variance as is appropriate for bounded discrete outcomes such as the methylation index. To examine the possible association between methylation index and MINT loci methylation, we again used the quasi-likelihood approach described above, modeling methylation index as the outcome with the sum of the number of MINT loci methylation events as the predictor. We also fit a more general model with separate MINT effects; however, a score test for the equivalence of the effects produced a nonsignificant P value, suggesting that the sum of the MINT loci is an adequate summary predictor of hypermethylation.
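As a minimal sketch of the two per-tumor summaries used in this analysis, the snippet below computes the methylation index over gene loci and the count of methylated MINT loci. The function name, the dictionary layout, the use of `None` for a failed assay, and the rule of skipping failed assays in the denominator are assumptions made for illustration; the text above does not specify how missing calls were handled.

```python
def summarize_tumor(calls):
    """Return (methylation index, number of methylated MINT loci) for one tumor.

    calls maps a locus name to 1 (methylated), 0 (unmethylated), or None
    (assay failed; treated here as "not examined", which is an assumption).
    Loci whose names start with "MINT" are kept out of the index.
    """
    gene_calls = [v for k, v in calls.items()
                  if not k.startswith("MINT") and v is not None]
    mint_count = sum(v for k, v in calls.items()
                     if k.startswith("MINT") and v is not None)
    if not gene_calls:
        raise ValueError("no gene loci were successfully assayed")
    # Index = methylated genes / genes examined
    return sum(gene_calls) / len(gene_calls), mint_count
```

A hypothetical tumor with calls `{"CDKN2A": 1, "RASSF1A": 0, "APC": 1, "MGMT": 0, "MINT1": 1, "MINT31": 0}` yields an index of 0.5 and a MINT count of 1.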
To examine associations between loci, we employed both descriptive and inferential approaches. We visually examined the Pearson correlation coefficients between loci and, using the hclust function in R (28), did hierarchical clustering using 1 minus the absolute value of the correlation coefficient as a distance metric. We then used several latent variable inferential techniques appropriate for binary data: latent class analysis and item response theory models (29). The latent class model is characterized by
$$P(Y_j = 1 \mid U = k) = \pi_{jk} \tag{1}$$
where $Y_j$ is a subject's methylation response for locus $j$, and $U$ is an unobserved categorical variable taking integer values from 1 to $K$, where the number of latent classes $K$ is prespecified. Thus, each locus has a class-specific probability of methylation, $\pi_{jk}$. The prevalence of the classes, $\eta_k = P(U = k)$, must also be estimated, and may be assumed to be uniform across all subjects or, via the regression approach of (30), to vary among different predefined populations (e.g., tumor types).
We fit models with $K$ = 2 to 6 classes, assuming either uniform prevalence $\eta_k$ across all four tumor types or tumor-specific prevalences. Item response theory models, also known as latent trait models, can be thought of as a limiting case of latent class models when the number of latent classes is large (31). The item response theory model is characterized by the following logistic model:
$$\operatorname{logit} P(Y_j = 1 \mid U) = \alpha_j + \lambda_j U \tag{2}$$
where $\alpha_j$ and $\lambda_j$ are locus-specific intercept and slope parameters and $U$ is an unobserved normally distributed variable, with unit variance and mean that may depend linearly on a vector $Z$ of covariates such as tumor type indicators: $E(U) = Z^t \gamma$.
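To make the latent class model concrete, the sketch below computes the posterior probability that a tumor belongs to each latent class, given its methylation calls, the class prevalences, and the class-specific methylation probabilities. It assumes the loci are independent within a class (the usual local independence assumption of latent class analysis, which the text does not state explicitly), and the function name and parameter values are illustrative only; it does not estimate the parameters.

```python
def class_posterior(y, prevalence, meth_prob):
    """Posterior P(U = k | Y = y) for each latent class k.

    y          : list of 0/1 methylation calls, one per locus
    prevalence : list of class prevalences P(U = k), summing to 1
    meth_prob  : meth_prob[k][j] = P(Y_j = 1 | U = k)
    """
    joint = []
    for class_prev, probs in zip(prevalence, meth_prob):
        likelihood = class_prev
        for y_j, p in zip(y, probs):
            # Bernoulli contribution of locus j, assumed independent within class
            likelihood *= p if y_j == 1 else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)  # marginal probability of the observed profile
    return [value / total for value in joint]
```

With two equally prevalent hypothetical classes, one with methylation probability 0.9 at both loci and one with 0.1, a tumor methylated at both loci is assigned to the first class with probability 0.405/0.41, or about 0.99.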
We used two methods to predict tumor type from methylation events: Classification and Regression Trees (CART; ref. 32), implemented in R (28) through the tree package (33) using the default splitting-rule parameters, and polytomous (multinomial) logistic regression (34), implemented in R (28). Both methods were used to model the probability that a random categorical variable Y takes on a particular value c, where c is one of a finite number of known categories (c = 1,..., C), conditional on an observed vector of values x associated with Y. CART is a nonparametric method that partitions the domain of x into regions R using successive binary splits; over each region R, the probability
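$\hat{p}_c(R)$ of response $Y = c$ conditional on $x \in R$ is calculated as
$$\hat{p}_c(R) = \frac{\sum_{i=1}^{n} I(Y_i = c,\ x_i \in R)}{\sum_{i=1}^{n} I(x_i \in R)} \tag{3}$$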
where the data consist of n pairs (xi, Yi), the function I(Q) is the indicator function (equal to 1 if Q is true and 0 otherwise), and the regions are chosen to minimize prediction error for a future value of x.
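The within-region estimate is simply the proportion of observations falling in a region that carry a given label. A minimal sketch follows; the function name and the representation of a region as a predicate on the predictor vector are assumptions made for illustration, and the choice of regions (the tree-growing step) is not shown.

```python
def region_class_probability(xs, ys, in_region, c):
    """Proportion of observations with x in region R whose label equals c.

    xs        : list of predictor records (here, dicts of 0/1 methylation calls)
    ys        : list of class labels, aligned with xs
    in_region : function returning True when a record falls in region R
    """
    members = [y for x, y in zip(xs, ys) if in_region(x)]
    if not members:
        raise ValueError("region contains no observations")
    return sum(1 for y in members if y == c) / len(members)
```

For instance, if a hypothetical region is defined by `lambda x: x["CDH1"] == 1` and three tumors fall in it, two of them HNSCC, the estimated probability of HNSCC in that region is 2/3.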
Polytomous regression is a parametric method that models the conditional probabilities
$$P(Y = c \mid x) = \frac{\exp(x^t \beta_c)}{\sum_{c'=1}^{C} \exp(x^t \beta_{c'})} \tag{4}$$
where $\beta_c$ is a category-specific vector of regression coefficients and $\beta_C = 0$ for the reference category $C$. (Note that the modeled probabilities do not depend on the choice of reference category.) Subjects with missing methylation events were excluded from the latter model. Misclassification rates were estimated using 10-fold cross-validation (32).
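The remark that the modeled probabilities do not depend on the reference category can be checked numerically. The sketch below evaluates the multinomial logit probabilities for a given coefficient set and then re-expresses the coefficients relative to a different reference category; the function names and the coefficient values in the usage note are hypothetical, and no model fitting is performed.

```python
import math

def category_probabilities(x, betas):
    """P(Y = c | x) for each category c under a multinomial logit model.

    x     : list of predictor values (e.g., 0/1 methylation calls)
    betas : one coefficient list per category
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def set_reference(betas, ref):
    """Re-express coefficients so that category `ref` has all-zero coefficients."""
    return [[b - r for b, r in zip(beta, betas[ref])] for beta in betas]
```

Calling `category_probabilities(x, betas)` and `category_probabilities(x, set_reference(betas, 0))` returns the same probabilities, because subtracting one category's coefficients from all categories shifts every score by the same amount, which cancels in the ratio.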
Results
We examined the promoter hypermethylation status of the same 16 genes and 3 noncoding CpG island regions (MINTs) in 346 HNSCC, 344 bladder cancer, 146 NSCLC, and 71 malignant pleural mesotheliomas (MPM) using methylation-specific PCR. We examined this group of genes, first because promoter hypermethylation detected using this method has previously been shown to be correlated with transcriptional silencing of these genes. Second, these genes have all previously been shown to have tumor-specific promoter hypermethylation. Third, we wished to examine the silencing of tumor suppressor genes involved in a variety of cellular processes and pathways, and these genes are known to be involved in processes including cell cycle control (CDKN2A, RASSF1A, and APC), apoptosis (DAPK and PYCARD), extracellular interactions (LAMC2), transcriptional regulation (RARB), WNT signaling (SFRP family), detoxification (GSTP1), cell-cell signaling (CDH1 and CDH13), and DNA repair (MGMT and MLH1). The prevalence of promoter hypermethylation of each of the genes or loci across these diseases is provided in Table 1. The overall difference in the prevalence of promoter hypermethylation at the tumor suppressor genes examined (excluding the MINT loci) across tumor types was examined using the χ² test statistic to assess if the prevalence of gene promoter hypermethylation across all loci differed by tissue (Table 1). This analysis showed a statistically significant permutation test P value of <0.001, indicating a significant association between tumor type and locus-specific prevalence of promoter hypermethylation.
We began our examination of CIMP by first examining the overall distribution of hypermethylation and any association this distribution may have with markers of CIMP previously described, particularly hypermethylation of the MINT loci, as methylation status of the MINT loci has been proposed as an objective approach to defining a methylator phenotype (10). As mentioned, we calculated methylation index, a count of the number of genes methylated (excluding the MINT loci) in each tumor, and Fig. 1 displays the distribution of the methylation index using box-plots depicting the median, interquartile ranges, and 5th and 95th percentiles by disease. The MINT loci were excluded from the calculation of the methylation index, as these regions are not associated with any known gene coding, and thus there may be a different mechanism operative in the selection for their methylation. It is perhaps a mechanism related to a methylator phenotype or an increased propensity for hypermethylation in a more general sense. The distribution of the methylation index (Fig. 1) is significantly different across diseases (score test, P < 0.0001), and the magnitude of the methylation index differences was examined using the quasi-likelihood approach discussed in Statistical analysis, comparing the magnitude of the difference in the methylation index between the tumor types. These results show that NSCLC has a greater methylation index than the other diseases, with an estimate suggesting that NSCLC has a 2.3 times greater methylation index [95% confidence interval (95% CI), 2.0-2.6] compared with bladder cancer, whereas MPM has a lower methylation index compared with the reference, bladder cancer (estimate, 0.8; 95% CI, 0.6-1.0), and HNSCC has a similar methylation index compared with bladder cancer (estimate, 1.3; 95% CI, 1.1-1.4). The prevalence of hypermethylation of these three MINT loci across the four tumor types is shown in Table 1. To examine if the prevalence of MINT loci hypermethylation was greater in those tumors with a higher methylation index, we examined the difference in the distribution of methylation index in each tumor type by a count of the number of the MINT loci methylated in that tumor, and this analysis is depicted in Fig. 2. MINT loci methylation was significantly associated with increasing methylation index in bladder cancer (estimate, 1.6; 95% CI, 1.4-1.7; P < 0.0001), HNSCC (estimate, 1.4; 95% CI, 1.3-1.5; P < 0.0001), and NSCLC (estimate, 1.2; 95% CI, 1.1-1.3; P < 0.0001). On the other hand, there was no significant relationship between methylation index and MINT loci hypermethylation in MPM (estimate, 1.1; 95% CI, 0.8-1.4; P < 0.6). To ensure that the promoter hypermethylation we are observing is related to disease and not aging, we also examined whether there was any relationship between age of patient at diagnosis and methylation index and found no significant association between methylation index and age in any of the diseases (data not shown).

Figure 1. Box-plots of the methylation index in bladder cancer, HNSCC, NSCLC, and MPM. Y axis, methylation index; X axis, individual diseases. Boxes, interquartile range of the distribution (25th-75th percentile); horizontal line in the box, median; vertical lines, 5th and 95th percentiles; solid circles, outliers. Below the box-plot the mean and SD, in parentheses, are given. The distribution of the methylation index across diseases was examined using a quasi-likelihood approach, with a logit link function and binomial variance as is appropriate for bounded discrete outcomes such as the methylation index. The score test indicated that the distribution of methylation index was significantly different across disease classes (P < 0.0001).
Figure 2. Box-plots of the methylation index by number of MINT loci hypermethylated in bladder cancer (A), HNSCC (B), NSCLC (C), and MPM (D). Y axis, methylation index; X axis, number of MINT loci with hypermethylation in the individual diseases. Boxes, interquartile range of the distribution (25th-75th percentile); horizontal line in the box, median; vertical lines, 5th and 95th percentiles; solid circles, outliers. To examine associations between methylation index and MINT loci methylation, we used the quasi-likelihood approach, with a logit link function and binomial variance as is appropriate for bounded discrete outcomes such as the methylation index. We modeled the sum of the number of MINT loci methylation events as the predictor. Score tests indicated that the distribution of methylation index across the sum of MINT loci hypermethylation was significantly different in bladder cancer (P < 0.0001), HNSCC (P < 0.0001), and NSCLC (P < 0.0001), but not in MPM (P < 0.6).
Although the methylation index has previously been employed in numerous reports, such an analysis may be inappropriate for examining the propensity for hypermethylation for two specific reasons. First, as shown in Table 1, the prevalence of hypermethylation of the individual gene promoters is not uniform. Second, as can be seen in Fig. 3B, there is some degree of correlation between hypermethylation at these different loci, although the majority of absolute values of these correlations are <0.5, suggesting that hypermethylation of many of these loci does not occur as an independent event. The dendrogram in Fig. 3A showed limited clustering of these methylation events, although hypermethylation of the promoters of APC, MLH1, GSTP1, RASSF1A, and PYCARD shows the closest relationships and may be distinct from the other gene promoters examined.
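The distance that drives the dendrogram, 1 minus the absolute value of the Pearson correlation between two loci, can be sketched as below. The analysis itself used the hclust function in R; this snippet only illustrates the distance for a pair of binary methylation vectors, and the function names and toy vectors are assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_a * var_b)

def correlation_distance(a, b):
    """Distance between two loci: 1 - |r|, so strong positive or negative
    correlation both place loci close together."""
    return 1.0 - abs(pearson(a, b))
```

Two loci that are always methylated in opposite tumors (r = -1) have distance 0, whereas two uncorrelated loci have distance 1, so the metric groups loci by strength of association regardless of sign.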

Figure 3. A, examination of the correlation between gene promoter hypermethylation at the various loci using hierarchical clustering. B, the heat map represents the absolute value of the Pearson correlation coefficient and the dendrogram, which uses 1 minus the absolute value of the correlation coefficient as a distance metric to drive the hierarchical clustering, shows the relationship between alterations at the gene promoters.
To more specifically use the pattern of promoter hypermethylation to distinguish these tumor types from one another, thereby extending the cluster analysis to include tumor type, we employed two methods: multinomial logistic regression (with the disease as the outcome variable and gene promoter hypermethylation as the predictor), followed by a CART approach that uses binary trees to examine classification. The decision tree obtained from the CART approach is depicted in Fig. 4A and the multilogit estimate obtained for each gene methylation from the multinomial logistic regression is depicted in Fig. 4B. The CART analysis shows that hypermethylation at specific loci is somewhat predictive of tumor type and that there is a general tendency for bladder cancer to be associated with less methylation than NSCLC. This analysis also suggests that hypermethylation of CDH1, SFRP2, CDH13, APC, SFRP4, LAMC2, and SFRP5 is the most predictive in the classification scheme. Figure 4B, which displays estimates of the multinomial logit effects βc, along with 95% CIs, also suggests locus-specific associations with tumor type. However, these results also show that there is not a simple prediction of the disease classification based on individual gene methylation profiles. In this model, for example, CDH1 seems to have a significant ability to differentiate bladder cancer from the three other tumor types (as the estimates and confidence limits calculated for HNSCC, NSCLC, and MPM do not overlap with the reference) but cannot distinguish significantly between these three tumor types (the estimates and CIs overlap). APC methylation, on the other hand, seems to significantly distinguish HNSCC from bladder as well as NSCLC and MPM, but cannot distinguish significantly between NSCLC and MPM, which would be considered the most clinically useful. For such specific classification, from this gene set, RARB seems to show the most significant ability to distinguish between NSCLC and MPM. The error rates of the CART and multinomial logistic models were calculated, employing a 10-fold cross-validation approach to determine their misclassification rates. The CART model had a misclassification rate of 0.34, and the multinomial logistic regression model had a misclassification rate of 0.28.

Figure 4. Prediction of disease classification by tumor suppressor gene hypermethylation. A, the nonparametric CART successive binary splits are depicted with each horizontal line representing hypermethylation status of a tumor suppressor gene, with 0 representing negative for hypermethylation and 1 positive for hypermethylation, with the predicted disease classification given at the bottom of the tree. B, estimates and 95% CIs obtained from the parametric polytomous regression approach are depicted on the Y-axis for individual genes (X-axis) in HNSCC (H), NSCLC (L), and MPM (M). Estimates are relative to bladder cancer, which serves as the reference (0; black line). The vertical lines at GSTP1 for HNSCC and at MLH1 for MPM represent multilogit estimates and 95% CIs that span beyond the scale of the graph. Misclassification rates were determined using 10-fold cross-validation and were 0.34 for the CART approach and 0.28 for the polytomous regression approach.
As we recognize from the correlations depicted in Fig. 3, use of the methylation index to examine the existence of CIMP in these diseases is statistically inappropriate. Previous reports of CIMP suggest that the distribution of the count of gene hypermethylation events takes on a bimodal distribution, thereby suggesting the existence of two specific classes of tumors, those that are CIMP positive and those that are CIMP negative. We used latent class models to examine how hypermethylation of these tumors clusters into a small number of classes or profiles. This approach is based on the assumption that K underlying classes exist and that the measured data, in this case promoter hypermethylation, can be used to define the probability of an individual tumor belonging to a specific class. We assumed from K = 2 to K = 6 classes and used the Akaike Information Criterion (35-37), a measure of lack-of-model-fit, to determine which of these models best fits the data. We found that for every assumed value of K, the model that assumed tumor-specific class prevalence had a lower Akaike Information Criterion, suggesting that different disease types have different methylation patterns. Additionally, with the addition of each class, the Akaike Information Criterion is reduced (e.g., a two-class model of methylation by disease had an Akaike Information Criterion of 11840, compared with a six-class model that had an Akaike Information Criterion of 11328), suggesting that the best model may in fact be the one with a large number of classes. Consequently, we used a latent trait method, the item response theory model, which, instead of assuming a small number of discrete classes, assumes a single underlying continuous latent trait, which we interpret as the propensity for hypermethylation, and examined the contribution of the promoter methylation of each gene on this underlying propensity.
This model produced an Akaike Information Criterion of 10710, suggesting this to be the best fitting model for these data. By assuming that the mean value of the latent trait depends on tumor type, different diseases are allowed to reflect different hypermethylation propensities. The results of this model are depicted in Table 2, and show that, similar to the initial models of methylation index, NSCLC and, to a lesser extent, HNSCC have a greater propensity for hypermethylation than does bladder cancer, whereas MPM seems to have a reduced propensity for hypermethylation. This model also suggests that CDH13, SFRP1, SFRP2, and SFRP5 have the greatest positive influence on the underlying propensity for hypermethylation whereas APC has little significant effect on the propensity for hypermethylation.
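The model comparison above rests on choosing the model with the smallest Akaike Information Criterion. A minimal sketch, using the standard definition AIC = 2p - 2 log L (p free parameters, maximized log-likelihood log L) and the three AIC values reported in the text, is shown below; the function names and model labels are illustrative, and the log-likelihoods and parameter counts behind the reported values are not given in the text.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off
    between fit and model complexity."""
    return 2 * n_params - 2 * log_likelihood

def best_by_aic(aic_by_model):
    """Return the label of the model with the smallest AIC."""
    return min(aic_by_model, key=aic_by_model.get)

# AIC values as reported in the text for the tumor-specific models
reported = {
    "latent class, K = 2": 11840,
    "latent class, K = 6": 11328,
    "item response theory": 10710,
}
```

Here `best_by_aic(reported)` selects the item response theory model, in line with the conclusion drawn in the text.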
Discussion
This study is unique in that it has taken advantage of four large case series of human cancer to examine the profile of promoter hypermethylation at a number of tumor suppressor genes and loci. Using a population-based approach, such as employed in the bladder and head and neck studies, provides results that are unbiased and therefore generalizable to the comparable population. In using this approach, however, some bias may remain, particularly related to the tumors available for examination. For example, the study of NSCLC tumors is generally limited to resected tumors (primarily stage I and II). Due to the information collected from these patients, it is possible to examine these sources of bias to ensure that the methylation profiles are truly reflective of the tumor and not biased by some characteristic of the tumor, such as size or stage. We have previously shown that our detection of promoter hypermethylation using methylation-specific PCR is not biased in its detection by tumor stage nor by the amount of tumor material present in the samples (13), and we are thus confident that the profiles observed are truly reflective of those in a population of tumor samples for these diseases.
In these four tumor types, there was a strikingly different prevalence, by disease, in the promoter hypermethylation of specific tumor suppressor genes. This was not unexpected as previous examinations have alluded to such a difference (38). Importantly, our work examined hypermethylation of 16 candidate genes in a consistent population of tumors, allowing us to more rigorously examine and test this hypothesis. A simple examination of the difference in the prevalence of methylation across diseases, presented in Table 1, shows a highly statistically significant difference. This suggests that there is a disease-specific selection of the candidate genes being silenced epigenetically, and that these events do not appear in a random fashion. For example, RARB was silenced in 63% of NSCLC and 30% of HNSCC, but only in 15% and 10% of bladder cancers and MPM, respectively. These data suggest that this pathway is particularly targeted for silencing in certain diseases, but not in others, or that the mode of inactivation (e.g., promoter hypermethylation versus gene deletion) occurs differentially by disease. The particular factors responsible for selection remain unclear; they may be the innate differences in the target tissue or the differences in the type, dose, or duration of carcinogen exposure at the different sites, selecting for differential gene inactivation. Carcinogen-specific selection of individual pathways for inactivation is consistent with our prior work (39) and with a recent work from Toyooka et al. (40) studying methylation and mutation in pathways in lung cancer. It should be remembered, of course, that there are some genes of which the silencing is common across tumor types. CDKN2A, encoding the P16INK4A protein, shows promoter hypermethylation relatively consistently across these diseases, suggesting that epigenetic inactivation of this pathway (the Rb cell cycle checkpoint pathway) is common across tumor types.
Although the profiles of promoter hypermethylation seem to be distinct across these tumor types, they are not definitive enough to correctly or consistently classify tumors. A variety of measures, including gene expression in the form of microarrays, protein expression, and now even miRNA expression profiles, have been suggested as methods for classifying tumors, although none of these techniques has, as yet, shown prospective clinical use, even with relatively large data sets. Promoter hypermethylation profiles have also been suggested as an alternative to provide this kind of classification. This approach has been touted as favorable because it is based on DNA (as compared with RNA or protein, which are more variable and unstable) and can be done even on limited samples with use of PCR amplification. To explore this, we used two approaches to classify tumors based on promoter methylation profiles: the CART approach and a multinomial logistic regression model. With cross-validation analyses to examine misclassification using these schemes, we saw poor performance of both techniques, with misclassification rates between 28% and 34%. Additional methods, such as neural network approaches, may provide better classification. However, we have attempted other model-free approaches, such as k-nearest neighbor (data not shown; ref. 32), and have observed no improvement in error rates over those of the classification approaches described. Although these results suggest that the multinomial logistic approach has less classification error, its results are more difficult to interpret. The CART approach, which has a slightly greater misclassification rate, has a relatively straightforward decision tree-like interpretation. With error rates of this magnitude, however, neither approach is readily applicable to the clinical setting, where such classification may be useful. Examination of additional loci may also aid in creating more sensitive, powerful, and specific profiles to allow for more efficient classification, particularly if very tissue-specific loci could be identified. Although these results suggest that general classification is error-prone, it may be more valuable to focus on particular classifications that are most clinically useful or that pose the greatest problems for diagnostic pathology, such as distinguishing between MPM and lung adenocarcinoma.
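To make the notion of a cross-validated misclassification rate concrete, the sketch below estimates one for a k-nearest neighbor classifier on binary methylation profiles. It is a minimal illustration under stated assumptions, not a reproduction of the CART, multinomial logistic, or k-nearest neighbor analyses in this study: the data are simulated from invented per-locus prevalences, Hamming distance is an assumed choice of metric, and all function names are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of loci at which two binary methylation profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, profile, k=3):
    """Majority label among the k training profiles nearest to `profile`."""
    nearest = sorted(train, key=lambda item: hamming(item[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_misclassification(data, folds=5, k=3, seed=0):
    """Fraction of samples misclassified under k-fold cross-validation."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    errors = 0
    for f in range(folds):
        # Every folds-th sample starting at f is held out; the rest train.
        test = shuffled[f::folds]
        train = [item for i, item in enumerate(shuffled) if i % folds != f]
        errors += sum(knn_predict(train, p, k) != label for p, label in test)
    return errors / len(shuffled)


def simulate(n_per_class, prevalences, seed=1):
    """Draw hypothetical binary profiles; prevalences are invented, not study data."""
    rng = random.Random(seed)
    data = []
    for label, probs in prevalences.items():
        for _ in range(n_per_class):
            data.append((tuple(int(rng.random() < p) for p in probs), label))
    return data


# Invented per-locus methylation prevalences for two hypothetical tumor types.
toy = simulate(60, {"type A": [0.6, 0.3, 0.5, 0.2], "type B": [0.2, 0.3, 0.5, 0.6]})
print(f"cross-validated misclassification: {cv_misclassification(toy):.2f}")
```

Because each sample is predicted only by a model that never saw it, the resulting rate estimates out-of-sample error rather than the fit to the training data.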
One goal of this study was to examine whether CIMP (2, 41), as has been described most conclusively in colorectal cancer, is a phenotype that may characterize other solid tumors. One oft-encountered problem with defining this phenotype is the lack of a specific definition by which to judge its presence or absence. Issa (10) has suggested that examination of the hypermethylation of the MINT loci, as well as specific gene promoters, such as CDKN2A and MLH1, can be used in this determination. Therefore, we included these loci and genes in our study design and examined the relationship of MINT hypermethylation with the extent of methylation, as measured by the methylation index, in these tumors. In bladder cancer, there is an identifiable group of tumors (see Fig. 2) that show a greater proportion of gene promoter hypermethylation, whereas at the same time there is a group showing a complete lack of hypermethylation. Using our quasi-likelihood approach, we have examined the association between MINT loci hypermethylation and overall tumor suppressor gene hypermethylation (as measured by methylation index) in these tumors. We observed a significant association between MINT loci hypermethylation and methylation index in bladder cancer, NSCLC, and HNSCC, suggesting that MINT loci hypermethylation may be an appropriate marker for a generally higher degree of gene hypermethylation in these tumors. In MPM, there was no significant association between MINT loci hypermethylation and methylation index, although assessing this relationship was limited by the small number and generally lower methylation index of these tumors compared with the other tumor types, as shown in Fig. 1. This would not suggest, though, that a form of methylator phenotype should be ruled out in this disease, as we observed a number of tumors with a large number of hypermethylated loci. It will require larger series of these tumors, similar to what we have used for the other diseases, to be able to better define this phenotype in this disease.
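The relationship examined above can be pictured with a short sketch that computes a methylation index per tumor and averages it by the number of methylated MINT loci. The definition used (fraction of examined gene promoters called methylated, excluding the MINT loci) is an assumption for illustration, the tumors are hypothetical, and the simple grouping of means stands in for, and is much cruder than, the quasi-likelihood approach used in the study.

```python
from statistics import mean


def methylation_index(gene_calls):
    """Fraction of examined gene promoters called methylated (1) in one tumor."""
    return sum(gene_calls) / len(gene_calls)


def mean_index_by_mint(tumors):
    """Mean methylation index grouped by the number of methylated MINT loci."""
    groups = {}
    for gene_calls, mint_calls in tumors:
        groups.setdefault(sum(mint_calls), []).append(methylation_index(gene_calls))
    return {n_mint: mean(values) for n_mint, values in sorted(groups.items())}


# Hypothetical tumors: (calls at 8 gene promoters, calls at 3 MINT loci).
tumors = [
    ((0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0)),
    ((1, 0, 0, 0, 0, 1, 0, 0), (0, 0, 0)),
    ((1, 1, 0, 0, 1, 0, 0, 0), (1, 0, 0)),
    ((1, 1, 1, 0, 1, 0, 1, 0), (1, 1, 0)),
    ((1, 1, 1, 1, 1, 0, 1, 1), (1, 1, 1)),
]
for n_mint, index in mean_index_by_mint(tumors).items():
    print(f"{n_mint} MINT loci methylated: mean methylation index {index:.3f}")
```

In these invented data the mean index rises with the number of methylated MINT loci, which is the pattern that a positive association between the two measures would produce.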
Although the data in bladder cancer, HNSCC, and NSCLC suggest that MINT loci hypermethylation may be a useful marker of a form of methylator phenotype, the relative similarity in the prevalence of CDKN2A hypermethylation and the general lack of MLH1 hypermethylation in these diseases further suggest that hypermethylation of these genes is not generally useful as a powerful delimiter of this phenotype, as has been suggested in colorectal cancer (10). The description of CIMP in colorectal cancer includes specific clinical phenotypes, such as a greater prevalence of a particular type of genomically unstable tumor and a particular tumor location, which are not observed in most solid tumors. Although the phenotype that we observed may have similarities to that observed in colon cancer, it is also clear that it has fundamental differences. This is not surprising given that the pathways for inactivation differ across tumors, the nature and type, duration, and intensity of carcinogen exposure differ, and underlying tissue susceptibility may also vary. Hence, whereas there may be differences in the proteins and genes responsible for epigenetic silencing, thereby giving rise to what has been described as CIMP, it remains possible that the phenotype is a consequence of differences in clonal growth, carcinogen exposure, and tumor metabolism, rather than an underlying or induced difference in the propensity for methylation silencing. This may also explain the differences in the form of the phenotype between diseases. It will also be of interest to examine, in these diseases and in others where CIMP has been reported, whether there are particular clinical correlates of a higher degree of methylation, such as those observed in colorectal cancer.
In our examination of the relationship of methylation index to MINT loci hypermethylation status, we also observed highly significant differences in the distribution of methylation across the tumor types. Our initial examination of methylation index suggests that NSCLC, for example, is characterized by a much larger methylation index than the other tumor types, a result consistent with previous reports comparing NSCLC and MPM (23). At the same time, we also noted the striking correlation between methylation of different genes, as well as the varied prevalence of promoter methylation at these loci, suggesting that simple counts of the number of genes methylated are not a statistically rigorous method to examine these differences. To overcome these challenges, we used novel statistical approaches that can more appropriately model these methylation data and can address the outstanding issue surrounding the presence of CIMP in these tumors and the propensity for methylation observed in some of these cancers.
The results of this modeling suggested that methylation status at these gene promoters could not be most appropriately modeled as distinct classes or clusters, as would be expected if a true methylator phenotype existed. Instead, use of a latent trait method produces the most parsimonious model and suggests that there is an underlying propensity for hypermethylation, which may be driving the distribution of methylation across these diseases. Although beyond the scope of this examination, use of item response theory modeling can also allow for an exploration of the factors, such as demographics and exposures, that may be associated with this methylation propensity within each of these diseases, and can also be used to model clinical outcomes of the disease, most importantly patient survival. Thus, we believe that this type of modeling is critical to a better understanding of the biology driving promoter hypermethylation and may be able to provide clues to what specific factors are driving these epigenetic alterations in human cancer.
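For readers less familiar with the distinction drawn above, the two model families can be written in a generic textbook form. This is a sketch under assumed notation and is not necessarily the parameterization fitted in this study: $Y_{ij}$ denotes the methylation call (1 = methylated) for tumor $i$ at locus $j$, $C_i$ a discrete class label, $\theta_i$ a continuous methylation propensity, $\beta_j$ a locus-specific threshold, $\hat{L}$ the maximized likelihood, and $p$ the number of free parameters.

```latex
% Latent class model: tumors fall into K discrete classes,
% each with its own methylation probability at every locus.
\Pr(Y_{ij} = 1 \mid C_i = k) = \pi_{jk}, \qquad k = 1, \dots, K

% Latent trait (Rasch-type item response) model: each tumor has a
% continuous propensity, and each locus has its own threshold.
\operatorname{logit} \Pr(Y_{ij} = 1 \mid \theta_i) = \theta_i - \beta_j

% Competing models can be compared by Akaike's information criterion (36);
% smaller values indicate a more parsimonious fit.
\mathrm{AIC} = -2 \log \hat{L} + 2p
```

A clearly separated methylator phenotype would favor the first form with a small number of classes, whereas a graded propensity for hypermethylation favors the second.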
The difference in the propensity for methylation and the underlying biology driving this propensity may reflect an overall predisposition of specific tumors, or target tissues, to epigenetically silence tumor suppressor genes instead of undergoing genetic alterations. Indeed, to produce a malignant phenotype in the different tissues, distinct pathways likely need to be inactivated. The difference between HNSCC and NSCLC is particularly surprising given the similar etiology of these diseases (both are highly associated with tobacco-smoke exposure), yet we see vastly different methylation indices and profiles in these diseases. This is consistent with our data relating exposures to specific gene hypermethylation; for example, CDKN2A is silenced at a similar prevalence in both NSCLC and HNSCC, yet shows a significant association with duration of tobacco smoking in NSCLC (42) but not in HNSCC (43). Therefore, the susceptibility of different tissues to exposure-related epigenetic alteration may be quite different and may depend on many factors, including the target cancer stem cell, the particular nature of the growth pathways active in different tissues, the type, duration, and intensity of carcinogen exposure, and additional underlying genetic or epigenetic susceptibility to either genotoxic insult or epigenetic inactivation.
It should be noted that our observation of these differences in the underlying propensity for hypermethylation across diseases may be biased by the set of gene promoters that we have examined herein. Using the same statistical approach, examination of a novel set of gene promoters within this population of tumors would help to clarify whether methylation propensity truly varies across tissue types. Additionally, replication of this work in an independent population of tumors would clarify whether these results are generalizable to the population.
Our data represent a large body of information on the molecular and, particularly, the epigenetic profiles of four exposure-related tumor types. Although the profiles of promoter hypermethylation are distinct across these diseases, alone they are not sufficient for clinically useful classification of tumors. At the same time, our data strongly suggest that promoter hypermethylation of specific genes does not occur independently or randomly, but is instead the product of a specific and complex selection process driven, at least in part, by carcinogen exposures, innate susceptibility, and tissue-specific processes. Further work is needed to better characterize the etiology of this methylation phenotype as well as to determine if this phenotype has important prognostic or clinical use. Understanding the profiles of these epigenetic alterations may be critical for understanding the nature of the carcinogenic process.
Acknowledgments
Grant support: NIH grants P42 ES 05947, P42 ES 007373, R01 CA 100679, R01 CA 78609, and T32 ES 007155, and the International Mesothelioma Program.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 5/9/06.
Revised 8/29/06.
Accepted 9/11/06.
References
- Feinberg AP, Ohlsson R, Henikoff S. The epigenetic progenitor origin of human cancer. Nat Rev Genet 2006;7:21-33.
- Toyota M, Ahuja N, Ohe-Toyota M, Herman JG, Baylin SB, Issa JP. CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A 1999;96:8681-6.
- Li Q, Jedlicka A, Ahuja N, et al. Concordant methylation of the ER and N33 genes in glioblastoma multiforme. Oncogene 1998;16:3197-202.
- Abe M, Ohira M, Kaneda A, et al. CpG island methylator phenotype is a strong determinant of poor prognosis in neuroblastomas. Cancer Res 2005;65:828-34.
- Toyota M, Ahuja N, Suzuki H, et al. Aberrant methylation in gastric cancer associated with the CpG island methylator phenotype. Cancer Res 1999;59:5438-42.
- Shen L, Ahuja N, Shen Y, et al. DNA methylation and environmental exposures in human hepatocellular carcinoma. J Natl Cancer Inst 2002;94:755-61.
- Roman-Gomez J, Jimenez-Velasco A, Agirre X, Prosper F, Heiniger A, Torres A. Lack of CpG island methylator phenotype defines a clinical subtype of T-cell acute lymphoblastic leukemia associated with good prognosis. J Clin Oncol 2005;23:7043-9.
- Chan AO, Kim SG, Bedeir A, Issa JP, Hamilton SR, Rashid A. CpG island methylation in carcinoid and pancreatic endocrine tumors. Oncogene 2003;22:924-34.
- Strathdee G, Appleton K, Illand M, et al. Primary ovarian carcinomas display multiple methylator phenotypes involving known tumor suppressor genes. Am J Pathol 2001;158:1121-7.
- Issa JP. CpG island methylator phenotype in cancer. Nat Rev Cancer 2004;4:988-93.
- Anacleto C, Leopoldino AM, Rossi B, et al. Colorectal cancer "methylator phenotype": fact or artifact? Neoplasia 2005;7:331-5.
- Nelson HH, Christiani DC, Mark EJ, Wiencke JK, Wain JC, Kelsey KT. Implications and prognostic value of K-ras mutation for early-stage lung cancer in women. J Natl Cancer Inst 1999;91:2032-8.
- Marsit CJ, Karagas MR, Andrew A, et al. Epigenetic inactivation of SFRP genes and TP53 alteration act jointly as markers of invasive bladder cancer. Cancer Res 2005;65:7081-5.
- Kraunz KS, Nelson HH, Lemos M, Godleski JJ, Wiencke JK, Kelsey KT. Homozygous deletion of p16(INK4a) and tobacco carcinogen exposure in nonsmall cell lung cancer. Int J Cancer 2006;118:1364-9.
- Hirao T, Bueno R, Chen CJ, Gordon GJ, Heilig E, Kelsey KT. Alterations of the p16(INK4) locus in human malignant mesothelial tumors. Carcinogenesis 2002;23:1127-30.
- Herman JG, Graff JR, Myohanen S, Nelkin BD, Baylin SB. Methylation-specific PCR: a novel PCR assay for methylation status of CpG islands. Proc Natl Acad Sci U S A 1996;93:9821-6.
- van Engeland M, Roemen GM, Brink M, et al. K-ras mutations and RASSF1A promoter methylation in colorectal cancer. Oncogene 2002;21:3792-5.
- Virmani AK, Rathi A, Sathyanarayana UG, et al. Aberrant methylation of the adenomatous polyposis coli (APC) gene promoter 1A in breast and lung carcinomas. Clin Cancer Res 2001;7:1998-2004.
- Virmani A, Rathi A, Sugio K, et al. Aberrant methylation of TMS1 in small cell, non small cell lung cancer and breast cancer. Int J Cancer 2003;106:198-204.
- Sathyanarayana UG, Maruyama R, Padar A, et al. Molecular detection of noninvasive and invasive bladder tumor tissues and exfoliated cells by aberrant promoter methylation of laminin-5 encoding genes. Cancer Res 2004;64:1425-30.
- Suzuki H, Watkins DN, Jair KW, et al. Epigenetic inactivation of SFRP genes allows constitutive WNT signaling in colorectal cancer. Nat Genet 2004;36:417-22.
- Zochbauer-Muller S, Fong KM, Virmani AK, Geradts J, Gazdar AF, Minna JD. Aberrant promoter methylation of multiple genes in non-small cell lung cancers. Cancer Res 2001;61:249-55.
- Toyooka S, Pass HI, Shivapurkar N, et al. Aberrant methylation and simian virus 40 tag sequences in malignant mesothelioma. Cancer Res 2001;61:5727-30.
- Chan AO, Broaddus RR, Houlihan PS, Issa JP, Hamilton SR, Rashid A. CpG island methylation in aberrant crypt foci of the colorectum. Am J Pathol 2002;160:1823-30.
- Lee S, Kim WH, Jung HY, Yang MH, Kang GH. Aberrant CpG island methylation of multiple genes in intrahepatic cholangiocarcinoma. Am J Pathol 2002;161:1015-22.
- Eads CA, Danenberg KD, Kawakami K, et al. MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res 2000;28:E32.
- Sathyanarayana UG, Moore AY, Li L, et al. Sun exposure related methylation in malignant and non-malignant skin lesions. Cancer Lett. Epub 2006 Feb 20.
- R Development Core Team. R: A language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing; 2005.
- Bartholomew DJ. Latent variable models and factor analysis. London: Charles Griffin & Co. Ltd.; 1987. p. x, 193 p.
- Bandeen-Roche K, Miglioretti DL, Zeger SL, Rathouz PJ. Latent variable regression for multiple discrete outcomes. J Am Stat Assoc 1997;92:1375-86.
- Lindsay B, Clogg CC, Grego J. Semiparametric estimation in the Rasch model and related exponential response models, including simple latent class model for item analysis. J Am Stat Assoc 1991;86:96-107.
- Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. New York: Springer-Verlag; 2001. p. 214-7.
- Ripley B. tree: classification and regression trees. R package version 1.0-19. edition 2005.
- Agresti A. 7.1 Nominal responses: baseline-category logit models. In: Categorical data analysis. 2nd ed. Hoboken (NJ): John Wiley & Sons; 2002. p. 267-72.
- Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. New York: Springer; 2001. p. xvi, 533 p.
- Houseman EA, Coull BA, Betensky RA. Feature-specific penalized latent class analysis for genomic data. Biometrics. In press 2006.
- Akaike H. A new look at the statistical model identification. IEEE Trans Automat Contr AC 1974;18:716-23.
- Esteller M, Corn PG, Baylin SB, Herman JG. A gene hypermethylation profile of human cancer. Cancer Res 2001;61:3225-9.
- Marsit CJ, Karagas MR, Danaee H, et al. Carcinogen exposure and gene promoter hypermethylation in bladder cancer. Carcinogenesis 2006;27:112-6.
- Toyooka S, Tokumo M, Shigematsu H, et al. Mutational and epigenetic evidence for independent pathways for lung adenocarcinomas arising in smokers and never smokers. Cancer Res 2006;66:1371-5.
- Toyota M, Issa JP. CpG island methylator phenotypes in aging and cancer. Semin Cancer Biol 1999;9:349-57.
- Kim DH, Nelson HH, Wiencke JK, et al. p16(INK4a) and histology-specific methylation of CpG islands by exposure to tobacco smoke in non-small cell lung cancer. Cancer Res 2001;61:3419-24.
- Kraunz KS, Hsiung D, McClean MD, et al. Dietary folate is associated with p16ink4a methylation in head and neck squamous cell carcinoma. Int J Cancer 2006;119:1553-7.