The Polycomb Group (PcG) protein EZH2 is a critical component of a multiprotein complex that methylates Lys27 of histone 3 (H3K27), which consequently leads to the repression of target gene expression. We have previously reported that EZH2 is overexpressed in metastatic prostate cancer and is a marker of aggressive disease in clinically localized solid tumors. However, the global set of genes directly regulated by PcG in tumors is largely unknown, and thus how PcG mediates tumor progression remains unclear. Herein we mapped genome-wide H3K27 methylation in aggressive, disseminated human prostate cancer tissues. Integrative analysis revealed that a significant subset of the H3K27-methylated genes are also targets of PcG in embryonic stem cells, and their repression in tumors is associated with poor prognosis. By stepwise cross-validation, we developed a “Polycomb repression signature” composed of 14 direct targets of PcG in metastatic tumors. Notably, solid tumor subtypes in which this gene signature is repressed show poor clinical outcome in multiple microarray data sets of tumors including breast and prostate cancer. Taken together, our results show a fingerprint of PcG-mediated transcriptional repression in metastatic prostate cancer that is reminiscent of stem cells and associated with cancer progression. Therefore, PcG proteins play a central role in the epigenetic silencing of target genes and functionally link stem cells, metastasis, and cancer survival. [Cancer Res 2007;67(22):10657–63]
- genome-wide location analysis
- Polycomb regulation
- cancer outcome prediction
- molecular signature
- stem cell
Polycomb group (PcG) proteins are transcriptional repressors with important roles in preserving cellular identity. The PcG proteins EZH2 (enhancer of zeste 2), SUZ12 (suppressor of zeste 12), and EED (embryonic ectoderm development) form the Polycomb Repressive Complex 2 (PRC2) and specifically trimethylate H3K27 on target gene promoters (1). This histone mark is part of a preprogrammed cellular memory system that is heritable through mitotic cell divisions and thus preserves cellular identity. PcG proteins have recently been implicated in the maintenance of stem cells (2). Genome-wide location analysis revealed that Polycomb represses a special set of developmental regulators and signaling molecules, thus maintaining the pluripotency of human (3) and murine (4) embryonic stem cells. Dysregulation of PcG proteins may lead to maldifferentiation, a hallmark of cancer.
Prostate cancer is a leading cause of cancer-related death in American men. New prognostic biomarkers are required to improve risk assessment and individualize medicine for cancer patients because a high percentage of patients experience disease recurrence (5). Histone modification patterns have been found to predict risk of prostate cancer recurrence (6), indicating a role for epigenetic mechanisms in cancer progression. Concordantly, EZH2 and SUZ12 are frequently overexpressed in aggressive tumors, including prostate and breast cancers (7). In addition, previous studies have suggested that stem cell Polycomb target genes are predisposed to DNA hypermethylation, and thus repression, in cancer (8), linking Polycomb-mediated transcriptional repression to cancer. PcG target genes in human tumors, however, remain largely undiscovered, and consequently their roles in cancer development are unknown. In this study, we mapped PRC2 target genes in aggressive prostate tumors and investigated their association with cancer outcome.
Cells and human tissues. LNCaP and PC3 cells were cultured in RPMI supplemented with 10% fetal bovine serum (Invitrogen). Prostate cancer tissues were collected from the Rapid Autopsy Program, University of Michigan Prostate Cancer Specialized Program of Research Excellence Tissue Core, with informed consent of the patients and prior institutional review board approval.
Chromatin immunoprecipitation and genome-wide location analysis. Chromatin immunoprecipitation (ChIP) on chip was done using the Agilent proximal promoter arrays according to the manufacturer's protocols (Supplementary Methods). Antibodies (5 μg) used for ChIP include monoclonal anti-EZH2 (BD), polyclonal anti-SUZ12 (Upstate), and polyclonal anti-H3K27me3 (Upstate) antibodies.
Data sets. All expression microarray data sets were collected from Oncomine (9) and contained solely primary tumors (breast or prostate cancer), except for the Yu et al. data set (10), which also contained benign, adjacent-to-cancer, and metastatic tissues. To evaluate the prognostic power of the gene signature, only the primary tumor samples were used.
Statistical analysis. To compare primary with metastatic prostate cancer, differential gene expression analysis was done using the Cyber-T statistic. The enrichment of PRC2-occupied genes in our prostate cancer profiling data was assessed by Gene Set Enrichment Analysis (GSEA). Hierarchical clustering, k-nearest neighbor classification, and Kaplan-Meier survival analysis were used for the clustering of training data, prediction of validation data, and analysis of clinical outcome, respectively. The end point for Kaplan-Meier survival analysis was 10-year recurrence-free survival unless the data set provided only overall survival information. All analyses were completed in R, except for Kaplan-Meier survival plots, which were generated in SPSS 11.5 (SPSS, Inc.). A detailed description is given in Supplementary Methods.
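As an illustration of the Kaplan-Meier end point described above, the following is a minimal sketch of the product-limit estimator with follow-up truncated at a fixed horizon. It is not the code used in this study (the survival plots were generated in SPSS); the function names `censor_at` and `kaplan_meier`, the toy follow-up times, and the 24-month horizon are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def censor_at(times, events, horizon):
    """Truncate follow-up at `horizon`: later observations become censored."""
    new_times = [min(t, horizon) for t in times]
    new_events = [e if t <= horizon else 0 for t, e in zip(times, events)]
    return new_times, new_events


def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Made-up follow-up data (months); not taken from any data set in this study.
toy_times = [6, 12, 12, 20, 30, 45]
toy_events = [1, 1, 0, 1, 0, 0]
times, events = censor_at(toy_times, toy_events, horizon=24)
curve = kaplan_meier(times, events)
```

For the 10-year end point used here, the horizon would be 120 months; comparing two such curves (for example, by a log-rank test) is a separate step that this sketch does not cover.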
Genome-wide mapping of H3K27me3 in metastatic prostate cancer tissues. To investigate the mechanism by which PcG proteins regulate cancer progression, we mapped genomic sites occupied by PRC2 in late-stage, aggressive prostate cancer tissues by combining ChIP with promoter arrays (Fig. 1). By integrating genome-wide location data with cancer expression profiling data, we identified genes directly repressed by PRC2 in cancer, termed the “Polycomb repression signature.” As Polycomb contributes to maintaining the undifferentiated state of stem cells, we hypothesized that its target genes may be important for cancer progression. We thus investigated their association with cancer outcome by Kaplan-Meier survival analysis of multiple cancer microarray data sets, as illustrated in Fig. 1.
We carried out genome-wide location analysis of PRC2 and H3K27me3 in the LNCaP human prostate cancer cell line as well as in three metastatic prostate cancer tissues (two metastatic to the liver and one to the lung) from independent patients. As described in Supplementary Discussion and Supplementary Fig. S1, we observed strong overlap between replicate H3K27me3 ChIP-on-chip experiments as well as between SUZ12 and H3K27me3 ChIP-on-chip in both the in vitro cell line model and in vivo human tumors. In addition, metastatic prostate cancer tissues from different patients and metastatic sites share a common set of H3K27me3-marked genes.
H3K27me3-marked genes link metastatic prostate cancer to stem cells. To provide functional relevance for the cancer H3K27me3-occupied gene sets, we compared them to molecular correlates in the Oncomine Molecular Concepts Map (MCM; ref. 9), a resource containing ∼15,000 molecular concepts or biologically related gene sets, for enrichment by disproportionate overlap using Fisher's exact test. MCM analysis of 1,165 H3K27me3-occupied genes with >5-fold enrichment in metastatic prostate cancer tissue (to the liver) revealed an intriguing enrichment network (Fig. 2A and Supplementary Table S1). The most enriched gene expression concepts (P < 1.0 × 10⁻⁷) are “genes down-regulated in prostate, breast, and lung cancers.” This observation is consistent with the expected repression of PcG target genes, as the transcriptional repressor EZH2 is up-regulated in cancer. In addition, a significant (P < 1.1 × 10⁻⁵) portion of our gene set is located at chromosomes 1q, 17q, 19q, and 20q, all of which have previously been associated with prostate cancer (11).
Interestingly, the most enriched literature concepts (P = 1.2 × 10⁻¹⁰⁰) are “H3K27me3-, SUZ12-, or EED-occupied in embryonic stem cells or embryonic fibroblasts,” revealing a novel link of Polycomb cancer targets to those in stem cells. Importantly, the enrichment of our concept among the embryonic stem concepts is comparable to that between embryonic stem concepts. For example, the “H3K27me3-occupied in embryonic stem cell” gene set is enriched by our gene set with an odds ratio (OR) of 5.69 and P = 1.2 × 10⁻¹⁰⁰ and by the “H3K27me3-occupied in embryonic fibroblasts” concept with a comparable OR of 10.6 and P = 1.1 × 10⁻¹⁰⁰. In addition, the most enriched Gene Ontology concepts (P < 1.6 × 10⁻⁷) include developmental regulators, homeobox proteins, and transcription factors, consistent with previous reports of PcG target genes in embryonic stem cells (3). As Polycomb-mediated repression is known to control stem cell pluripotency and differentiation (3), we hypothesized that it may be critical for cancer progression. Notably, MCM analysis showed significant links (P < 1.0 × 10⁻⁷) to gene sets down-regulated in recurrent prostate and breast cancers.
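The enrichment statistics quoted above (an odds ratio and a P value for the overlap between two gene sets) can be illustrated with a minimal sketch of a one-sided Fisher's exact test. This is not the Oncomine MCM implementation: the function name `overlap_enrichment`, the choice of a one-sided (over-representation) test, and the toy counts are assumptions made for illustration only.

```python
from math import comb


def overlap_enrichment(overlap, size_a, size_b, universe):
    """One-sided Fisher's exact test for over-representation.

    overlap  -- genes present in both sets
    size_a   -- genes in set A (e.g., a tumor H3K27me3-occupied set)
    size_b   -- genes in set B (e.g., a stem cell concept)
    universe -- all genes that could have been selected
    Returns (odds_ratio, p_value).
    """
    # 2 x 2 contingency table
    a = overlap                                   # in A and in B
    b = size_a - overlap                          # in A only
    c = size_b - overlap                          # in B only
    d = universe - size_a - size_b + overlap      # in neither
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")

    # Hypergeometric upper tail: P(overlap >= observed)
    total = comb(universe, size_a)
    tail = sum(
        comb(size_b, k) * comb(universe - size_b, size_a - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return odds_ratio, tail / total


# Toy counts only; these are not the gene-set sizes from this study.
odds_ratio, p_value = overlap_enrichment(overlap=4, size_a=5, size_b=6, universe=20)
```

Exact integer arithmetic is used for the binomial coefficients, so the sketch remains usable for very small P values, although it is not optimized for genome-scale universes.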
We thus sought to develop a Polycomb repression signature in tumors. We selected a common set of 336 H3K27me3-occupied genes from the two prostate tumors metastatic to the liver. Gene Set Enrichment Analysis of these genes in a microarray profiling data set of five benign, six clinically localized, and five metastatic prostate cancers (12) indicated a significant enrichment with down-regulated expression in metastasis (P = 0.01; false discovery rate, 0.009). A set of 87 PcG-occupied genes with the strongest repression during metastasis (P < 0.05 by Cyber-T statistic) was selected and defined as the Polycomb repression signature in metastatic prostate cancer (Fig. 2B and Supplementary Table S2). MCM analysis of these 87 genes preserved the significant links identified above (Supplementary Fig. S2).
The Polycomb repression signature predicts survival of cancer patients. To evaluate the predictive value of the Polycomb repression signature, we examined an independent prostate cancer data set (13). Primary prostate cancer samples (n = 61) were classified into two prognostic groups based on the expression patterns of signature genes. Kaplan-Meier analysis revealed that the resulting two clusters differed significantly in clinical outcome [P = 0.03; hazard ratio, 2.6; 95% confidence interval (95% CI), 1.04–6.54; Fig. 2C]. We thus defined the cluster with favorable outcome as the “low-risk” group and the other as the “high-risk” group. To validate our signature, we used k-nearest neighbor classification (k = 5) to assign samples in an independent prostate cancer data set (14) to either the high-risk or the low-risk group. Interestingly, the resulting two groups showed a significant difference in clinical outcome (P = 0.0008; hazard ratio, 3.09; 95% CI, 1.54–6.18; Fig. 2C).
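The validation step above, in which new samples are assigned to the high-risk or low-risk group by k-nearest neighbor classification (k = 5), can be sketched as follows. The Euclidean distance metric, the function name `knn_predict`, and the two-gene toy expression profiles are assumptions for illustration; the actual distance measure and preprocessing are described in Supplementary Methods and are not reproduced here.

```python
from collections import Counter
from math import dist


def knn_predict(train_profiles, train_labels, sample, k=5):
    """Assign `sample` the majority label among its k nearest training profiles."""
    ranked = sorted(
        zip(train_profiles, train_labels),
        key=lambda pair: dist(pair[0], sample),   # Euclidean distance (assumed)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up, normalized expression of two signature genes per training tumor.
# Low expression (repressed signature) is labeled high-risk in this toy set.
train_profiles = [
    (-1.2, -0.8), (-1.0, -1.1), (-0.9, -0.7),
    (0.8, 1.0), (1.1, 0.9), (0.9, 1.2),
]
train_labels = ["high-risk"] * 3 + ["low-risk"] * 3

repressed_call = knn_predict(train_profiles, train_labels, (-1.0, -0.9))
expressed_call = knn_predict(train_profiles, train_labels, (1.0, 1.0))
```

With two classes, an odd k such as 5 avoids tied votes, which may be one practical reason for that choice.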
Because MCM analysis also linked PcG-occupied genes to breast cancer survival (Fig. 2A; P = 9.7 × 10⁻⁸), we evaluated the prognostic value of the Polycomb repression signature in breast cancer. An approach analogous to the prostate cancer outcome analysis above was taken to cluster samples of the Wang et al. estrogen receptor–positive breast cancer training data set (15) and to predict samples of the Pawitan et al. and the Miller et al. validation data sets (16, 17). Importantly, the low-risk and high-risk groups from both the training and the validation data sets showed a significant difference in patient relapse (Fig. 2D).
Because molecular classifiers composed of a small number of genes are especially useful in clinical practice, we attempted to refine our Polycomb repression signature. We adopted a strategy of cross-validation with stepwise decrement of the number of genes used for classification, and identified a 14-gene signature (Supplementary Table S3) that minimized the cross-validation errors in the Wang et al. breast cancer data set (15). Importantly, as described in the Supplementary Discussion and Supplementary Fig. S3, this 14-gene signature is able to predict patient survival with high significance in six breast and two prostate cancer data sets, and with marginal significance in several glioma and lung adenocarcinoma data sets. A comparison with previously reported molecular classifiers found that our signature overall outperformed the others (Supplementary Discussion and Supplementary Table S4).
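The general idea of cross-validation with a stepwise decrement of gene number can be sketched as below. This is a simplified illustration, not the procedure used in this study: the nearest-centroid classifier, the leave-one-out scheme, the ranking of genes by difference in class means, and all function names and toy values are assumptions. In particular, genes are ranked once on the full data here, whereas a rigorous analysis would repeat the ranking inside each cross-validation fold.

```python
from math import dist
from statistics import mean


def loocv_error(profiles, labels, genes):
    """Leave-one-out error of a nearest-centroid classifier using only `genes`."""
    classes = sorted(set(labels))
    errors = 0
    for i, (held_out, truth) in enumerate(zip(profiles, labels)):
        rest = [(p, l) for j, (p, l) in enumerate(zip(profiles, labels)) if j != i]
        centroids = {
            cls: [mean(p[g] for p, l in rest if l == cls) for g in genes]
            for cls in classes
        }
        point = [held_out[g] for g in genes]
        predicted = min(classes, key=lambda cls: dist(centroids[cls], point))
        errors += predicted != truth
    return errors / len(profiles)


def gene_score(profiles, labels, gene):
    """Absolute difference in mean expression between the two classes."""
    first, second = sorted(set(labels))
    m1 = mean(p[gene] for p, l in zip(profiles, labels) if l == first)
    m2 = mean(p[gene] for p, l in zip(profiles, labels) if l == second)
    return abs(m1 - m2)


def stepwise_decrement(profiles, labels):
    """Drop the weakest gene at each step; keep the subset with the lowest error."""
    genes = sorted(
        range(len(profiles[0])),
        key=lambda g: gene_score(profiles, labels, g),
        reverse=True,
    )
    best_genes, best_error = list(genes), loocv_error(profiles, labels, genes)
    while len(genes) > 1:
        genes = genes[:-1]                      # remove the lowest-ranked gene
        error = loocv_error(profiles, labels, genes)
        if error <= best_error:                 # ties favor the smaller gene set
            best_genes, best_error = list(genes), error
    return best_genes, best_error


# Made-up data: gene 0 is informative, gene 1 is weak, gene 2 is noise.
profiles = [
    (-1.0, -0.2, 3.0), (-1.2, 0.1, -3.0), (-0.8, -0.1, 0.0),
    (1.0, 0.2, -3.0), (0.9, 0.0, 3.0), (1.1, 0.1, 0.0),
]
labels = ["high-risk"] * 3 + ["low-risk"] * 3
best_genes, best_error = stepwise_decrement(profiles, labels)
```

In this toy example the noisy gene degrades classification, so removing it lowers the cross-validation error, which mirrors the motivation for trimming a larger signature to a compact one.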
Interestingly, multivariate Cox proportional hazards regression analysis of our signature in the independent van de Vijver et al. breast cancer data set (18) revealed a significant association with both relapse-free survival (P = 0.007; hazard ratio, 1.93; 95% CI, 1.20–3.11) and overall survival (P = 0.002; hazard ratio, 3.15; 95% CI, 1.54–6.43). This association is independent of established clinical and pathologic variables, such as tumor grade and node status, and is of greater significance than those variables (Table 1). Therefore, our signature provides additional prognostic information beyond standard clinical and pathologic variables.
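The relationship between a hazard ratio, its 95% CI, and its P value can be illustrated with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, exp(β ± 1.96 × SE), which is a common convention for Cox models but is not stated in the text; the function name `wald_p_from_ci` is hypothetical. Only the hazard ratios and confidence limits reported above are used as inputs.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.96):
    """Recover the standard error, z statistic, and two-sided P value.

    Assumes the CI was computed as exp(log(HR) +/- z_crit * SE).
    """
    se = (log(upper) - log(lower)) / (2 * z_crit)   # SE of the log hazard ratio
    z = log(hazard_ratio) / se
    p = erfc(abs(z) / sqrt(2))                      # two-sided normal tail area
    return se, z, p


# Inputs are the values reported for the van de Vijver et al. data set.
relapse_se, relapse_z, relapse_p = wald_p_from_ci(1.93, 1.20, 3.11)
overall_se, overall_z, overall_p = wald_p_from_ci(3.15, 1.54, 6.43)
```

Under this assumption the recovered P values come out near 0.007 and 0.002, in line with the reported figures, which suggests that the published intervals and P values are mutually consistent.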
Polycomb repression signature genes are epigenetically repressed in aggressive tumors. We next sought to confirm, at the individual gene level, the epigenetic repression of our signature genes in aggressive tumors. Genome-wide location analysis of H3K27me3 showed high enrichment ratios for the promoters of six randomly selected genes (Fig. 3A and Supplementary Fig. S4). By ChIP-PCR, we confirmed that these gene promoters contain the H3K27me3 mark in three metastatic prostate cancer tissues. Interestingly, no apparent enrichment of H3K27me3 was observed in localized prostate cancer for five of the six genes, whereas a positive control gene, KCNA1 (1), was enriched in all cancer samples tested (Fig. 3B). In the PC3 prostate cancer cell line, we confirmed that EZH2, SUZ12, and H3K27me3 co-occupy the promoters of all six genes. Importantly, quantitative reverse transcription-PCR (RT-PCR) analysis of three benign, five localized, and seven metastatic prostate cancer tissues showed marked repression of these genes in metastatic samples (Fig. 3C). For example, WNT2, CXCL12, and KRT17 are >100-fold down-regulated.
Prostate cancer is, in general, a slowly progressing cancer that varies greatly in clinical outcome, depending on the aggressiveness of the individual tumor. Currently, the most important clinical prognostic indicators of disease outcome are pretherapy prostate-specific antigen and Gleason score. Nevertheless, many patients experience disease recurrence (5), and thus additional prognostic biomarkers are needed to provide better risk assessment and therapy selection. PRC2 complex proteins are histone methyltransferases that are frequently up-regulated in aggressive tumors, but their downstream targets and underlying mechanisms remain largely unknown. Although genome-wide location analyses of PRC2 have been done in cell lines, a similar study has not been carried out directly in human tissues. Several lines of evidence support the success of our ChIP-on-chip analysis of PRC2 in prostate cancer. Similar to previous reports (19), we observed a highly significant overlap between PRC2- and H3K27me3-occupied genes. A large number of these genes are down-regulated in cancer and associated with poor patient survival, consistent with EZH2 up-regulation in aggressive cancer. In addition, we have confirmed the epigenetic silencing of a randomly selected subset in cancer. Cancer PRC2-occupied genes largely overlap with stem cell PRC2 targets and include previously identified functional categories such as transcription factors, developmental regulators, and genes involved in receptor activities (3, 19), suggesting a Polycomb-mediated transcriptional fingerprint in cancer.
The overrepresentation of cancer PRC2-occupied genes in stem cell concepts may have important implications for the recent model of cancer stem cells. In embryonic stem cells, PRC2-mediated epigenetic silencing maintains the pluripotent stem cell identity (3, 4). An exciting theme is emerging that PcG target genes are predisposed to DNA hypermethylation in cancer (8) and that these epigenetic changes may convey heritable gene expression patterns critical for neoplastic initiation and cancer progression (20). Our results, in addition, show a Polycomb-mediated epigenetic program in metastatic cancer cells that is associated with cancer outcome. Because PcG proteins are often expressed at a very low level in differentiated cell types (3), it is unlikely that the stem cell–like chromatin structure preexists in normal adult tissues. Aggressive tumor cells may have acquired this signature during cancer progression, either through dedifferentiation of mature cells or by mutation of adult stem cells (Supplementary Fig. S5).
Our Polycomb repression signature predicts clinical outcome of multiple solid tumors and is different from currently used tumor biomarkers. Unlike prostate-specific antigen for prostate cancer, which is the product of differentiated tumor cells, Polycomb repression signature genes are likely products of EZH2-expressing cancer stem cells or their immediate descendants. These genes encode biomarkers that may capture early abnormality in stem cell–initiated diseases and lead to cancer detection at earlier stages of carcinogenesis. Owing to its low gene number and strong association with cancer outcome in multiple patient cohorts, our 14-gene signature is well suited for development into prognostic assays for clinical use. In addition, the signature genes may be important tumor suppressor genes and may facilitate the understanding of PcG-mediated tumorigenesis. Taken together, PcG proteins play a unifying role in regulating stem cells, metastasis, and cancer survival by epigenetic silencing of key target genes.
Grant support: NIH grants R01 CA97063 (A.M. Chinnaiyan and D. Ghosh) and R01 CA102872 (K.J. Pienta); Early Detection Research Network grant U01 CA111275 (A.M. Chinnaiyan and D. Ghosh); Specialized Program of Research Excellence grant P50 CA69568 (K.J. Pienta, A.M. Chinnaiyan, and D. Ghosh); Department of Defense grants PC060266 (J. Yu), PC040517 (R. Mehra), and PC051081 (A.M. Chinnaiyan and S. Varambally); the Ralph Wilson Medical Research Foundation Grant (K.J. Pienta); Burroughs Wellcome Foundation Award in Clinical Translational Research (A.M. Chinnaiyan); American Cancer Society Award (K.J. Pienta); and Medical Scientist Training Program and a Rackham Predoctoral Award (S.A. Tomlins).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Dr. Alan Dombkowski and Daniela Cukovic (Wayne State University) for help with the Agilent platform.
Oncomine is freely available to the academic community and was used for this study. Commercial rights to Oncomine have been licensed to Compendia Biosciences. A.M. Chinnaiyan is a co-founder of Compendia Biosciences and serves as a head of the scientific advisory board.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
Jindan Yu and Jianjun Yu contributed equally to this work.
- Received July 9, 2007.
- Revision received September 17, 2007.
- Accepted September 27, 2007.
- ©2007 American Association for Cancer Research.