The CpG island spanning the transcription start of the glutathione S-transferase P1 becomes methylated in a variety of human cancers including breast cancer. To study the effect of sequence variation on hypermethylation of the GSTP1 promoter, we analyzed the genetic and epigenetic variability in 90 tumors from patients with locally advanced breast cancer. High-resolution quantitative analysis revealed large variability in the DNA methylation levels. Lack of methylation was more often observed in the basal and normal-like estrogen receptor (ER)-negative tumors, and methylated GSTP1 was associated with better overall survival (P = 0.00063). Studies of the genetic variation identified 14 different haplotypes. The distribution of methylation levels of tumors homozygous for the most frequent haplotype was significantly different from other haplotype combinations (P = 0.011), the difference being more pronounced in ER-positive (P = 0.005) and progesterone receptor–positive (P = 0.008) tumors. Regression modeling identified the ER status and haplotype as the main determinants of DNA methylation variability. We identified a putative c-Myb response element (MRE) that was present in one of two minimal promoter haplotypes. In vitro analysis showed that c-Myb binds to the MRE, but binding was weakened by the two polymorphisms. Transient cotransfections in luminal-type and basal-like breast cancer cell lines confirmed cell-specific differential binding of c-Myb to the polymorphic sites, leading to a change in the expression from the GSTP1 promoter in vivo. GSTP1 expression was moderately but significantly (P = 0.01) reduced after siRNA-mediated knockdown of c-Myb. Our results indicate that haplotype structure of a promoter is important for the extent of DNA methylation. [Cancer Res 2008;68(14):5562–71]
- tandem repeat polymorphism
- single nucleotide polymorphism
- DNA methylation
- transcriptional activation
DNA methylation of CpG islands is associated with transcriptional silencing of tumor suppressor genes in cancer. Epigenetic gene silencing is a complex process where changes in DNA methylation, histone modifications, and in some cases, small noncoding RNAs lead to nucleosome remodeling and a transcriptionally silent state ( 1). The human genome can be divided in actively transcribed domains known as euchromatin and transcriptionally silent domains known as heterochromatin. Boundaries that protect against heterochromatin-mediated silencing known as barrier insulators have been isolated from several organisms ( 2). The spreading of heterochromatin is disrupted by nucleosome-excluding sequence elements in particular (CCGNN)n and poly(dA-dT) sequence elements ( 3). An example in the human genome is the (ATAAA)n repeat in the 5′ untranslated region of the GSTP1 promoter, which is degenerated by insertions of CAC, ATT, and other motifs ( 4). GSTP1 is the only gene of the human glutathione S-transferase (GST) P subfamily ( 5). GSTs provide protection to mammalian cells against electrophilic metabolites of carcinogens and reactive oxygen species ( 6). GSTP1 plays a role in regulating the Map kinase pathway via protein-protein interactions as it is an inhibitor of c-Jun NH2-terminal kinase 1, a kinase involved in stress response, apoptosis, and cellular proliferation ( 7, 8). Elevated expression of the GSTP1 gene has been reported to correlate with drug resistance in human cancers ( 8, 9), and high levels have been associated with poor prognosis in breast and colon cancer ( 10, 11). Hypermethylation of the GSTP1 promoter was associated with silencing of the gene in prostate cancer ( 12) and breast cancer cell lines ( 13). GSTP1 has been found to be frequently methylated in liver, breast, renal, and lung carcinoma ( 14). 
Recently GSTP1 CpG island hypermethylation was found to be significantly associated with tumor size, lymph node metastasis, and relapse-free survival in breast cancer ( 15). Data concerning the time point of hypermethylation of the GSTP1 promoter during the multistep process of breast carcinogenesis is thus far conflicting. GSTP1 hypermethylation has been found in precursor lesions whereby the methylation status correlated inversely with the expression ( 16). However, a recent more quantitative analysis revealed no biologically significant methylation in peritumoral or hyperplastic breast tissue ( 17).
The above described GSTP1 (ATAAA)n repeat separates methylated from unmethylated sequences in normal prostate tissue. These separate methylation domains are lost in prostate cancer where methylation extends throughout the whole promoter region ( 18). The underlying mechanism has thus far not been elucidated, but loss of binding sites for the transcription factor Sp1 in combination with random seeds of methylation in prostate cancer cells have been proposed to be required for hypermethylation of GSTP1 ( 19). These combinatorial effects subsequently lead to histone deacetylation, chromatin remodeling, and gene silencing ( 20).
The involvement of GSTP1 in drug metabolism and its potential effects on the outcome of cancer therapy make it important to understand the mechanisms of GSTP1 regulation in both normal and cancer tissues. In this work, we studied tumors from patients with locally advanced breast cancer and report the detailed haplotype structure of the minimal human GSTP1 promoter, the quantitatively measured methylation pattern of six CpG sites in the promoter, and the effects of the haplotype structure on the degree of DNA methylation. Functional studies were performed to investigate the potential effect of two single nucleotide polymorphisms (SNPs) in a putative binding site of the transcription factor c-Myb, which might provide a mechanism for the differential methylation of the two haplotypes. Our study provides the first evidence for a close interplay between haplotypes in cis and levels of DNA methylation in the regulation of a specific gene in cancer.
Materials and Methods
Patient material. DNA was obtained from 90 patients with locally advanced breast cancer (T3/T4 and/or N2 tumors) by a pretreatment open incision biopsy of the tumor taken before chemotherapy. In addition to their locally advanced tumor, 12 patients had a minor distant metastasis at the time of diagnosis. The primary treatment consisted of weekly doxorubicin (14 mg/m²) scheduled for 16 wk. If a patient showed progressive disease after 4 wk of treatment according to the Unio Internationalis Contra Cancrum (UICC) criteria, treatment with doxorubicin was terminated and alternative treatment procedures were implemented. Patients with an operable tumor after neoadjuvant treatment had surgery followed by radiotherapy immediately after termination of the neoadjuvant chemotherapy, whereas the remaining patients were treated on an individual basis. Women with estrogen- and/or progesterone-positive tumors (n = 84) were treated with tamoxifen (30 mg daily for 5 y). Of the 90 tumor samples, 74 were invasive ductal carcinomas, 8 were lobular carcinomas, and 8 were classified as other histologic types. Eleven tumors were HER2 positive. The patients were between 32 and 88 y of age at the time of diagnosis, with a median age of 64.5 y. The study protocol was approved by the local ethical committee and the patients gave their informed consent. The sample cohort is described in reference ( 21). Tumor DNA from 80 patients was available in sufficient quantity for methylation analysis, and 50 of the tumors have previously been analyzed for gene expression using genome-wide cDNA microarrays ( 22). Normal breast tissue from six individuals was included as control samples for methylation analysis to detect potential age-related methylation changes. DNA from the snap-frozen tumor tissue was isolated using a phenol/chloroform extraction method according to standard procedures (Nucleic Acid Extractor 340A; Applied Biosystems).
Heteroduplex analysis of ATAAA repeat. The polymorphic nature of the ATAAA repeat of the GSTP1 gene has been described ( 4). The ATAAA repeat was analyzed by fragment length analysis and heteroduplex analysis. DNA from 90 locally advanced breast tumors was amplified to give a 334- to 368-bp-long PCR fragment. The PCR was carried out by mixing 30 pmol fluorescently labeled forward primer 5′-6-FAM-GTTGCAGTGAGCCGCGCAGATC-3′ and 30 pmol reverse primer 5′-AGTAAACAGACAGCAGGAAG-3′, 50 ng genomic DNA, 2.5 mmol of each deoxynucleotide triphosphate (dNTP), 2.8 mmol/L MgCl2, and 0.625 U AmpliTaq Gold DNA polymerase with 1× PCR buffer II (MgCl2 free; Applied Biosystems) to a final volume of 25 μL. The PCR cycling program consisted of initial activation of the enzyme at 95°C for 9 min, 43 cycles of 94°C for 20 s, 68°C for 1 min and 50 s, and a final extension at 60°C for 10 min in a GeneAmp PCR system 9600 (Applied Biosystems). The PCR products were separated by capillary electrophoresis under denaturing conditions using an ABI Prism 310 automated sequencer (Applied Biosystems), and the analysis was performed using the GeneScan 1.2.2 software (Applied Biosystems). Seven of thirteen alleles can be differentiated by this method. Samples with fragment lengths of 348 bp (B and L allele), 353 bp (A and C allele), or 358 bp (D and H allele) were genotyped using heteroduplex analysis. Heteroduplex analysis was performed by running 10 μL of the initial PCR product at 150 V in a modular Mini-PROTEAN II electrophoresis system (Bio-Rad) on a 7.5% nondenaturing polyacrylamide gel for 90 min in 1× TAE buffer. PCR fragments and heteroduplexes were visualized by ethidium bromide staining.
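The two-step genotyping logic described above (fragment length first, heteroduplex analysis only for ambiguous lengths) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the only values taken from the text are the three ambiguous fragment lengths and their candidate allele pairs.

```python
# Fragment lengths (bp) that, according to the text, are shared by two
# alleles and therefore require heteroduplex analysis to resolve.
AMBIGUOUS_LENGTHS = {
    348: ("B", "L"),
    353: ("A", "C"),
    358: ("D", "H"),
}


def needs_heteroduplex(fragment_length_bp):
    """Return the candidate allele pair for an ambiguous length, else None."""
    return AMBIGUOUS_LENGTHS.get(fragment_length_bp)


def triage(fragment_lengths_bp):
    """Split observed fragment lengths into resolved and ambiguous calls.

    Hypothetical helper: 'resolved' lengths identify an allele directly,
    'ambiguous' lengths map to the allele pair still to be distinguished.
    """
    resolved, ambiguous = [], {}
    for length in fragment_lengths_bp:
        candidates = needs_heteroduplex(length)
        if candidates is None:
            resolved.append(length)
        else:
            ambiguous[length] = candidates
    return resolved, ambiguous
```

For example, `triage([334, 348, 358])` would leave 334 bp as resolved and flag 348 bp and 358 bp for heteroduplex analysis.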
DNA sequencing of minimal promoter. DNA from 81 tumors was amplified with forward primer 5′-AAGCAATTTCCTTTCCTCTAAGC-3′ and reverse primer 5′-CACTGGTGGCGAAGACTG-3′ to yield a PCR product covering the region from position −411 to +30 of the GSTP1 promoter. A 20-μL PCR reaction consisted of 50 ng DNA, 125 μmol/L of each dNTP, 0.5 μmol/L of each primer, and 1 U HotStar Taq DNA polymerase (Qiagen). PCR cycling was as follows: 15 min at 95°C; 30 cycles of 1 min at 95°C, 30 s at 62°C, and 1 min at 72°C; and a final extension for 10 min at 72°C. PCR was carried out in a PTC-225 thermal cycler (Bio-Rad). Removal of excess primers and unincorporated dNTPs was performed using filter plates (Millipore). Sequencing was performed using BigDye v1.1 (Applied Biosystems) in a 20-μL reaction with 1 μL PCR product and 3.2 pmol of either primer 5′-AAGCAATTTCCTTTCCTCTAAGC-3′ or primer 5′-CACTGGTGGCGAAGACTG-3′. Sequencing reaction cleanup was performed with ethanol/EDTA precipitation.
Methylation analysis. Bisulfite treatment was carried out using the CpG Genome DNA Modification kit (Millipore). Two micrograms of genomic DNA was incubated for 20 h at 55°C in the dark in 550 μL of freshly prepared Reagent I. DNA modification reagent II (750 μL) was added to 5 μL DNA modification reagent III and 3 μL Qiaex II (Qiagen) and incubated 10 min at room temperature. After washing with ethanol, the DNA was eluted in 50 μL 0.5× TE buffer. Fifty nanograms of DNA were used in each PCR to generate a 144-bp product in a total volume of 20 μL. Two microliters of 10× HotStar buffer (with MgCl2), 7.5 pmol of forward primer (5′-GTGATTTAGTATTGG) and 7.5 pmol reverse primer (5′-AACTCTAAACCCCATC), 0.2 mmol/L of each dNTP, and 2 U Hot Star DNA polymerase (Qiagen) were used. The PCR program consisted of an initial activation step of 15 min at 95°C, 40 cycles of 95°C for 30 s, 50°C for 45 s, 72°C for 20 s, and a final extension step of 4 min at 72°C in a PTC-225 thermal cycler (Bio-Rad). PCR products were pipetted into 384-well motherplates using a BasePlate robot (The Automation Partnership) that was also used for all further steps of the assay ( 23). From these plates, daughter plates containing 3 μL PCR product were made and overlaid with ∼3 μL mineral oil. Two microliters of H2O containing 0.25 U of shrimp alkaline phosphatase were added to the 3 μL PCR product and incubated for 1 h at 37°C, followed by denaturation at 90°C for 10 min. Extension primers (5 ± 2 pmol of each; ref. 23) and 0.6 U TMA31FS (Roche Molecular Systems) were added to the reaction. MgCl2 was adjusted to a final concentration of 3.0 mmol/L, Tris-SO4 to 20 mmol/L, and (NH4)2SO4 to 9 mmol/L and KCl to 6 mmol/L (pH∼8.2). Respective α-S-ddNTPs (30 μmol/L; Biolog) were used in a final reaction volume of 7 μL. An initial denaturing step of 2 min at 95°C was used, followed by 40 cycles of 3 s at 95°C and 5 s at 53°C. 
Acetic acid (0.5 μL of a 0.5 mol/L solution) and 1.5 μL of phosphodiesterase II (3.4 × 10⁻³ U/μL, from calf spleen; Worthington Biochemical Corp.) were added to the reaction and incubated for 90 min at 37°C. A mixture of 11 μL acetonitrile, 2.75 μL 2 mol/L trimethylammonium-hydrogencarbonate buffer (pH 7.5), and 5.25 μL iodomethane was added and incubated at 40°C for 25 min. Upon cooling, a biphasic system was obtained. Ten microliters of H2O were added and left for 5 min at room temperature. Five microliters of the upper layer were taken off and diluted with 10 μL of 40% acetonitrile. A 1.5% solution of α-cyano-4-hydroxy-cinnamic acid methyl ester in acetone was used as the matrix for MALDI analysis. Preparation was carried out as thin layer preparation using the BasePlate robot. Matrix was spotted onto a bar-coded MALDI target, and the analyte (0.5 μL) was pipetted on top of the dry matrix using the BasePlate robot. Assays analyzing different genes, which were processed in parallel in a microtiter plate, were separated onto different stainless steel targets (Bruker Daltonik). Each sample was spotted eight times onto one target to compensate for statistical fluctuations. Targets were measured automatically in two Autoflex MALDI mass spectrometers (Bruker Daltonik) equipped with target-plate changing robots. Twenty times 10 shots were recorded on each preparation, whereas no more than 10 shots were allowed on the same spot of the preparation. The resolution (m/Δm) had to be >150, and the signal-to-noise ratio had to be >7. If more than 10 consecutive spots on one preparation did not fulfill the requirements, the entire preparation was rejected.
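The acceptance rule for MALDI preparations stated above can be written as a short check. The sketch below is an assumption-laden illustration, not the acquisition software's logic: the function names and the representation of a preparation as an ordered list of (resolution, signal-to-noise) pairs are hypothetical; only the three thresholds come from the text.

```python
MIN_RESOLUTION = 150   # m/Δm must be strictly greater than this
MIN_SNR = 7            # signal-to-noise ratio must be strictly greater than this
MAX_FAILED_RUN = 10    # more than this many consecutive failures rejects the preparation


def spot_passes(resolution, snr):
    """True if a single spot meets both quality requirements."""
    return resolution > MIN_RESOLUTION and snr > MIN_SNR


def preparation_accepted(spots):
    """Check one preparation.

    spots: iterable of (resolution, snr) tuples in acquisition order
    (an assumed data layout). The preparation is rejected as soon as more
    than MAX_FAILED_RUN consecutive spots fail the requirements.
    """
    failed_run = 0
    for resolution, snr in spots:
        if spot_passes(resolution, snr):
            failed_run = 0
        else:
            failed_run += 1
            if failed_run > MAX_FAILED_RUN:
                return False
    return True
```

Under these assumptions, a run of exactly 10 failing spots is still tolerated, whereas an eleventh consecutive failure rejects the preparation.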
Calibration curves were established with mixtures of a known methylation degree as previously described ( 23) to compensate for various variables that might confound accurate and absolute quantification of CpG methylation levels such as preferential amplification of a certain methylation pattern during PCR or the sequence-specific annealing behavior of the extension primers. This assay has a quantitative resolution for the methylation degree of 5% and a detection limit of 5% for the minor allele. Samples with a mean methylation above 5% for the 6 CpG sites analyzed were scored as methylated, when dividing the tumor samples into methylated and unmethylated categories for statistical analysis.
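The dichotomization rule in the last sentence can be illustrated with a few lines of code. This is a minimal sketch with hypothetical function names; the only facts taken from the text are the six CpG sites per sample and the cutoff of a mean methylation above 5%.

```python
METHYLATION_CUTOFF = 5.0  # percent; samples with a mean above this are scored as methylated
N_CPG_SITES = 6           # number of CpG sites measured per sample


def mean_methylation(cpg_percentages):
    """Mean methylation (in percent) over the six CpG sites of one sample."""
    values = list(cpg_percentages)
    if len(values) != N_CPG_SITES:
        raise ValueError("expected one value per CpG site (6 in total)")
    return sum(values) / len(values)


def is_methylated(cpg_percentages):
    """Score a sample as methylated if its mean methylation exceeds 5%."""
    return mean_methylation(cpg_percentages) > METHYLATION_CUTOFF
```

Note that a sample with a mean of exactly 5% is scored as unmethylated here, because the text specifies a mean "above 5%".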
Messenger RNA expression and siRNA analysis. The mRNA data used in this study have previously been published ( 22). Ratios of GSTP1 expression are represented as fold changes to the median abundance of the gene transcripts to a common reference in log2 space. Three cDNA clones for GSTP1 were present on the array. GSTP1 expression data are represented as mean values of log2-transformed data when data were available for more than one clone for each tumor sample. K562 cells were electroporated with 2 μmol/L siRNA as described in Berge and colleagues ( 24) using the following c-Myb–specific siRNA sequence 5′-GAAAUACGGUCCGAAACGUTT-3′ and the unrelated nonspecific control sequence 5′-AUUCUAUCACUAGCGUGACUU-3′. Cells were harvested 24 h after transfection, and RNA was isolated and analyzed on microarrays.
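The summarization of several clones into one gene-level value can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, missing clone measurements are represented as `None`, and the ratio is computed from a sample intensity and a reference intensity, which simplifies the normalization described in the text.

```python
import math


def log2_ratio(sample_intensity, reference_intensity):
    """log2 of the sample-to-reference ratio (assumed input: raw intensities)."""
    return math.log2(sample_intensity / reference_intensity)


def gene_level_log2(clone_log2_ratios):
    """Mean of the log2 ratios over all clones that have data.

    clone_log2_ratios: one value per cDNA clone; None marks a missing
    measurement (an assumed convention). Returns None if no clone has data.
    """
    present = [value for value in clone_log2_ratios if value is not None]
    if not present:
        return None
    return sum(present) / len(present)
```

Averaging in log2 space, as the text describes, corresponds to taking the geometric mean of the underlying fold changes.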
Electrophoretic mobility shift assay. DNA binding of c-Myb was assayed as described previously ( 25). Protein-DNA complexes were formed at 25°C for 15 min in 20 mmol/L Tris-HCl, 10% glycerol, and 1 mmol/L EDTA. The human c-Myb DNA binding domain, residues 89 to 192 (R2R3), was used. The duplex oligonucleotides used were as follows: H haplotype (H Hap), 5′-GATGTCCCGGCGCGCCAGTTCGCCTCGAGGGCCGGCC-3′; E haplotype (E Hap), 5′-GATGTCCCGGCGCGCCAGTAGCCTCGAGGGCCGGCC-3′; MRE-mim1A control, 5′-GATGTCCCGGCGCAACCGTTAGCCTCGAGGGCCGGCC-3′. The E Hap oligonucleotide is 1 bp shorter due to SNP T/- in position −288. The sequence of MRE-mim1A is based on the A-site Myb recognition element in the upstream region of the mim-1 gene.
Luciferase constructs. The minimal GSTP1 promoter (position −411 to +30) was amplified using forward primer 5′-ATACGCGTAAGCAATTTCCTTTCCTCTAAGC-3′ (containing the MluI site ACGCGT) and reverse primer 5′-CGAGATCTCACTGGTGGCGAAGACTG-3′ (containing the BglII site AGATCT) from samples either homozygous for the E Hap or homozygous for the H Hap. The PCR products were cloned into pCR-Blunt II TOPO vector (Invitrogen). The promoter fragments were then cloned into pGL3 Basic vector (Promega) using the MluI and BglII sites. In vitro mutagenesis was performed using the QuikChange II Site-Directed Mutagenesis kit (Stratagene) according to the manufacturer's instructions. The following mutagenesis primers were used for the E Hap: 5′-CCCCGCGATGTCCCGGCGCGCCTGTGCACACTTCGCTGCGGTC-3′ (forward) and 5′-GACCGCAGCGAAGTGTGCACAGGCGCGCCGGGACATCGCGGGG-3′ (reverse), and the H Hap: 5′-CCCCGCGATGTCCCGGCGCGCCTGCGCACACTTCGCTGCGGTC-3′ (forward) and 5′-GACCGCAGCGAAGTGTGCGCAGGCGCGCCGGGACATCGCGGGG-3′ (reverse).
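Site-directed mutagenesis primer pairs of this kind are normally exact reverse complements of each other, which offers a quick consistency check on the sequences listed above. The sketch below performs that check; the helper names are hypothetical, and the primer strings are copied from the text with the stray space in the H Hap reverse primer removed, on the assumption that it is a typesetting artifact.

```python
# Translation table for complementing an uppercase DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

E_FWD = "CCCCGCGATGTCCCGGCGCGCCTGTGCACACTTCGCTGCGGTC"
E_REV = "GACCGCAGCGAAGTGTGCACAGGCGCGCCGGGACATCGCGGGG"
H_FWD = "CCCCGCGATGTCCCGGCGCGCCTGCGCACACTTCGCTGCGGTC"
H_REV = "GACCGCAGCGAAGTGTGCGCAGGCGCGCCGGGACATCGCGGGG"


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse


def mismatch_positions(seq_a, seq_b):
    """Zero-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
```

Both listed pairs pass `is_complementary_pair`, and `mismatch_positions(E_FWD, H_FWD)` shows that the two forward primers differ at a single base.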
Cell culture, transfections, and reporter assays. MCF-7 cells were maintained in RPMI 1640 with L-glutamine (Invitrogen) supplemented with 10% fetal bovine serum and 1% penicillin and streptomycin. ME16C cells were maintained in Mammary Epithelium Cell Growth Medium (PromoCell GmbH) containing 0.4% bovine pituitary extract, 10 ng/mL epidermal growth factor, 5 μg/mL insulin, 0.5 μg/mL hydrocortisone, 0.62 ng/mL phenol red, and 1% penicillin and streptomycin. Cells were grown in 5% CO2 at 37°C in a humidified atmosphere. MCF-7 and ME16C were plated at 150,000 and 75,000 cells per well, respectively, in 24-well plates. Transfections were done in triplicate. Two hundred nanograms of luciferase constructs were transfected per well. pRL-SV40 (4 ng) was cotransfected per well and used as an internal control to correct for cell numbers and transfection efficiency. Three microliters of FuGENE 6 (Roche Diagnostics) were used per microgram of DNA. Luciferase activity was measured in a Turner 20/20 luminometer using the Dual-Luciferase Reporter Assay (Promega).
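The use of the cotransfected pRL-SV40 signal as an internal control is commonly implemented as a ratio of firefly to Renilla luminescence, averaged over the replicate wells. The sketch below shows that conventional calculation; it is an assumption about how the normalization was carried out, and all function names are hypothetical.

```python
from statistics import mean


def relative_activity(firefly, renilla):
    """Firefly signal normalized to the Renilla internal control for one well."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def mean_relative_activity(replicates):
    """Mean normalized activity over replicate wells.

    replicates: list of (firefly, renilla) readings, e.g. three wells
    for a triplicate transfection.
    """
    return mean(relative_activity(f, r) for f, r in replicates)


def fold_over_control(sample_replicates, control_replicates):
    """Normalized activity of a construct relative to a control construct."""
    return mean_relative_activity(sample_replicates) / mean_relative_activity(control_replicates)
```

Dividing by the Renilla signal removes well-to-well differences in cell number and transfection efficiency, so that the remaining variation reflects promoter activity.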
Statistical analysis. The genotyping data generated were recoded and subjected to statistical haplotype reconstruction analysis using PHASE v.2.1.1. Briefly, every polymorphic locus between position −513 and +30 relative to the transcription start site was coded to show the nature of each position (A, 1; G, 2; C, 3; T, 4; deletion, 5; unknown, 6). PHASE was run in multiallelic mode with default options and extended permutations (−x10 and −X10). Two haplotypes with known repeat length but for which no sequence information could be retrieved were omitted from the analysis (haplotypes N and M). χ2 and Pearson correlation analyses were performed using SPSS (Statistical Package for the Social Sciences) version 12.0.1. Complete-linkage hierarchical clustering analysis of methylation data was performed using the Cluster program and the results were displayed in TreeView. Methylation data were median centered for each CpG site before clustering the tumor samples. Differences in distribution of methylation between patient groups categorized according to haplotype or various clinical and molecular variables were assessed using the Mann-Whitney rank test.
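Two of the preprocessing steps described above, the numeric recoding of alleles and the median centering of each CpG site, are simple enough to sketch directly. The code below is an illustration under assumptions: the symbols "-" for a deletion and "?" for an unknown allele, the function names, and the dictionary layout of the methylation data are all hypothetical; only the numeric codes 1 to 6 come from the text.

```python
from statistics import median

# Numeric allele codes as listed in the text; "-" and "?" are assumed input symbols.
ALLELE_CODES = {"A": 1, "G": 2, "C": 3, "T": 4, "-": 5, "?": 6}


def encode_alleles(alleles):
    """Translate a sequence of allele symbols into the numeric coding."""
    return [ALLELE_CODES[allele] for allele in alleles]


def median_center(methylation_by_site):
    """Subtract the per-site median from every tumor's value.

    methylation_by_site: dict mapping a CpG site label to the list of
    methylation values across tumors (an assumed layout).
    """
    centered = {}
    for site, values in methylation_by_site.items():
        site_median = median(values)
        centered[site] = [value - site_median for value in values]
    return centered
```

Centering each site on its own median puts CpG sites with different baseline methylation on a comparable footing before the tumors are clustered.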
To investigate which factors affect the level of methylation, we applied multiple regression analysis using all possible combinations of TP53 mutation status, HER2 status, estrogen receptor (ER) status, progesterone receptor (PR) status, EE versus EN/NN haplotypes, represented by x1, x2, x3, x4, x5, as explanatory variables. y indicates the variable for methylation. If the level of methylation could be explained by all explanatory variables, the multiple regression model would be represented by

$$y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4 + a_5 x_5 + \varepsilon$$

where a0 indicates the mean of y, a1,…,a5 are the coefficients of x1,…,x5, respectively, and ε obeys the normal distribution with mean 0 and unknown variance. The coefficients of the model are estimated by the least squares method that yields the variance σ2 of the residual part ε. Because there are many possible combinations of coefficients, we need to evaluate which combination of the explanatory variables is the most suitable to model the variation in DNA methylation. For that, we calculate the Akaike Information Criterion (AIC; ref. 26) represented by

$$\mathrm{AIC} = -2 \ln L_{\max} + 2k$$

whereby the maximum log-likelihood ln L_max can be approximated using the variance σ2, and k indicates the number of parameters of the model. AIC gives an evaluation for model selection, which is modified by a penalty increasing with the number of variables of the model. We analyzed all combinations of variables and selected objectively the model that fitted best to the data indicated by a minimal value for the AIC.
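The model selection procedure described above (fit every combination of explanatory variables by least squares, then keep the combination with the smallest AIC) can be sketched in plain Python. This is a minimal illustration, not the analysis code used in the study: all function names are hypothetical, the Gaussian log-likelihood is evaluated with the maximum-likelihood variance estimate RSS/n, and k is assumed to count the regression coefficients plus the variance.

```python
import math
from itertools import combinations


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares fit with intercept; returns (coefficients, residual sum of squares)."""
    design = [[1.0] + list(row) for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    coef = solve(xtx, xty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, d))) ** 2 for d, yi in zip(design, y))
    return coef, rss


def aic(rss, n_obs, n_coef):
    """AIC = -2 ln L_max + 2k for a Gaussian model with variance estimate RSS/n."""
    sigma2 = rss / n_obs
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi * sigma2) + 1)
    k = n_coef + 1  # regression coefficients plus the residual variance (assumption)
    return -2 * log_lik + 2 * k


def best_subset(data, y, names):
    """Fit every non-empty subset of the named variables and return the AIC-minimal one."""
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            rows = [[data[name][i] for name in subset] for i in range(len(y))]
            try:
                coef, rss = fit_ols(rows, y)
            except ValueError:
                continue  # skip collinear variable combinations
            score = aic(rss, len(y), len(coef))
            if best is None or score < best[0]:
                best = (score, subset, coef)
    return best
```

With five candidate variables this enumerates 31 models, so exhaustive search is cheap; the sketch assumes every fitted model leaves a nonzero residual, since a perfect fit would make the log-likelihood undefined.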
Results
Methylation analysis. The methylation status of six CpGs spanning the transcription start site of the GSTP1 promoter in positions −22, +8, +14, +38, +47, and +55 (relative to the transcription start site) was quantitatively determined. For each CpG dinucleotide analyzed, the extent of methylation varied between 0% and 88% (Supplementary Table S1). This high-resolution quantitative analysis revealed the presence of methylation in 71% of the tumors. Six normal breast tissue samples were analyzed and found to be unmethylated with an overall degree of methylation of 3.7%. A large degree of variability between the tumors from different individuals was observed with the CpG closest to the transcription start site being methylated in 96% of the tumors. A correlation between the degree of methylation of the individual CpGs and their neighboring CpGs was observed with P values of <10⁻⁶, with the CpGs closest to each other being in strongest correlation (Supplementary Table S2). The methylation status of the CpGs in positions +8, +14, and +38 was most strongly correlated to the level of GSTP1 mRNA expression as shown by comparison with the values extracted from previously published microarray analysis (Supplementary Table S2). No correlation was found between age at time of diagnosis and the levels of DNA methylation. We observed a difference in degree of GSTP1 promoter methylation in samples assigned to different molecular subgroups as determined by expression profiling. The methylation of the GSTP1 promoter was significantly lower in the ER-negative basal and normal-like tumors in addition to one ER-negative ERBB2+-like tumor compared with the ER-positive luminal-like and ERBB2+-like tumors, which showed higher than median methylation (P = 0.002; Fig. 1A ). No difference in methylation of GSTP1 was observed between ductal and lobular carcinoma in this data set.
Patients with methylated GSTP1 in their tumors had a better overall (P = 0.00063) and relapse-free survival (P = 0.014; Fig. 1B). When stratified for p53 mutations, HER2 status, ER status, PR status, grade, tumor stage, lymph node status, and distant metastasis, GSTP1 methylation remained associated with a good prognosis.
Sequence analysis, haplotypes, and methylation. The ATAAA repeat (from position −582 to −242) of the GSTP1 promoter was studied by heteroduplex analysis. We sequenced the region between bases −392 and +12 relative to the transcription start site in the samples analyzed for DNA methylation. We identified four SNPs in positions −354 (rs17593058), −288 (rs11311625), −287, and −282. We performed statistical reconstruction of haplotypes for the region between positions −513 and +30. Thirty-three possible haplotypes were identified with the most frequent haplotypes at 42.2%, 36.7%, and 6.0%. Eight haplotypes were estimated to have a frequency above 1% ( Fig. 2 ). Based on the possible haplotypes, the most likely haplotype pair was assigned to each tumor sample. A total of 14 different haplotypes were reconstructed in the 90 breast tumors. Two haplotypes with known repeat length were not reconstructed (C and D) and one haplotype (G) was constructed with a two-base deletion not in agreement with the heteroduplex data. The four SNPs downstream the ATAAA repeat grouped in two haplotypes in the tumor samples with haplotype E in complete linkage-disequilibrium with the shortest ATAAA repeat length (Supplementary Fig. S1; Fig. 2). Neither the haplotypes nor individual SNPs showed any association with lobular or ductal carcinoma or any other clinical and histopathologic characteristics of the tumors. The haplotypes were not associated with an increased risk for breast cancer in a case control study of 128 cases and 109 controls (data not shown).
The effect of these different haplotype structures of the GSTP1 promoter in relation to promoter methylation was analyzed. Linear regression showed a significant inverse correlation between the repeat length and the extent of DNA methylation in the GSTP1 promoter (P = 0.018; R2 = −2.05). Promoters with the homozygous genotype for the E Hap were more often methylated at a higher level when compared with the heterozygous haplotype or haplotype combinations where the E Hap was not present at all (P = 0.031; Table 1A ). Hierarchical clustering was used to group the samples according to the median methylation level. Tumors homozygous for the E Hap were mostly methylated above median, whereas the other possible haplotype combinations showed similar distribution between the two groups (P = 0.014; Table 1B). Using a nonparametric ranking test, we observed a significant difference in the distribution of DNA methylation in tumors with the EE haplotype combination compared with those with EN or NN haplotype-combinations (P = 0.011). The association was found more pronounced in ER-positive tumors (P = 0.005), PR-positive tumors (P = 0.008), and tumors with mutation in TP53 (P < 0.0001; Fig. 3 ).
To study the contribution of both the haplotypes and the molecular variables described above to the levels of methylation, we constructed multiple linear regression models containing one to five explanatory variables.
The model that fitted best the experimental findings included only the ER status and the EE haplotype. When only one variable was taken into account, the EE haplotype model fitted significantly better than the ER status (Supplementary Table S3).
Functional studies of the SNPs in position −288 and −287 and the effect on GSTP1 transcription. We used the TESS transcription factor database to search for transcription factors (TF) predicted to bind either of the alleles created by the SNPs in positions −354, −288, −287, and −282. Four TFs with a score higher than 10 were predicted to bind sequences affected by the SNPs (Supplementary Table S4). A putative c-Myb response element CAGTTC was present in the H Hap, which was correlated with the highest DNA methylation levels as described above. In the E Hap, this element is transformed into CAG_TA by the deletion of T and a substitution C to A in positions −288 and −287, potentially modifying the putative binding of c-Myb. A trend toward correlation of c-Myb and GSTP1 mRNA expression was observed in tumors with GSTP1 methylation below 10%, but this did not reach statistical significance (P = 0.103). In vitro electrophoretic mobility shift assay (EMSA) analysis showed that c-Myb indeed bound to CAGTTC, but this binding was weakened by the two polymorphisms resulting in the sequence CAGTA ( Fig. 4A ). The two 441-bp minimal GSTP1 promoter haplotypes E Hap and H Hap were cloned in a pGL3 Basic vector and expressed in MCF-7 and ME16C breast cancer cell lines ( Fig. 4B). In the basal-like cell line ME16C, both haplotypes E and H were equally active without induction of c-Myb. Both haplotypes proved responsive to c-Myb induction, and deletion of the −288/287 binding site in the presence of c-Myb reduced the activity of both alleles to basal levels. In the luminal-like cell line MCF-7, the E Hap displayed significantly increased transcriptional activity compared with the H Hap before induction with c-Myb. These haplotypes also responded differently when cotransfected with c-Myb. The E Hap activity increased slightly compared with the levels measured in the absence of c-Myb, whereas the expression of the H Hap remained at a basal level upon cotransfection with c-Myb.
Deletion of the −288/287 binding site abrogated transcriptional activity for the E Hap.
To assess whether c-Myb is involved in the expression of GSTP1, we investigated the expression patterns of genes affected by knockdown of c-Myb by c-Myb–specific siRNA in K562 cells. In this expression data set with eight biological replicates, each analyzed with two technical replicates, we observed a consistent pattern of GSTP1 down-regulation after c-Myb knockdown ( Fig. 4C) compared with expression in the reference treated with an unrelated siRNA. The change is not dramatic (implying that other factors are involved) but is significant (P = 0.01) and very consistent, with 8 of 8 experiments showing the same effect.
Discussion
Hypermethylation of CpG islands is a common event in human cancers, but the underlying mechanism is to a large extent unknown. In particular, interactions between the genome and the epigenome are still poorly understood. Local sequence features have been postulated to contribute to the susceptibility of a CpG island to become methylated or to be protected from methylation ( 27). In this study, we provide a possible mechanism for the extent of methylation in the promoter of the detoxification enzyme GSTP1.
Previously, the hypermethylation of the GSTP1 promoter has mainly been studied by methylation-specific PCR (MSP) and was found to be methylated in 24% to 31% of breast tumors ( 14, 28). We found that the GSTP1 promoter was methylated in 71% of the breast tumors analyzed. The high prevalence of methylation could be explained by the fact that the tumors analyzed were all selected stage III carcinomas. MALDI mass spectrometry features a high quantitative resolution and permits in contrast to MSP also to detect partially methylated molecules as amplification is performed irrespective of the methylation status, whereas MSP and other (fluorescent) real-time MSP methods specifically amplify a single methylation pattern—usually consistently methylated molecules. A strong correlation between methylation of each individual CpG and its neighboring CpGs was observed. A similar finding was reported in prostate cancer using the same method ( 23). The significant inverse correlation between GSTP1 mRNA level and promoter methylation may have a physiologic significance because strong correlation between mRNA and protein expression has been shown for GSTP1 in breast cancer cell lines ( 13) and prostate cancer ( 12). In the tumors of the doxorubicin-treated patients studied here, the methylated status of the GSTP1 promoter was associated with better overall (P = 0.00063) and disease-free survival (P = 0.014). Low activity of GSTP1 resulting from promoter hypermethylation may increase the effective therapeutic dose of the pharmacologic agent due to lower conjugation and inactivation of the drug. This is in concordance with previous reports where the absence of GSTP1 protein expression correlated with improved survival in invasive breast cancer samples ( 10, 29) and colorectal cancer ( 11). In contrast, other studies related low GSTP1 expression or GSTP1 hypermethylation to decreased survival and poor prognosis ( 15, 30). 
The difference between the results might be attributable to the molecular diversity in human breast cancer as well as differences in the received adjuvant therapy. The association with survival observed here was still valid after stratifying for p53 mutation status, HER2 status, ER status, PR status, grade, and tumor stage. Larger studies will be necessary to confirm this observation because numbers become low for some clinical categories after stratification. Ductal and lobular carcinomas were not differentially methylated, and methylation can therefore not explain the differences in survival between the two cancer types. Although our data and those of others suggest that methylation of GSTP1 is a tumor-specific event, the stage of tumor formation at which it occurs cannot be addressed in the present study, as our samples are selected stage III (locally advanced) tumors.
We have previously reported the complex nature of the ATAAA repeat in healthy control individuals ( 4). Here, we report 33 possible haplotypes for the GSTP1 promoter, 14 of which were found in tumor DNA from breast cancer patients. In a recent report, 10 polymorphisms in the GSTP1 promoter were assembled into haplotypes, and the two most frequent haplotypes were reported with frequencies similar to those observed here ( 31). Interestingly, Cauchi and colleagues ( 31) reported differential inducibility of the different GSTP1 haplotypes by exposure to environmental agents and recommended further studies to assess the functionality of these haplotypes. The differential susceptibility of different haplotypes to DNA methylation might provide an explanation for these data.
To our knowledge, we are the first to report a possible connection between genetic variation in the promoter region of a gene and the degree of DNA methylation in vivo. Clearly, the genetic background cannot be the only factor determining the extent of methylation of a promoter, and higher-order molecular events during tumorigenesis will be key determinants. We therefore compared the methylation levels of the patients with different GSTP1 haplotypes (EE versus EN/NN) within various molecular subgroups such as ER status, PR status, and TP53 mutation status by both stratification and linear regression modeling. In the stratified analysis, we observed a more pronounced difference in the extent of methylation between the EE and EN/NN haplotypes in patients who were PR+, ER+, and TP53mut. When these factors were integrated into a linear regression model, only the ER status interaction remained significant in addition to the haplotype, and the haplotype was the main factor associated with levels of methylation. It has been speculated that the ATAAA repeat functions as a boundary element preventing DNA methylation from spreading into the promoter of GSTP1 ( 18). However, deletion of the ATAAA repeat from an artificial shuttle vector did not induce any de novo DNA methylation in prostate cancer cell lines ( 19). We found a significant correlation between the (ATAAA)n repeat length and the degree of GSTP1 promoter methylation (P = 0.022). Individuals homozygous for the haplotype with the shortest repeat length were found to be more methylated than other haplotype combinations (P = 0.031), and they were 5 times more likely to have DNA methylation levels above the median (P = 0.014). These findings might be explained simply by the distance the methylation has to spread to gain access to the promoter region. 
What causes this boundary to break down in cancer is largely unknown, but a combination of transcriptional gene silencing, ascribed to a putative loss of Sp1 binding sites, and seeds of methylation that act as catalysts has been proposed.
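The statement that one haplotype group is "5 times more likely" to show methylation above the median can be read as an odds ratio from a 2×2 table, although the text does not state which effect measure or test was used. The following is a minimal sketch under that assumption. The function name `odds_ratio` and the cell counts are hypothetical and were chosen only so the arithmetic is easy to follow; they are not the study's data.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d):
    """Odds ratio with an approximate 95% Wald confidence interval.

    Layout of the 2x2 table (an assumption made for this sketch):
        a = group 1, methylation above the median
        b = group 1, methylation at or below the median
        c = group 2, methylation above the median
        d = group 2, methylation at or below the median
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - 1.96 * se)
    upper = exp(log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical counts for illustration only (NOT taken from the study)
estimate, lower, upper = odds_ratio(10, 5, 4, 10)
print(f"odds ratio = {estimate:.1f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With small cell counts the Wald interval is wide, which is in keeping with the caution expressed above that subgroup numbers become low after stratification.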
Because active transcription may play a role in impeding de novo DNA methylation, we sequenced the GSTP1 minimal promoter in all tumor samples to look for genetic variations that could abrogate the binding of transcriptional activators or repressors, and identified four SNPs. In silico analysis identified a binding site for c-Myb overlapping two of the variants. c-Myb is a transcription factor required for self-renewal of immature hematopoietic cells and also for the proliferation and correct development of various tissues, such as smooth muscle and colonic epithelium ( 32). It has been shown that the c-Myb protein is expressed at detectable levels only in ER-positive breast cancer cell lines ( 33) and breast tumors ( 34).
The in vitro experiments supported a possible role of c-Myb in transcriptional regulation of GSTP1, as c-Myb bound less efficiently when the respective binding sites were changed by the SNPs. The H Hap bound c-Myb better than the E Hap. Interestingly, the H allele is part of the haplotype that has a significantly lower degree of methylation, suggesting that c-Myb binding to the H Hap could lead to less frequent methylation of this haplotype due to more active transcription. This is in agreement with the theory of active transcription ( 35), in which active transcription prevents DNA methylation. It is expected that binding differences observed in a favorable, purely in vitro system will be more pronounced in a competitive in vivo situation. The two GSTP1 promoter haplotypes responded differently upon c-Myb induction in basal-like (ER negative) and luminal-like (ER positive) breast cancer cell lines. In the ME16C cell line, the two haplotypes performed similarly, with the H Hap being the more active. This is in accordance with the EMSA experiments. In the MCF-7 cells, the E Hap is more active than the H Hap, and a small but significant direct c-Myb induction is seen for the E allele but not for the H allele. Although Myb has been shown to be positively regulated by ER, this is probably not the only regulatory factor. If this were the case, we would probably not have seen a c-Myb induction in the ER-negative cells. c-Myb in turn is certainly not the only TF involved in the regulation of GSTP1, and our current interpretation is that Myb contributes to a methylation pattern together with other factors. How this occurs in detail needs further analysis. Because of the differences seen between the two cell lines, it is possible that c-Myb requires another protein partner, present in ME16C but not in MCF-7, to have an effect on the GSTP1 promoter. 
Alternatively, c-Myb itself may activate another transcription factor that is inducible only in the basal-like ME16C and not in the luminal-like MCF-7 cells. c-Myb-specific knockdown by siRNA led to decreased expression of GSTP1, showing that c-Myb may have a direct or indirect role in GSTP1 activation. Differences in transcriptional profiles in luminal and basal-like cell lines have been reported ( 36). Further studies in breast cancer cells are necessary to elucidate the discrete mechanisms of transactivation of GSTP1 and the TFs involved. Our results indicate that the haplotype structure of a promoter sequence may be important for the extent of DNA methylation: the ATAAA repeat, in combination with SNPs in certain haplotypes that confer differential transcriptional activity, could lead to differences in de novo DNA methylation in breast tumors. This study is, to our knowledge, the first documented example of genetic polymorphisms that influence promoter activity and might in turn determine the extent of epigenetic variation in tumorigenesis. Linear regression models pointed to the haplotype as the main determinant of methylation levels in the GSTP1 promoter. Further research is needed to determine whether this interaction constitutes a single case or represents the first example of a biological mechanism of great importance applicable to a wide variety of (tumor suppressor) genes.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: D-03067 from The Norwegian Cancer Society (V.N. Kristensen), grant 152004/150 from The Functional Genomics Program of the Norwegian Research Council (V.N. Kristensen), and the French-Norwegian cooperation program Aurora grant 15842WE (J.A. Rønneberg, J. Tost, and V.N. Kristensen).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Dr. Ida Bukholm for providing normal breast tissue.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
J.A. Rønneberg is a fellow at the Faculty Division The Norwegian Radium Hospital, University of Oslo.
7 E.M. Brendeford et al., in preparation.
10 E.M. Brendeford et al., in preparation.
- Received October 11, 2007.
- Revision received April 24, 2008.
- Accepted May 7, 2008.
- ©2008 American Association for Cancer Research.