Chromosomal instability (CIN) is associated with poor prognosis in human cancer. However, in certain animal tumor models elevated CIN negatively impacts upon organism fitness, and is poorly tolerated by cancer cells. To better understand this seemingly contradictory relationship between CIN and cancer cell biological fitness and its relationship with clinical outcome, we applied the CIN70 expression signature, which correlates with DNA-based measures of structural chromosomal complexity and numerical CIN in vivo, to gene expression profiles of 2,125 breast tumors from 13 published cohorts. Tumors with extreme CIN, defined as the highest quartile CIN70 score, were predominantly of the estrogen receptor negative (ER−), basal-like phenotype and displayed the highest chromosomal structural complexity and chromosomal numerical instability. We found that the extreme CIN/ER− tumors were associated with improved prognosis relative to tumors with intermediate CIN70 scores in the third quartile. We also observed this paradoxical relationship between CIN and prognosis in ovarian, gastric, and non–small cell lung cancer, with poorest outcome in tumors with intermediate, rather than extreme, CIN70 scores. These results suggest a nonmonotonic relationship between gene signature expression and HR for survival outcome, which may explain the difficulties encountered in the identification of prognostic expression signatures in ER− breast cancer. Furthermore, the data are consistent with the intolerance of excessive CIN in carcinomas and provide a plausible strategy to define distinct prognostic patient cohorts with ER− breast cancer. Inclusion of a surrogate measurement of CIN may improve cancer risk stratification and future therapeutic approaches. Cancer Res; 71(10); 3447–52. ©2011 AACR.
Chromosomal instability (CIN) results in numerical and structural chromosomal complexity and is associated with poor prognosis in solid tumors (1, 2) and with the acquisition of phenotypic variation promoting drug resistance in yeast models (3). In contrast, in mammalian cells and yeast models, aneuploidy may have a negative impact upon organism fitness and proliferation (4, 5). Indeed, CIN may confer both a tumor-promoting and a tumor-suppressive function in animal model systems (6, 7). Excessive CIN can be introduced in model systems by inactivation of mitotic spindle checkpoint components; this results in gross aneuploidy and cell death (8). Accordingly, elevation of the frequency of chromosome missegregation has been proposed as a strategy to kill tumor cells (8, 9).
Such an opposing relationship suggests that there may be an optimal level of CIN for tumor progression, beyond which further instability provides no growth advantage and may even be deleterious for cancer cell survival (10) through an evolutionary scenario analogous to “mutational meltdown” in bacteria (11) or “error catastrophe” in viruses (12). However, it is not known whether CIN, over a certain threshold, may impact negatively on human tumor growth, or whether very high levels of CIN in human tumors might be associated with improved patient prognosis relative to intermediate levels. If such a relationship exists, it may have important implications for risk stratification and for future therapeutic approaches directed against CIN tumors.
The CIN70 expression signature was derived from a surrogate measure of CIN and is defined as the average expression of 70 genes that correlate with “total functional aneuploidy” in solid tumors (1). Here we demonstrate that the CIN70 expression signature correlates with both structural chromosomal complexity and numerical CIN, and we use this signature to address the relationship between CIN and outcome in cancer.
Materials and Methods
Gene expression data sets
We obtained raw microarray expression data for 13 publicly available breast cancer cohorts (13–23, GSE2109, and GSE16446), representing 2,125 individual patients. Additionally, we obtained gene expression data from 3 ovarian cancer cohorts (24–26), 2 squamous non–small cell lung cancer (NSCLC) cohorts (25, 27), and 1 gastric cancer cohort (28).
Duplicate patients were present in some cohorts, but were removed from the analysis. Censored recurrence-free or metastasis-free survival data were available for 1,168 breast cancer patients from 7 cohorts. Estrogen receptor (ER) and erythroblastic leukemia viral oncogene homolog 2 (ERBB2) status was inferred by k-medoids clustering of the expression levels of the ESR1 and ERBB2 genes, an approach that correlates well with histological assessment (29). CIN70 scores were calculated as the mean expression of the 70 probe sets matching the 70 genes of the CIN70 signature as described (1). For analysis of combined cohorts, CIN70 scores were first normalized within each cohort by centering the values, and then dividing by the standard deviation. All breast cancer samples in the combined cohorts were stratified into CIN70 quartiles according to the normalized CIN70 scores. These groups were fixed, and were not re-defined for analysis of specific subtypes. Thus, in analysis of specific subtypes, each quartile does not necessarily contain a quarter of the samples. Similarly, for nonbreast cancer cohorts, CIN70 quartiles were defined on the basis of all tumors of the given cancer type.
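The scoring and stratification steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the study (which was run in R): the function names, the use of the sample standard deviation, and the quantile interpolation method are all assumptions, and the numbers in the usage note are toy values.

```python
from statistics import mean, quantiles, stdev


def cin70_score(sample_expression, signature_probes):
    """Mean expression of the signature probe sets for one sample.

    sample_expression: dict mapping probe set ID -> expression value.
    signature_probes: IDs of the probe sets matching the signature genes.
    """
    return mean(sample_expression[p] for p in signature_probes)


def normalize_within_cohort(scores):
    """Center the scores of one cohort and divide by their standard deviation.

    Assumption: the sample standard deviation is used; the text does not say
    which estimator was applied.
    """
    mu = mean(scores)
    sd = stdev(scores)
    return [(s - mu) / sd for s in scores]


def assign_quartiles(pooled_scores):
    """Label each pooled, normalized score 1 (lowest) to 4 (highest).

    Cut points are computed once on all samples and are then kept fixed, so a
    subtype-specific subset need not contain a quarter of its samples in each
    group. Assumption: default interpolation of statistics.quantiles.
    """
    cuts = quantiles(pooled_scores, n=4)  # three cut points
    return [1 + sum(s > c for c in cuts) for s in pooled_scores]
```

In use, each cohort would be normalized separately, the normalized scores concatenated, and the quartile labels computed once on the pooled list. For example, two toy cohorts with scores [1, 2, 3, 4] and [10, 20, 30, 40] yield identical normalized values, so differences in scale between cohorts no longer influence the pooled quartile assignment.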
SNP-based measurements of CIN
Publicly available SNP data based on the Affymetrix 100K platform (30), representing 281 breast tumor specimens with paired expression data (23), were acquired from GEO. We determined the genome integrity index (GII) as described (31). To determine the total number of DNA breakpoints, we counted the number of DNA segments with an inferred log2 ratio of greater than 0.3 or less than −0.3. The total number of LOH regions was inferred from regions of allelic imbalance (AI) as described (32). To determine a combined GII/copy number/LOH score, we linearly transformed each set of scores such that its values ranged between 0 and 1, with the absence of aberrations (or a GII of 0) assigned a value of 0, and the highest number of aberrations (or the highest GII score) in our cohort assigned a value of 1. We defined the combined aberration score as the mean of the 3 transformed scores.
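The segment counting and the combined score can be illustrated as follows. This is a hedged sketch under stated assumptions, not the original implementation: the function names are hypothetical, the inputs are assumed to be plain per-tumor lists, and the linear transformation is taken to be division by the cohort maximum, which is the simplest transformation that maps 0 to 0 and the highest observed value to 1.

```python
def count_aberrant_segments(segment_log2_ratios, threshold=0.3):
    """Count segments whose inferred log2 ratio is > threshold or < -threshold."""
    return sum(1 for r in segment_log2_ratios if r > threshold or r < -threshold)


def scale_to_unit(values):
    """Linearly rescale so that 0 stays 0 and the cohort maximum becomes 1.

    Assumption: all values are non-negative (counts or a proportion).
    """
    top = max(values)
    if top == 0:
        return [0.0 for _ in values]  # no aberrations anywhere in the cohort
    return [v / top for v in values]


def combined_aberration_score(gii, n_breakpoints, n_loh_regions):
    """Per-tumor mean of the three rescaled measures.

    Each argument is a list with one value per tumor, in the same tumor order.
    """
    scaled = [scale_to_unit(m) for m in (gii, n_breakpoints, n_loh_regions)]
    return [sum(per_tumor) / 3 for per_tumor in zip(*scaled)]
```

Because each measure is rescaled before averaging, the three contribute equally to the combined score regardless of their original units.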
For the breast cancer cohorts and the gastric cancer cohort, survival analysis was performed with time to relapse or, if not available, time to distant metastasis as outcome variable. For the ovarian and squamous lung cohorts, the outcome variable was overall survival. Survival curves were calculated using the Kaplan–Meier method. For univariate Cox regression, significance was estimated with a log-rank test. For multivariate regression, significance was estimated using a chi-squared test. In the meta-analyses, the summary estimate was calculated as a weighted average of the individual estimates, where summary weights were calculated as the reciprocal of the variance defined as the squared standard errors. All data analysis and statistics were performed in the R statistical environment version 2.11. All P values are 2-sided. Full methods are available in the Supplementary Methods section.
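The weighting scheme used for the meta-analysis summary estimate can be written out as a short sketch. The weighted average with reciprocal-variance weights follows the description above; the function name is hypothetical, the assumption that the per-cohort estimates are log hazard ratios is ours, and the standard error of the summary uses the usual fixed-effect formula, which is not stated in the text.

```python
import math


def inverse_variance_summary(estimates, standard_errors):
    """Weighted average of per-cohort estimates with weights 1 / SE^2.

    estimates: per-cohort effect estimates (assumed here to be log hazard ratios).
    standard_errors: the matching standard errors.
    Returns (summary_estimate, summary_standard_error).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    summary = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    # Standard fixed-effect result; an assumption, not taken from the text.
    summary_se = math.sqrt(1.0 / total_weight)
    return summary, summary_se
```

If the inputs are log hazard ratios, the summary HR is obtained as math.exp(summary). Cohorts with smaller standard errors, typically the larger ones, dominate the summary.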
CIN70 signature is a surrogate measure of structural chromosomal complexity and numerical CIN in vivo
To assess structural chromosomal complexity, we used a published cohort of 281 primary breast cancer tumors, 271 of which were profiled with paired SNP and mRNA expression arrays (23, 30). From the mRNA data, we calculated the CIN70 score. From the SNP data, we calculated 3 summary measures of structural chromosomal complexity: (i) the GII, a measure of the proportion of the cancer genome subject to copy number aberrations (31); (ii) the total number of DNA copy number changes; and (iii) the total number of genomic regions with AI (32), a measure correlated with loss of heterozygosity (LOH). Each of the 3 measures was positively and significantly correlated with the CIN70 score (Supplementary Fig. S1, a–c). We combined these 3 DNA-based measures into a single “structural chromosomal complexity” measurement (see the Supplementary Methods section), which was more strongly correlated with the CIN70 score than any of the individual measures (Supplementary Fig. S1d). We then stratified patients into CIN70 score quartiles and found that the tumors with the highest CIN70 score quartile displayed the highest mean structural chromosomal complexity (Fig. 1A), both in all tumors (P = 1.8 × 10^−6, t test) and in ER−/ERBB2− tumors only (P = 0.0002).
To address the relationship between the CIN70 score and numerical CIN, we analyzed a cohort of 44 breast cancers with paired DNA image cytometry and gene expression measurements (33) and stratified patients into CIN70 score quartiles. DNA image cytometry provides 2 relevant readouts: DNA index as a proxy for DNA ploidy; and stemline scatter index, discriminating genomically stable from genomically unstable tumors, providing a measure of numerical CIN (33). We found only a borderline significant enrichment of high DNA index in the highest CIN70 quartile (P = 0.08, t test; Fig. 1B). In contrast, we found a strong enrichment for genomically unstable tumors in the highest CIN70 quartile (OR = 15.4, P = 0.0006, Fisher's exact test; Fig. 1C), indicating that the CIN70 score more closely reflects numerical CIN than ploidy status. Together, these results demonstrate that the CIN70 score reflects both structural chromosomal complexity and numerical CIN. Furthermore, the greatest CIN was observed in the highest CIN70 score quartile, which we defined as CINextreme.
CIN70 distribution in breast tumor subtypes
To assess whether higher-risk breast cancer subtypes exhibit increased levels of tumor CIN, we investigated the distribution of the CIN70 score in ER+, ER−/ERBB2−, and ERBB2+ breast cancer subtypes in a pooled analysis of 2,125 patients from 13 publicly available microarray expression data sets. As ER and ERBB2 status were not available for all samples, we inferred ER and ERBB2 subtype from gene expression data using published methods (29). In general, ER+ tumors displayed the lowest average CIN70 score, whereas ER−/ERBB2− tumors displayed the highest average and the broadest distribution of CIN70 scores (Fig. 2A). The majority of ER−/ERBB2− tumors occur in the CINextreme, highest CIN70 score quartile, whereas ER+ breast cancers encompass the lower CIN70 score quartiles (Fig. 2B). We also analyzed CIN70 scores in tumors stratified by intrinsic subtype and noted higher scores in basal-like tumors (P < 10^−16, Supplementary Fig. S2a and b; ref. 34).
Nonmonotonic relationship between CIN70 score and prognosis in ER−/ERBB2− breast cancer
As we previously reported (1), a CIN70 score higher than the median was associated with significantly poorer outcome when considering all patients (Supplementary Fig. S3). However, we hypothesized that if animal and eukaryotic cell models describing intolerance of high levels of CIN were relevant to human breast cancer (4, 5, 8), then excessive CIN, determined by the CIN70 expression signature in the CINextreme quartile, might be associated with improved clinical outcome in breast cancer patient cohorts. Such an effect would manifest as a nonmonotonic relationship, showing first increasing then decreasing risk of relapse as a function of increasing CIN70 score. ER−/ERBB2− breast cancer is considered a high-risk subtype and displays both the greatest range of CIN70 scores and the highest frequency of CINextreme tumors, providing a relevant patient cohort to address whether CINextreme might be associated with improved prognosis relative to tumors with intermediate CIN.
In the combined ER−/ERBB2− cohort of 265 patients, we observed better clinical outcome for patients with CINextreme tumors (highest CIN70 score quartile) compared with patients with tumors in the third CIN70 score quartile (Fig. 3A, HR = 0.55, P = 0.021, log-rank test). To confirm that this finding was not an artifact of the combined cohort, we also performed a meta-analysis of the individual data sets, with a similar result (Supplementary Fig. S4, HR = 0.51, P = 0.017). When the cohorts were separated into those patients treated with and without adjuvant therapy, the same relationship was observed, indicating that improved outcome in CINextreme tumors appears to be independent of treatment (data not shown). In a multivariate analysis of ER−/ERBB2− patients including age, grade, nodal status, size, and CINextreme versus third CIN70 score quartiles, CINextreme was a significant predictor of improved recurrence-free survival (HR = 0.40, P = 0.03, Supplementary Table S1). This difference in predicted relapse-free survival between tumors in the CINextreme cohort compared with tumors in the third CIN70 score quartile was not detected with Adjuvant!Online (Supplementary Fig. S5, P = 0.60, t test), indicating that standard histopathological approaches do not distinguish these 2 groups.
CINextreme is associated with improved prognosis in ovarian cancer, squamous NSCLC, and gastric adenocarcinoma
To investigate whether the improved survival observed for patients with CINextreme tumors was specific for ER−/ERBB2− breast cancers or was a more general phenomenon in epithelial carcinomas, we acquired data from 3 ovarian carcinoma cohorts, 2 squamous NSCLC cohorts, and a gastric cancer cohort (see Supplementary Methods section). In the ovarian carcinoma meta-data set, CINextreme tumors were associated with improved recurrence-free survival relative to tumors in the third CIN70 score quartile (Fig. 3B), as observed in the breast carcinoma cohorts. A similar relationship was observed in squamous NSCLC, although CINextreme was associated with improved prognosis relative to the 2 middle CIN70 score quartiles (Fig. 3C). In the gastric cancer cohort, CINextreme was associated with similar prognosis to the 2 lowest CIN quartiles, and the third CIN quartile was associated with the poorest prognosis (Fig. 3D).
When we specifically assessed the HR for relapse for each of the 4 CIN quartiles relative to patients in the remaining 3 quartiles in a meta-analysis of the 1,297 patients across the 4 epithelial carcinoma subtypes (ER−/ERBB2− breast, ovarian, squamous NSCLC, and gastric cancer), the CINextreme quartile was associated with a significantly improved HR (HR = 0.70, P = 0.0001), in contrast to the third CIN70 score quartile carcinomas which were associated with a significantly worse HR for relapse (HR = 1.32, P = 0.001; Fig. 4; Supplementary Fig. S6a–d). This result is consistent with a nonmonotonic relationship between CIN and outcome, with carcinomas in the CINextreme score quartile having the best prognosis, whereas carcinomas in the intermediate third CIN70 score quartile are associated with the poorest clinical outcome.
Our results based on the CIN70 signature support the hypothesis that although genomic instability improves cancer cell biological fitness and may impact adversely on prognosis, excessive genomic instability may surpass a threshold compatible with cell viability (10). This suggestion is supported by evidence that aneuploidy in eukaryotic cell systems negatively impacts upon cell biological fitness (4, 5) and excessive CIN may induce cell autonomous lethality (8, 9). Given the nonmonotonic nature of the relationship between CIN expression and prognosis, therapeutic strategies to modulate genomic instability may provide a rational approach for future drug development (8, 9). We have presented evidence that the CIN70 signature correlates with both structural chromosome complexity, an expected finding due to the derivation of this signature from a surrogate measure of CIN known as “total functional aneuploidy” (1), and numerical CIN measured by DNA image cytometry, and thus may serve as a stratification tool for genome instability within clinical cohorts.
Development of robust methods to predict patient outcome in ER− breast cancer is of paramount importance to enhance personalized treatment stratification approaches in this disease. Prognostic expression signatures in ER+ breast cancer such as those underlying the Mammaprint and Oncotype DX tests have been shown to predict tumor genome instability status (33) but have limited prognostic value in ER− breast cancer (35). Our results derived from a plausible biological hypothesis, demonstrating a nonmonotonic relationship between CIN70 expression and clinical outcome, begin to define an ER− patient cohort with extreme CIN associated with good prognosis and a cohort with intermediate CIN associated with the worst prognosis, independent of standard histopathological criteria. This nonmonotonic relationship between expression and survival provides a basis for the difficulties encountered in applying “prognostic” gene expression signature sets, which also predict genome instability status, to ER− breast cancer outcome (35).
In summary, classification of tumor CIN status might be considered for incorporation into new prognostic models in cancer, both to validate this approach and to form the basis for novel treatment strategies that are directed against this pattern of genome instability and improve patient prognosis and clinical outcome.
Disclosure of Potential Conflicts of Interest
The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the NIH.
C. Swanton is a senior Medical Research Council (MRC) clinical research fellow. This work was funded by CR-UK, MRC, NIH (grants NCI SPORE P50 CA 89393, R21LM008823-01A1) and by the Breast Cancer Research Foundation. N.J. Birkbak was funded by the Danish Council for Independent Research-Medical Sciences (FSS). Some results are in whole or part based upon data generated by The Cancer Genome Atlas Pilot Project established by the NCI and NHGRI (24).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
- Received October 12, 2010.
- Revision received December 3, 2010.
- Accepted January 4, 2011.
- ©2011 American Association for Cancer Research.