Cell, Tumor, and Stem Cell Biology
1 UMR599 Inserm, Institut Paoli-Calmettes, Laboratoire d'Oncologie Moléculaire, Centre de Recherche en Cancérologie de Marseille, 2 Département de BioPathologie, 3 Centre de Ressources Biologiques, 4 Département d'Oncologie Médicale, Institut Paoli-Calmettes, 5 Faculté de Médecine, Université de la Méditerranée, and 6 Département d'Anatomopathologie, Hôpital Nord, Marseille, France
Requests for reprints: Daniel Birnbaum, UMR599 Inserm, 27 Bd. Leï Roure, 13009 Marseille, France. Phone: 33-4-91-75-84-07; Fax: 33-4-91-26-03-64; E-mail: birnbaum{at}marseille.inserm.fr.
| Abstract |
|---|
| Introduction |
|---|
500 genes, are variably associated with different histologic types and with different prognosis. Luminal A breast cancers, which express hormone receptors, have an overall good prognosis and can be treated by hormone therapy. ERBB2-overexpressing breast cancers, which overexpress the ERBB2 tyrosine kinase receptor, have a poor prognosis and can be treated by targeted therapy using trastuzumab or lapatinib (6, 7). No specific therapy is available against the other subtypes, although the prognosis of basal and luminal B tumors is poor. This biologically relevant taxonomy remains imperfect because clinical outcome may be variable within each subtype, suggesting the existence of unrecognized subgroups. Progress can be made in several directions. First, it is necessary to identify among good prognosis tumors, such as luminal A breast cancers, the ones that will relapse and metastasize. Second, a better definition of poor prognosis breast cancers and associated target genes will allow the development of new drugs that will in turn allow a better management of these cancers. We have here established the gene expression profiles of a series of breast tumor samples. To provide clues on both potential new prognostic and therapeutic targets, we have specifically focused our analysis on two major breast cancer subtypes with opposite prognosis (luminal A and basal) and on genes encoding protein kinases.
The human kinome constitutes about 1.7% of all human genes (8) and represents a great part of genes whose alteration contributes to oncogenesis (9). Protein kinases mediate most signal transduction pathways in human cells and play a role in most key cell processes. Some kinases are activated or overexpressed in cancers and constitute targets for successful therapies (10). In parallel to ongoing systematic sequencing projects (11), analysis of differential expression of kinases in cancers may identify new oncogenic activation pathways. As such, kinases represent an attractive focus for expression profiling in two important subtypes of breast cancer.
| Materials and Methods |
|---|
In addition, we profiled RNA extracted from eight cell lines that provided models for cell types encountered in mammary tissues: three luminal epithelial cell lines (HCC1500, MDA-MB-134, and ZR-75-30), three basal epithelial cell lines (HME-1, HMEC-derived 184B5, and MDA-MB-231), and two lymphocytic B- and T-cell lines (Daudi and Jurkat, respectively). All cell lines were obtained from American Type Culture Collection7 and were grown as recommended.
Gene expression profiling with DNA microarrays. Gene expression analyses were done with Affymetrix U133 Plus 2.0 human oligonucleotide microarrays containing >47,000 transcripts and variants, including 38,500 well-characterized human genes. Preparation of cRNA from 3 µg total RNA, hybridizations, washes, and detection were done as recommended by the supplier.8 Scanning was done with Affymetrix GeneArray scanner, and quantification was done with Affymetrix GCOS software. Hybridization images were inspected for artifacts.
Gene expression data analysis. Expression data were analyzed by the Robust Multichip Average method in R using Bioconductor and associated packages (12). Before analysis, a filtering process removed genes with low and poorly measured expression, defined as an expression value below 100 units in all 227 breast cancer tissue samples, retaining 31,189 genes/ESTs.
Before unsupervised hierarchical clustering, a second filter excluded genes showing low expression variation across the 227 samples, defined as an SD below 0.5 log2 units (only for calculation of SD, values were floored to 100 because expression variation in this low range cannot be discriminated with confidence), retaining 14,486 genes/ESTs. Data were then log2 transformed and submitted to the Cluster program (13) using data median centered on genes, Pearson correlation as similarity metric, and centroid linkage clustering. Results were displayed using the TreeView program (13). Quality threshold clustering identifies sets of genes with highly correlated expression patterns within the hierarchical clustering. It was applied to the kinase probe sets and to the basal and luminal A tumors using the TreeView program (13). The cutoffs for minimal cluster size and minimal correlation were 15 and 0.7, respectively. The gene clusters were interrogated using Ingenuity software to assess significant representation of biological pathways and functions.
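The variation filter and gene-wise median centering described above can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline (which used R and the Cluster program). The function names, the toy probe identifiers, and the choice of the sample SD (rather than the population SD) are assumptions; only the floor of 100, the 0.5 log2 cutoff, median centering, and Pearson correlation come from the text.

```python
import math
import statistics

FLOOR = 100.0      # floor applied only for the SD calculation (from the text)
SD_CUTOFF = 0.5    # minimum SD in log2 units (from the text)


def log2_sd(raw_values):
    """SD of log2 expression after flooring raw values at FLOOR."""
    floored = [math.log2(max(v, FLOOR)) for v in raw_values]
    return statistics.stdev(floored)  # sample SD: an assumption


def filter_and_center(expr):
    """Keep variable probes, log2 transform, and median center each gene.

    expr maps a probe id to its raw expression values, one per sample.
    """
    kept = {}
    for probe, raw in expr.items():
        if log2_sd(raw) < SD_CUTOFF:
            continue  # low variation across samples: excluded
        logged = [math.log2(v) for v in raw]  # no flooring outside the SD step
        med = statistics.median(logged)
        kept[probe] = [x - med for x in logged]
    return kept


def pearson(x, y):
    """Pearson correlation, the similarity metric named in the text."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data: one nearly flat probe and one variable probe.
toy = {
    "probe_flat": [100, 110, 120, 90],
    "probe_variable": [100, 400, 1600, 6400],
}
centered = filter_and_center(toy)
```

In this toy input only the variable probe survives the filter, and its centered log2 values are symmetric around zero.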
Definition of kinase-encoding probe sets. The kinome database established by Manning et al. (8) was used as reference to extract the kinase-encoding genes from the Affymetrix Genechip U133 Plus 2.0. First, because Human Genome Organization symbols did not necessarily correspond between the genes represented on the Affymetrix chip and the kinome, we used the mRNA accession number as cross-reference. cDNA sequences of the kinome were compared with the representative mRNA sequences of the Unigene database using BLASTn, and alignments between these sequences were obtained. All mRNAs with an exact match were retained, and their accession numbers were compared with those of the 31,189 selected probe sets given by Affymetrix. Second, some kinase genes were represented by several probe sets on the Affymetrix chip. This may bias the weight of the groups of genes in the analysis by quality threshold clustering. In these cases, probe sets with the extension «_at» were preferred, then those with «s_at», then those with any other extension. When several probe sets with the best extension were available, the one with the highest median value was retained. From the initial list of 518 kinases, we finally retained 435 probe sets representing 435 kinase genes (Supplementary Table S2).
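The rule for choosing one probe set per kinase gene can be illustrated with a short Python sketch. The function names and probe identifiers are hypothetical, and the pattern used to recognize a plain «_at» extension (digits followed by `_at`) is an assumption about Affymetrix identifier conventions, not something stated in the text.

```python
import re
import statistics


def extension_rank(probe_id):
    """Lower rank is preferred: plain _at, then _s_at, then everything else."""
    if re.fullmatch(r"\d+_at", probe_id):
        return 0
    if probe_id.endswith("_s_at"):
        return 1
    return 2


def pick_probe_set(candidates):
    """Choose one probe set for a gene.

    candidates maps a probe id to its expression values across samples.
    Ties on extension are broken by the highest median expression.
    """
    return min(
        candidates,
        key=lambda pid: (extension_rank(pid), -statistics.median(candidates[pid])),
    )


# Hypothetical probe sets for a single gene.
gene_probes = {
    "1000_s_at": [500, 600, 700],
    "1001_x_at": [900, 900, 900],
    "1002_at": [100, 200, 300],
}
chosen = pick_probe_set(gene_probes)
```

Here the plain «_at» probe set is chosen even though its median is the lowest, because the extension takes priority over expression level.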
Collection of published datasets. To test the performance of our multigene signature in other breast cancer samples, we analyzed three major publicly available data sets: van de Vijver et al. (14),9 Wang et al. (15) collected from National Center for Biotechnology Information (NCBI)/Genbank GEO database (series entry GSE2034), and Loi et al. (16) collected from NCBI/Genbank GEO database (series entry GSE6532). Analysis of each data set was done in several successive steps: identification of molecular subtypes based on the common intrinsic gene set, identification of the kinase gene set common with ours, followed by computing of the kinase score (see below) for the luminal A samples. Clinical data of luminal A samples from our series and public series used for analyses are detailed in Supplementary Table S3.
Statistical analyses. We defined a score, called the kinase score, which was based on the expression level of 16 kinase genes. It was defined as
[Equation image not available in the source.]
The samples included in the statistical analysis (luminal A subtype) were estrogen receptor (ER) and/or progesterone receptor (PR) positive as defined by immunohistochemistry. We introduced two qualitative variables based on the mRNA expression level of ER and PR (ESR1 probe set 205225_at and PGR probe set 208305_at): the cutoff for classifying a tumor as ESR1 rich or poor, or as PGR rich or poor, was the median expression level of the corresponding probe set. The two probe sets were chosen using the same criteria as described above.
Correlations between sample groups and histoclinical factors were calculated with Fisher's exact test for qualitative variables with discrete categories and the Wilcoxon test for continuous variables. Follow-up was measured from the date of diagnosis to the date of last follow-up for patients without relapse. Relapse-free survival (RFS) was calculated from the date of diagnosis until the date of first relapse, whatever its location (local, regional, or distant), using the Kaplan-Meier method, and compared between groups with the log-rank test. The univariate and multivariate analyses were done using Cox regression analysis. The P values were based on the log-rank test, and patients with one or more missing values were excluded. All statistical tests were two sided at the 5% level of significance. Statistical analysis was done using the survival package (version 2.30) in the R software (version 2.4.1).10
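For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate of RFS can be sketched in a few lines of Python. This is an illustration of the estimator only, not of the R survival package used in the study, and it does not implement the log-rank test or Cox regression. The function name and the toy follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of relapse-free survival.

    times:  follow-up in months from diagnosis, one value per patient
    events: 1 if the patient relapsed at that time, 0 if censored
    Returns a list of (time, survival probability) at each relapse time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        relapses = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - relapses / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort of five patients; two are censored.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why follow-up for patients without relapse is measured to the last contact date.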
| Results |
|---|
Whole-kinome expression profiling separates basal and luminal A breast cancers. We wanted to identify kinase genes whose differential expression is associated with clinical outcome. We focused our analysis on two major subtypes of breast cancer with opposite prognosis, the basal and the luminal A subtypes. From our subtyping, we selected a series of 138 breast cancer samples with available full histoclinical annotations, including 80 luminal A and 58 basal breast cancers. We identified a total of 435 unique Affymetrix probe sets for 435 kinases as satisfying simultaneously presence, quality, and reliability (Supplementary Tables S2 and S4). A hierarchical clustering analysis was applied to these probe sets and 138 breast cancers and 8 cell lines (Fig. 1A). The tumors displayed heterogeneous expression profiles. They were sorted into two large clusters, which nearly perfectly correlated with the molecular subtype, with all but one of the basal breast cancers in the left cluster and all but one of the luminal A breast cancers in the right cluster (Fig. 1B). Visual inspection revealed at least four clusters of related genes responsible for much of the subdivision of samples into two main groups. They are zoomed in Fig. 1C. The first cluster was enriched in genes involved in cell cycle and mitosis. It was overexpressed in basal overall compared with luminal A tumors and in cell lines compared with cancer tissue samples. The second gene cluster included many genes involved in immune reactions. It was expressed at heterogeneous levels in both luminal A and basal tumors, and was overexpressed in lymphocytic cell lines compared with epithelial cell lines. The third and the fourth clusters were strongly overexpressed in luminal A overall compared with basal breast cancer samples. The third cluster included genes involved in transforming growth factor (TGF)β signaling as well as transmembrane tyrosine kinase receptors. Gene ontology analysis using Ingenuity software confirmed these data with significant overrepresentation (right-tailed Fisher's exact test) of the functions "cell cycle" (P = 4.6E–07) and "DNA replication, recombination, and repair" (P = 6.1E–05) in the first cluster, "immune response" (P = 8.1E–10) and "cellular growth and proliferation" (P = 8.1E–10) in the second cluster, and "tumor morphology" (P = 2.2E–04) and "nervous system development and function" (P = 2.3E–04) in the third cluster. Analysis of canonical pathways showed overrepresentation of "G2-M transition of the cell cycle" (P = 6.8E–08), "nuclear factor-κB signaling pathway" (P = 1.3E–04), and "TGFβ signaling" (P = 4E–03) in the first, second, and third clusters, respectively. No correlation was found between these gene clusters and the nine kinase families (AGC, CAMK, CK1, CMGC, RGC, STE, TK, TKL, and Atypical) or the chromosomal location of genes.
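The right-tailed Fisher's exact test used for overrepresentation is equivalent to the upper tail of a hypergeometric distribution. The Python sketch below shows that calculation under stated assumptions: the function name and the toy counts are hypothetical, and the reference gene universe that Ingenuity actually uses is not described in the text.

```python
import math


def fisher_right_tail(k, n, K, N):
    """P(at least k annotated genes in a cluster of n genes).

    N: genes in the reference universe
    K: genes in the universe annotated with the function
    n: genes in the cluster
    k: genes in the cluster annotated with the function
    """
    upper = min(n, K)
    # Sum integer numerators first, then divide once by the total count.
    numerator = sum(
        math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return numerator / math.comb(N, n)


# Hypothetical toy example: 3 of 10 genes carry the annotation,
# and a cluster of 4 genes contains 2 of them.
p_value = fisher_right_tail(k=2, n=4, K=3, N=10)
```

A small P value means that the cluster contains more annotated genes than expected from random sampling of the universe, which is how the functions listed above were called significant.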
Kinase gene expression identifies two subgroups of luminal A breast cancers. As shown in Fig. 1, basal breast cancers constituted a rather homogeneous cluster, whereas luminal A breast cancers were more heterogeneous. Basal and luminal breast cancers were distinguished by the differential expression of clusters of genes. By using quality threshold clustering, we identified a single significant cluster principally responsible for this discrimination (Fig. 1B), corresponding to the first cluster described above. It contained 16 kinase genes (Table 1), which were overexpressed in all basal breast cancers and some luminal A samples, and underexpressed in most luminal A samples (Fig. 1B).
Characteristics and prognosis of the two subgroups of luminal A breast cancers. The histoclinical characteristics of the two luminal A subgroups are listed in Table 2. Strikingly, they shared most features but differed in Scarff-Bloom-Richardson (SBR) grade, with more grade III in the Ab subgroup and more grade I to II in the Aa subgroup. Ki67 expression did not distinguish Ab from Aa cases, but three-fourths of luminal Ab were Ki67-positive. In conclusion, no factor but grade could distinguish Aa from Ab breast cancers.
We then compared the prognostic ability of our kinase score–based classifier with other histoclinical factors (age, pathologic tumor size, SBR grade, axillary lymph node status, immunohistochemical P53 and Ki67 status, and ESR1 and PGR mRNA levels) in our 80 luminal A samples (Table 3A). In univariate and multivariate Cox analyses, the only factor that correlated with RFS was the kinase score–based classifier. The hazard ratio for relapse was 7.77 for luminal Ab tumors compared with luminal Aa tumors [95% confidence interval (CI), 1.97–30.66; P = 0.003].
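The reported hazard ratio and confidence interval can be checked for internal consistency with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is the usual output of Cox regression software; the text states that P values were based on the log-rank test, so the P value recovered here is only expected to be close to the reported one. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr, lower, upper):
    """Recover the standard error, z statistic, and two-sided P value.

    Assumes the interval is exp(log(hr) +/- Z_95 * se).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Values reported in the text for luminal Ab versus luminal Aa tumors.
se, z, p = wald_from_ci(7.77, 1.97, 30.66)

# On the log scale the point estimate sits midway between the bounds,
# so the geometric mean of the bounds should reproduce the hazard ratio.
midpoint = math.sqrt(1.97 * 30.66)
```

Under this assumption the geometric mean of the bounds agrees with the reported hazard ratio of 7.77, and the recovered P value is close to the reported 0.003. The wide interval reflects the small number of relapses in 80 samples.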
Samples from the three studies were pooled before prognostic analyses. Histoclinical correlations of the two subgroups were similar to those found in our series (Supplementary Table S6). We then compared RFS of the two luminal A subgroups in the 276 samples. With a median follow-up of 104 months after diagnosis, luminal Ab tumors were associated with a worse prognosis than luminal Aa tumors, with 5-year RFS of 73% for luminal Ab and 90% for luminal Aa tumors (P = 6.3E–6, log-rank test; Fig. 2D). For comparison, 5-year RFS was 64% in basal samples in the three pooled series.
We also performed univariate and multivariate survival analyses (Table 3B). The series of Wang et al. (15; 79 luminal A samples) was analyzed separately because histoclinical data were not available. In univariate analysis, the hazard ratio for relapse was 4.84 for luminal Ab tumors compared with luminal Aa tumors (95% CI, 2.13–11.00; P = 1.7E–04). The two other series were merged for analyses (197 luminal A samples). Three variables (pathologic tumor size, PGR mRNA expression level, and kinase score–based subgrouping) were significantly associated with RFS in univariate analysis. In multivariate analysis, only the kinase score–based classifier retained significant prognostic value, confirming the prominence of the kinase score over the SBR grade and other variables. The hazard ratio for relapse was 2.48 for luminal Ab tumors compared with luminal Aa tumors (95% CI, 1.37–4.50; P = 0.002).
Kinase score and molecular subtypes. We then studied the association of the kinase score with the intrinsic molecular subtypes. We merged all data sets, including our 227 tumors, the 295 tumors of van de Vijver et al. (14), the 414 tumors of Loi et al. (16), and the 286 tumors of Wang et al. (15), resulting in a total of 1,222 tumors. The kinase score and molecular subtypes were determined for all tumors: 367 tumors were luminal A, 99 luminal B, 172 ERBB2-overexpressing, 214 basal, 161 normal-like, and 209 unassigned. We computed and compared the distribution of the kinase score in each subtype. As shown in Fig. 3A, most of the luminal A and normal-like tumors had a negative kinase score, whereas most of the basal and luminal B tumors had a positive kinase score. All pairwise comparisons of kinase score between the five subtypes were significant (P < 0.05; t test; data not shown). ERBB2-overexpressing and unassigned samples were equally distributed with respect to their kinase score. The luminal Ab tumors displayed a median kinase score intermediate between that of luminal B tumors, to which it was closer, and that of luminal Aa tumors.
A continuum in luminal breast cancers. The luminal Ab tumors displayed an intermediate kinase score pattern between luminal Aa tumors and luminal B tumors (Fig. 3B). Comparison of histoclinical features between luminal Aa, luminal Ab, and luminal B samples in the three public data sets confirmed this finding (Supplementary Table S6), with a significant increase from luminal Aa to luminal Ab to luminal B for pathologic tumor size and rate of relapse, and a significant decrease for grade, mRNA expression level of ESR1 and PGR, and 5-year RFS. These results confirm that luminal Aa and Ab represent new clinically relevant subgroups of breast cancers until now unrecognized and suggest a continuum between these three subgroups.
| Discussion |
|---|
The breast cancer kinome differs between luminal A and basal subtypes. As an exploratory step, we applied hierarchical clustering to 435 kinase genes. We found that luminal A and basal tumors had different global kinome expression patterns, with some degree of transcriptional heterogeneity within luminal A tumors. This observation suggests differential expression of many kinases and, consequently, different phosphorylation programs between the two subtypes. This result is not unexpected because kinases are involved in numerous pathways and many genes are differentially expressed between luminal A and basal breast cancers, which display numerous differentially activated functions.11 Global clustering revealed broad coherent kinase clusters corresponding to cell processes (proliferation and differentiation) or to cell type (immune response), with overexpression of the proliferation cluster in basal samples and of the differentiation cluster in luminal A samples.
Mitotic kinases identify two subgroups of luminal A breast cancers. We identified a set of 16 genes sufficient to distinguish basal from luminal A tumors. Interestingly, a kinase score based on their expression distinguished two subgroups of luminal A tumors (Aa and Ab) with different survival. Identified in our tumor series, this classification and its prognostic effect were validated in 276 luminal A cases from three independent series profiled across different microarray platforms. Importantly, the kinase score outperformed the current prognostic factors in univariate and multivariate analyses in both training and validation sets.
Analysis of molecular function and biological processes revealed that the prognostic value of this kinase signature is mainly related to proliferation. Indeed, the 16 genes encode kinases involved in G2 and M phases of the cell cycle. Aurora-A and Aurora-B are two major kinases regulating mitosis and cytokinesis, respectively. Budding inhibited by benzimidazole (BUB1), BUB1B, checkpoint kinase 1 (CHEK1), polo-like kinase (PLK)1, never in mitosis kinase 2 (NEK2), and TTK/MPS1 play key roles in the various cell division checkpoints. PLK4 is involved in centriole duplication. CDC2/CDK1 is a major component of the cell cycle machinery in association with mitotic cyclins. CDC7, maternal embryonic leucine zipper kinase (MELK), and vaccinia-related kinase 1 (VRK1) are regulators of the S-G2 and G2-M transitions. SRPK1 regulates splicing. Not much is known about microtubule-associated serine/threonine kinase–like (MASTL) and PBK kinases.
Prognostic gene expression signatures related to grade (18, 19) or proliferation (20) have been reported. We found 8 and 10 of our 16 kinase genes, respectively, in the lists of genes differentially expressed in grade I versus grade III breast cancers reported by Sotiriou et al. (97 genes; ref. 18) and Ivshina et al. (264 genes; ref. 19). Three kinase genes, AURKA, AURKB, and BUB1, are included in a prognostic set of 50 cell cycle–related genes (20), and AURKB is one of the five proliferation genes included in the Recurrence Score defined by Paik et al. (21). Furthermore, proliferation seems to be the most prominent predictor of outcome in many other published prognostic gene expression signatures (22). This link of our signature with proliferation also explains the correlation of our luminal A subgrouping with histologic grade, which is in part based on a mitotic index. Interestingly, however, comparison with Ki67 and grade showed that our mitotic kinase signature performed better in identifying these tumors and predicting the survival of patients.
Mitotic kinases as therapeutic targets. Targeting cell proliferation is a main objective of anticancer therapeutic strategies. Kinases have proven to be successful targets for therapies. Mitotic kinases have stimulated intense work focused on identifying novel antimitotic drugs. Some of those included in our signature are targets under investigation (23). For example, targeting of Aurora kinases is a promising way of treating tumors (24). Clinical trials of four Aurora kinase inhibitors are ongoing in the United States and Europe: MK0457 and PHA-739358 inhibit Aurora-A and Aurora-B, MLN8054 selectively inhibits Aurora-A, and AZD1152 selectively inhibits Aurora-B. Similarly, small-molecule inhibitors of PLK1, such as ON01910 and BI2536, are being tested (25), as well as flavopiridol (inhibitor of the cyclin-dependent kinase CDC2) and UCN-01 (inhibitor of CHEK1). Other less studied but potential therapeutic targets include TTK, BUB, and NEK proteins (26).
A new relevant subgroup of luminal A breast cancers. Despite their relatively good prognosis compared with luminal B tumors, luminal A tumors display a heterogeneous clinical outcome after treatment, which generally includes hormone therapy. It is important to define the cases that may evolve unfavorably, all the more so because different types of hormone therapy, chemotherapy, and targeted molecular therapy are available. Our poor prognosis subgroup of luminal A tumors (Ab cases) is characterized by high mitotic activity compared with other luminal A tumors (Aa cases). Any error in the key steps in division regulated by these kinases (centrosome duplication, spindle checkpoint, microtubule-kinetochore attachment, chromosome condensation and segregation, and cytokinesis) may lead to aneuploidy and progressive chromosomal instability. This may in part explain the high grade and poor prognosis of these tumors.
In fact, the luminal Ab subgroup displayed clinical characteristics and a kinase score intermediate between the luminal Aa subgroup and the luminal B subtype. These subgroups were not previously recognized by Sorlie's intrinsic gene set. We interpret this finding as follows. The intrinsic set distinguishes a large proportion of luminal B cancers but is unable to pick out all proliferative cases. A small proportion of cases is left to cluster with the luminal A cases and is, therefore, labeled luminal A. An explanation for the poor efficacy of Sorlie's set in defining all proliferative luminal cases may be the low number of genes involved in proliferation, including a very low number of kinases. Our mitotic kinase signature makes it possible to identify all proliferative luminal cases and reveals a continuum of luminal cases from the more proliferative (luminal B) to the less proliferative (luminal Aa). Reciprocally, there may be a gradient of luminal differentiation giving a continuum of luminal breast cancers, including, from poorly differentiated to highly differentiated, luminal B, Ab, and Aa (Fig. 3B). Optimal response to hormone therapy would be obtained with luminal Aa breast cancers, whereas luminal B and Ab would benefit from chemotherapy and/or new drugs targeting the cell cycle and various kinases as discussed above.
| Acknowledgments |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
| Footnotes |
|---|
9 Collected from http://microarrays-pub.stanford.edu/wound_NKI/
10 http://www.cran.r-project.org
Received 9/18/07. Revised 11/9/07. Accepted 11/29/07.
| References |
|---|