Abstract
Activation of the EGFR, KRAS, and ALK oncogenes defines 3 different pathways of molecular pathogenesis in lung adenocarcinoma. However, many tumors lack activation of any pathway (triple-negative lung adenocarcinomas) posing a challenge for prognosis and treatment. Here, we report an extensive genome-wide expression profiling of 226 primary human stage I–II lung adenocarcinomas that elucidates molecular characteristics of tumors that harbor ALK mutations or that lack EGFR, KRAS, and ALK mutations, that is, triple-negative adenocarcinomas. One hundred and seventy-four genes were selected as being upregulated specifically in 79 lung adenocarcinomas without EGFR and KRAS mutations. Unsupervised clustering using a 174-gene signature, including ALK itself, classified these 2 groups of tumors into ALK-positive cases and 2 distinct groups of triple-negative cases (groups A and B). Notably, group A triple-negative cases had a worse prognosis for relapse and death, compared with cases with EGFR, KRAS, or ALK mutations or group B triple-negative cases. In ALK-positive tumors, 30 genes, including ALK and GRIN2A, were commonly overexpressed, whereas in group A triple-negative cases, 9 genes were commonly overexpressed, including a candidate diagnostic/therapeutic target DEPDC1, that were determined to be critical for predicting a worse prognosis. Our findings are important because they provide a molecular basis of ALK-positive lung adenocarcinomas and triple-negative lung adenocarcinomas and further stratify more or less aggressive subgroups of triple-negative lung ADC, possibly helping identify patients who may gain the most benefit from adjuvant chemotherapy after surgical resection. Cancer Res; 72(1); 100–11. ©2011 AACR.
Introduction
Lung cancer is the leading cause of cancer death worldwide (1, 2). Adenocarcinoma, which accounts for more than 50% of non-small-cell lung cancers (NSCLC), is the most frequent type and is increasing. Lung adenocarcinoma has a heterogeneous nature in various aspects, including clinicopathologic features (3). Recent molecular studies have revealed at least 3 major molecular pathways for the development of lung adenocarcinoma (4–8). A considerable fraction (30%–60%) of lung adenocarcinomas develops through acquisition of mutations either in the EGFR, KRAS, or ALK genes in a mutually exclusive manner, and the remaining lung adenocarcinomas, that is, those without EGFR, KRAS, and ALK mutations (herein designated “triple-negative adenocarcinomas”), develop with mutations of several other genes. HER2, BRAF, etc. are known to be mutated also mutually exclusively with the EGFR, KRAS, and ALK genes; however, frequencies of their mutations are very low (<5%; refs. 4–7). Therefore, genes responsible for the development of triple-negative adenocarcinomas are largely unknown.
Mutations in the EGFR gene are prevalent in females and never-smokers, and the frequencies are considerably higher in Asians (40%–60%) than in Europeans/Americans (∼10%; refs. 5–7, 9). EGFR mutations make tumor cells dependent on epidermal growth factor receptor (EGFR) signaling and define patients who respond to EGFR tyrosine kinase inhibitors (TKI), such as gefitinib (10, 11). On the other hand, mutations in the KRAS gene occur predominantly in males and ever-smokers, and their frequencies are higher in Europeans/Americans (>15%) than in Asians (10%; ref. 9). Specific inhibitors against KRAS activity are being developed (12). Therefore, clinicopathologic features of lung adenocarcinomas with EGFR mutations (herein designated “EGFR-positive adenocarcinomas”) and those with KRAS mutations (herein designated “KRAS-positive adenocarcinomas”) are considerably different from each other. Recently, a small subset of EGFR- and KRAS-negative lung adenocarcinomas (∼5%) was shown to have rearrangements of the ALK gene generating gene fusion transcripts (13), and patients with ALK rearrangements tend to be younger and have little or no smoking histories (4, 6–8). Because lung adenocarcinoma cells with ALK rearrangements (herein designated “ALK-positive adenocarcinomas”) are specifically sensitive to ALK TKIs, ALK-positive adenocarcinomas have been recently considered to be another subset of adenocarcinomas by considering the differences in therapeutic targets (4, 6–8). In contrast, clinicopathologic features of triple-negative lung adenocarcinomas have not been precisely characterized because of the lack of sufficient genetic information in these adenocarcinomas.
There have been several studies which attempted to characterize gene expression profiles in particular types of lung adenocarcinoma, including EGFR-positive and KRAS-positive adenocarcinomas (14–17). However, such information is limited for ALK-positive adenocarcinomas and triple-negative adenocarcinomas. Therefore, in this study, we aimed to elucidate clinicopathologic features and gene expression profiles of ALK-positive adenocarcinomas and triple-negative adenocarcinomas in comparison with those of EGFR-positive adenocarcinomas and KRAS-positive adenocarcinomas. We conducted a genome-wide gene expression profiling of 226 lung adenocarcinomas, consisting of 127 EGFR-positive adenocarcinomas, 20 KRAS-positive adenocarcinomas, 11 ALK-positive adenocarcinomas, and 68 triple-negative adenocarcinomas. To identify genes useful for molecular diagnosis and applicable to targeted therapy of ALK-positive adenocarcinomas and triple-negative adenocarcinomas, we focused on genes that were upregulated in these adenocarcinomas by selecting genes with low expression in EGFR-positive and KRAS-positive adenocarcinomas. Several genes were identified as being specifically and significantly upregulated in ALK-positive adenocarcinomas. In particular, the ALK gene itself was highly expressed exclusively in ALK-positive adenocarcinomas. More importantly, a distinct group of triple-negative adenocarcinomas with unfavorable outcome was identified. This group of triple-negative adenocarcinomas showed much worse prognosis than the other group of triple-negative adenocarcinomas, EGFR-positive adenocarcinomas, KRAS-positive adenocarcinomas, and ALK-positive adenocarcinomas. Several genes were identified as being upregulated and critical for predicting prognosis of patients in this group of adenocarcinomas.
Materials and Methods
Patients
The tumors were pathologically classified according to the TNM classification of malignant tumors (18). A total of 226 lung adenocarcinoma cases subjected to expression profiling were selected from 393 stage I–II cases who underwent potentially curative resection between 1998 and 2008 at the National Cancer Center Hospital as follows (ref. 19; Supplementary Fig. S1). Among the 393 cases, 363 cases, consisting of 305 stage I and 58 stage II cases, were eligible because they had not received any neoadjuvant therapy before surgery and had not been diagnosed with cancer in the 5 years before lung adenocarcinoma diagnosis. All 58 stage II cases were subjected to expression profiling. The 305 stage I cases included 37 cases with relapse and 268 cases without relapse. To improve statistical efficiency, all 37 relapsed cases and 131 matched unrelapsed cases selected by the incidence density sampling method (20, 21) were subjected to expression profiling. In total, 226 cases, consisting of 168 stage I and 58 stage II cases, were subjected to expression profiling. Among the 226 cases, 204 who received complete resection (i.e., free resection margins and no involvement of mediastinal lymph nodes examined by mediastinal dissection) and did not receive postoperative chemotherapy and/or radiotherapy, unless relapsed, were subjected to survival analyses. This study was approved by the Institutional Review Boards of the National Cancer Center.
Microarray experiments and data processing
Total RNA was extracted using TRIzol reagent (Invitrogen), purified by an RNeasy kit (Qiagen), and qualified with a model 2100 Bioanalyzer (Agilent). All samples showed RNA Integrity Numbers more than 6.0 and were subjected to microarray experiments. Two micrograms of total RNA were labeled using a 5X MEGAscript T7 Kit (Ambion) and analyzed by Affymetrix U133Plus2.0 arrays. The data were processed by the MAS5 algorithm, and the mean expression level of a total of 54,675 probes was adjusted to 1,000 for each sample. Microarray data are available at National Center for Biotechnology Information Gene Expression Omnibus (GSE31210).
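The per-sample adjustment described above can be illustrated with a minimal sketch. It assumes a plain arithmetic mean over all probes (MAS5 implementations often use a trimmed mean instead), and the function name and toy values are hypothetical rather than taken from the study.

```python
def scale_to_target_mean(signals, target=1000.0):
    """Rescale one sample's probe signals so that their mean equals `target`.

    Assumption: a plain arithmetic mean over all probes is used.
    """
    mean_signal = sum(signals) / len(signals)
    factor = target / mean_signal
    return [value * factor for value in signals]


# Toy sample with four probes (illustrative values only, not study data).
sample = [100.0, 300.0, 500.0, 1100.0]
scaled = scale_to_target_mean(sample)
```

After this step, a signal of 1,000 in any sample corresponds to the average probe signal of that sample, which is what makes a fixed threshold of 1,000 comparable across arrays.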
Probe selection for unsupervised clustering
One hundred and seventy-four genes (190 probes), preferentially expressed in ALK-positive and triple-negative adenocarcinomas, were selected by the following criteria: probes whose expression levels were less than 1,000 (the adjusted mean expression level) in all adenocarcinomas with EGFR or KRAS mutations, and probes whose averaged expression levels in ALK-positive and triple-negative adenocarcinomas were more than 1.5-fold higher than those in EGFR-positive and KRAS-positive adenocarcinomas, with P values less than 0.05 by t test. Expression levels for these 190 probes were log-transformed and median-centered, both for probes and samples, and were subjected to unsupervised hierarchical clustering. The clustering was done by the centroid linkage method using the Cluster 3.0 program, and the results were visualized using the Java Treeview program (22).
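The selection criteria can be expressed as a per-probe filter. The sketch below is illustrative only: it assumes the fold difference and the test are computed on the scaled, untransformed signals, and it approximates the t test with a Welch statistic referred to a normal distribution, which is reasonable only for large groups such as the 147 and 79 cases compared here. The function names and toy values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_p_value(a, b):
    """Two-sided P value for a difference in means.

    Assumption: Welch statistic with a normal approximation, not an exact t test.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_selected(target_values, reference_values,
                ceiling=1000.0, min_fold=1.5, alpha=0.05):
    """Apply the three stated criteria to a single probe.

    target_values    -- signals in ALK-positive and triple-negative cases
    reference_values -- signals in EGFR-positive and KRAS-positive cases
    """
    low_in_reference = max(reference_values) < ceiling   # below 1,000 in every reference case
    fold = mean(target_values) / mean(reference_values)  # ratio of group means
    p_value = welch_p_value(target_values, reference_values)
    return low_in_reference and fold > min_fold and p_value < alpha


# Toy probe (illustrative values only, not study data).
target = [900.0, 1500.0, 2100.0, 1200.0, 1800.0]
reference = [400.0, 500.0, 600.0, 450.0, 550.0]
selected = is_selected(target, reference)
```

A probe fails the filter as soon as a single EGFR-positive or KRAS-positive case reaches 1,000, which is why the resulting set is restricted to genes that are consistently low in those two genotypes.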
Mutation analyses
Genomic DNAs from all 226 lung adenocarcinomas were analyzed for EGFR and KRAS mutations by the high-resolution melting method as described (23, 24). Total RNAs from the 226 adenocarcinomas were examined for expression of fusion transcripts between ALK and EML4 or KIF5B using a multiplex reverse transcription PCR (RT-PCR) method (25).
Statistics
Cumulative survival was estimated by the Kaplan–Meier method, and differences in survival between 2 groups were analyzed by the log-rank test. Influences of variables on relapse-free survival (RFS) and overall survival (OS) were evaluated by uni- and multivariate analyses using the Cox proportional hazards model. For all analyses, smoking status was dichotomized as never-smokers (0 pack years) and ever-smokers (>0 pack years). Pathologic TNM staging was categorized as stage I versus stage II. For multivariate analysis, all variables that were moderately associated (P < 0.1) with RFS or OS in any of the analyses were included.
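For readers less familiar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. This is a generic illustration with hypothetical names and toy follow-up data, not the software used in this study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    times  -- follow-up time for each patient
    events -- 1 if the event (relapse or death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events observed exactly at time t.
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        # Everyone whose follow-up ends at t (events and censored) leaves the risk set.
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data (illustrative values only, not study data).
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the number at risk until their last follow-up and then drop out without lowering the survival estimate, which is how the method accommodates unequal follow-up.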
Bioinformatics
Associations of gene expression levels with prognosis of NSCLC patients in 7 other expression profile studies were obtained from the PrognoScan database (26). In the PrognoScan database, association of gene expression with survival of patients was evaluated by the minimum P value approach. Briefly, patients were first ranked by expression level of a given gene. They were then divided into high- and low-expression groups at every possible cutoff point, and the difference in risk between the 2 groups was assessed by the log-rank test at each cutoff. Finally, the cutoff point that gave the lowest P value was selected.
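The minimum P value approach can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the PrognoScan code: the function names are hypothetical, ties in expression are split arbitrarily, and the multiple-testing correction of the minimum P value described in ref. 26 is omitted.

```python
from math import erfc, sqrt


def logrank_p(times, events, in_group):
    """Two-sided log-rank P value comparing patients flagged in `in_group` with the rest."""
    observed = expected = var = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        n = sum(1 for ti in times if ti >= t)                           # at risk overall
        n1 = sum(1 for ti, g in zip(times, in_group) if ti >= t and g)  # at risk in flagged group
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei)   # events at t
        d1 = sum(1 for ti, ei, g in zip(times, events, in_group)
                 if ti == t and ei and g)                               # events at t in flagged group
        observed += d1
        expected += d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / var
    return erfc(sqrt(chi2 / 2))  # chi-square upper tail for 1 degree of freedom


def minimum_p_cutoff(expression, times, events):
    """Try every split of the expression-ranked patients.

    Returns (lowest P value, lowest expression value in the high group at that split).
    """
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    best = (1.0, None)
    for k in range(1, len(order)):
        high = set(order[k:])
        flags = [i in high for i in range(len(expression))]
        p = logrank_p(times, events, flags)
        if p < best[0]:
            best = (p, expression[order[k]])
    return best


# Toy cohort in which high expression goes with early events (not study data).
toy_expression = [1, 2, 3, 4, 5, 6, 7, 8]
toy_times = [12, 11, 10, 9, 4, 3, 2, 1]
toy_events = [0, 0, 1, 0, 1, 1, 1, 1]
best_p, cutoff = minimum_p_cutoff(toy_expression, toy_times, toy_events)
```

Because the cutoff is chosen to minimize P, the uncorrected value is optimistic; this is the reason a correction such as the one in ref. 26 is needed before interpreting it.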
Results
EGFR/KRAS/ALK mutations and clinicopathologic characteristics of lung adenocarcinomas subjected to gene expression profiling
Among the 226 stage I and II lung adenocarcinomas, EGFR and KRAS mutations were detected in a mutually exclusive manner in 127 (56%) and 20 (9%) cases, respectively, and an EML4–ALK fusion gene was expressed in 11 (4.9%) cases (Table 1). EGFR or KRAS mutations were not detected in any of the 11 cases with EML4–ALK fusion expression; thus, the occurrence of ALK rearrangements in a mutually exclusive manner with EGFR and KRAS mutations in lung adenocarcinoma was confirmed. The incidence and the fraction of EGFR-, KRAS-, and ALK-positive cases in this study were consistent with those in previous studies (5–7, 9, 13). Accordingly, the remaining 68 (30%) cases were defined as “triple-negative adenocarcinomas” because of the absence of EGFR, KRAS, and ALK mutations. Clinicopathologic features of EGFR-positive adenocarcinomas and KRAS-positive adenocarcinomas in this study were consistent with those in previous studies of Japanese populations (27, 28). Patients with ALK-positive adenocarcinomas were younger and more likely to be never-smokers, as previously indicated (4, 6–8). Triple-negative adenocarcinomas showed clinicopathologic features similar to those of KRAS-positive adenocarcinomas, that is, a predominance of males, ever-smokers, and advanced stages.
Table 1. Clinicopathologic characteristics of 226 lung adenocarcinomas subjected to expression profile analysis
Expression profile unique to ALK-positive lung adenocarcinomas
All 226 cases were subjected to genome-wide expression profiling using Affymetrix U133Plus2.0 arrays. One hundred and seventy-four genes, evaluated with 190 probes (Supplementary Table S1), were selected as those preferentially expressed in either ALK-positive adenocarcinomas or triple-negative adenocarcinomas under the criteria described in Materials and Methods. In particular, 10 genes evaluated with 11 probes were markedly upregulated according to the criteria of fold-differences more than 2.0 with P values less than 0.05 (Supplementary Table S2). It was noted that 2 probes for the ALK gene were present among them, and 1 of them (probe ID = 208212_s_at) showed the highest fold-difference of 8.7 between ALK-positive/triple-negative adenocarcinomas and EGFR-positive/KRAS-positive adenocarcinomas among the 190 probes. This result indicated that there is a subset of adenocarcinomas in which ALK was overexpressed. Therefore, an unsupervised hierarchical clustering using these 190 probes was done on 11 ALK-positive adenocarcinomas and 68 triple-negative adenocarcinomas (Supplementary Figs. S1 and S2). There were 3 distinct sets of genes/probes, as indicated by red, yellow, and blue bars on the left of the heat map. Two probes for the ALK gene were present in the gene/probe set with a yellow bar, and 11 cases with extremely high levels of ALK expression comprised a small subcluster in the right side of cluster 1. All the 11 cases corresponded to the ones with EML4–ALK fusion gene expression.
The results strongly indicated that ALK-positive adenocarcinomas have distinct expression profiles in comparison with ALK-negative adenocarcinomas, including not only triple-negative adenocarcinomas but also EGFR-positive and KRAS-positive adenocarcinomas. Therefore, genes with fold-differences more than 2.0 and P values less than 0.05 in their expression between ALK-positive adenocarcinomas and ALK-negative adenocarcinomas were further selected from the 190 probes. Thirty genes with 32 probes were then selected (Table 2). The ALK gene showed the highest level of fold difference in ALK-positive adenocarcinomas. Therefore, as previously reported (29–31), ALK-positive adenocarcinomas express high levels of ALK gene products, supporting that upregulation of the ALK gene is a biological consequence of ALK rearrangements in lung adenocarcinoma cells. Expression profiling further revealed that various other genes are distinctly upregulated in ALK-positive adenocarcinomas. In particular, fold differences of GRIN2A (glutamate receptor, ionotropic, N-methyl d-aspartate 2A) expression were more than 10, as with ALK expression. Moreover, GRIN2A was branched most closely to ALK in the heat map (Supplementary Fig. S2). Therefore, high levels of GRIN2A expression can be a characteristic unique to ALK-positive adenocarcinomas, in addition to upregulation of the ALK gene itself. The levels of GRIN2A expression in ALK-positive adenocarcinomas were significantly higher than those in ALK-negative adenocarcinomas by quantitative RT-PCR analysis (Supplementary Fig. S3).
Table 2. Genes upregulated in ALK-positive lung adenocarcinomas
Triple-negative lung adenocarcinomas with poor prognosis identified by gene expression profiling
By the unsupervised hierarchical clustering, the 68 triple-negative adenocarcinomas were separated into 2 major groups, one containing 36 cases and the other 32 cases, designated as groups A and B, respectively (Fig. 1). Group A cases formed cluster 1 together with the 11 ALK-positive adenocarcinomas. Group A cases were predominantly males, ever-smokers, and advanced-stage cases, whereas group B cases were predominantly never-smokers and early-stage cases (Table 1), indicating that group A cases comprise an aggressive type of triple-negative adenocarcinoma. Therefore, we next compared RFS and OS among the 5 groups of patients: groups A and B, EGFR-positive cases, KRAS-positive cases, and ALK-positive cases (Fig. 2). Among the 226 cases, 204 cases that received complete resection and did not receive postoperative chemotherapy and/or radiotherapy were subjected to survival analysis. Group A cases (n = 32) showed the worst prognosis for both RFS and OS among the 5 groups (Fig. 2A and B). In particular, group A cases showed significantly worse prognosis (P < 0.05) for both RFS and OS than group B cases (n = 30) and EGFR-positive cases (n = 116) by the log-rank test. Such differences were marginally significant between group A cases and KRAS-positive cases (n = 19) and not significant between group A cases and ALK-positive cases (n = 7), probably because the numbers of KRAS-positive and ALK-positive cases were smaller than those of group B and EGFR-positive cases.
Figure 1. Unsupervised hierarchical clustering of 11 ALK-positive adenocarcinomas and 68 triple-negative adenocarcinomas. Triple-negative adenocarcinomas were separated into 36 group A cases and 32 group B cases, and group A cases form cluster 1 with the 11 ALK-positive adenocarcinoma cases. Clinical and genetic features are shown below the tree: sex (black, male; white, female); smoking status (black, ever-smoker; white, never-smoker); pathologic stage (black, stage II; gray, stage IB; white, stage IA); relapse (black, evidence of relapse; white, no evidence of relapse); ALK (yellow, ALK-fusion gene expression positive; white, negative). Three colored bars according to the main branches of probes/genes are shown on the left. Positions of probes for ALK, GRIN2A, and DEPDC1 are shown on the right. ADC, adenocarcinoma.
Figure 2. Kaplan–Meier survival curves for RFS and OS of 204 lung adenocarcinoma cases according to EGFR-positive, KRAS-positive, ALK-positive, group A, and group B. RFS and OS of stage I–II (A, B) and stage I (C, D) cases are shown.
Similar results were obtained from the analysis of 162 patients with stage I adenocarcinomas (Fig. 2C and D), indicating the independency of these associations with staging. Therefore, we next carried out multivariate analyses on RFS and OS of these 5 groups (Table 3). In the analysis of 204 stages I and II patients, RFS and OS of group A cases were significantly worse than those of EGFR-positive and group B cases, and the differences were independent of staging. HRs of ALK-positive and KRAS-positive cases were also as high as EGFR-positive and group B cases, although only the difference in RFS was statistically significant between group A cases and KRAS-positive cases. This could be also due to the small numbers of KRAS-positive and ALK-positive cases. Accordingly, multivariate analyses of 162 stage I patients further showed significant differences in RFS and OS between group A cases and EGFR-positive cases, and also between group A cases and group B cases. Because numbers of KRAS-positive cases and ALK-positive cases were small, we next compared RFS and OS between group A patients and patients in all 4 other groups combined (“Others” in Table 3). Differences in RFS as well as those in OS were highly significant and independent of staging. These results strongly indicated that group A patients comprise a distinct subclass of EGFR/KRAS/ALK-negative lung adenocarcinomas, and the prognoses of group A patients were the worst among the 5 groups of patients.
Table 3. Hazard ratios for relapse-free and overall survival in lung adenocarcinomas
Clustering of lung adenocarcinomas with poor prognosis by gene expression profiling
We next carried out unsupervised hierarchical clustering of all 226 adenocarcinoma cases, including 127 EGFR-positive cases and 20 KRAS-positive cases, to investigate whether expression profiling with the set of 174 genes (190 probes) could extract group A cases as a unique subset among all adenocarcinomas, and whether the profiling could be useful for prognosis prediction of patients with adenocarcinomas of any genotype. As shown in Supplementary Fig. S4, clustering patterns of all 226 patients were very similar to those of the 79 patients consisting of 11 ALK-positive cases and 68 triple-negative cases. In particular, the 11 ALK-positive cases comprised a small cluster in the right side of Cluster 1 (Cluster 1b), supporting that ALK-positive adenocarcinomas show unique expression profiles among all adenocarcinomas. Group A and group B cases also tended to accumulate in Cluster 1a and Cluster 2, respectively. However, group A cases often clustered with the KRAS-positive cases, whereas group B cases were distributed among the EGFR-positive cases. Therefore, group A and group B triple-negative adenocarcinomas were not separated from the EGFR-positive and KRAS-positive adenocarcinomas by expression profiling of these 174 genes. Thus, expression profiling with the set of 174 genes was concluded to be useful for distinguishing ALK-positive adenocarcinomas among all lung adenocarcinomas.
However, RFS of the 119 patients in Cluster 1 was significantly worse than RFS of the 85 patients in Cluster 2 (HR = 3.73, P = 0.00016). When Cluster 1 was further divided into 2 subclasses, 1a and 1b, on the left and right sides, respectively, Cluster 1a, which contained most of the group A patients, showed the worst prognosis among the 3 clusters (1a, 1b, and 2; Supplementary Fig. S4). Therefore, the expression signature of these 174 genes was indicated to be useful for prognostic prediction of adenocarcinoma patients, in particular of triple-negative adenocarcinoma patients.
Minimum set of genes characterizing triple-negative lung adenocarcinomas with poor prognosis
The above results implied that triple-negative adenocarcinomas can be classified into 2 distinct subgroups by expression profiling and that the prognoses of these 2 groups are significantly different from each other. Accordingly, expression of several genes among the 174 genes was expected to be independently associated with prognosis of triple-negative adenocarcinoma patients. Therefore, we next selected genes whose expression was associated with prognosis from the 174 genes evaluated by the 190 probes. To evaluate the prognostic value of each probe and to allow comparison with the association of gene expression with prognosis in other cohorts, we took a minimum P value approach for grouping the patients for survival analysis, for the following reason. A database named PrognoScan was recently developed by coauthors of this study (26). In the PrognoScan database, minimum P values for the association of gene expression with prognosis are available for all probes in a platform for a number of published cohorts. Therefore, it was possible to validate the present findings using data from various other cohorts by the same criteria. According to the method described previously (26), corrected minimum P values were calculated for each probe to control the error rate for the evaluation of the association with RFS and OS. Expression of 11 genes evaluated with 12 probes (2 probes for the DEPDC1 gene) showed significant associations with both RFS and OS in 62 triple-negative adenocarcinomas and also in 46 stage I triple-negative adenocarcinomas (Table 4). Among the 11 genes, expression of 10 genes was positively correlated with poor prognosis, whereas expression of the remaining gene, KIF19, was negatively correlated with poor prognosis.
Table 4. List of genes whose expression is associated with relapse-free survival and overall survival of patients with lung adenocarcinoma
We first selected 174 genes as being preferentially expressed in either ALK-positive adenocarcinomas or triple-negative adenocarcinomas by the criteria of “probes whose expression levels in any adenocarcinomas with EGFR or KRAS mutations were lower than the mean expression level of a total of 54,675 probes.” Then, 11 of the 174 genes were further selected as being associated with prognosis of patients with triple-negative adenocarcinomas. Therefore, higher expression of several genes among the 11 genes was predicted to be associated with poorer prognosis, even when all adenocarcinoma cases, including EGFR-positive, KRAS-positive, and ALK-positive adenocarcinomas were analyzed together. Furthermore, triple-negative adenocarcinomas with poor prognosis would be separated into a high-risk group classified with this procedure. For this reason, we next analyzed all 204 adenocarcinoma cases. Among the 11 genes with 12 probes, 9 genes with 10 probes showed significant associations with both RFS and OS in all 204 adenocarcinoma cases and also in 162 stage I adenocarcinoma cases. LOC152225 and KIF19 were excluded because of no significant associations in stage I adenocarcinoma cases. As predicted, higher expression of the 9 genes was correlated with poorer prognosis in the analysis of RFS and OS among 204 stages I and II cases and also among 162 stage I cases.
The result strongly indicated that unsupervised hierarchical clustering using this 10-probe set (9 genes) would separate the patients into high-risk and low-risk groups for prognosis and that all group A triple-negative adenocarcinoma patients with poor prognosis would be classified into the high-risk group (Fig. 3 and Supplementary Table S3). As expected, expression profiling of these 9 genes successfully separated the 204 patients into high-risk and low-risk groups with significantly different RFS (HR = 3.79, 95% CI = 2.19–6.55, P = 1.9E-06) as well as OS (HR = 5.72, 95% CI = 2.53–12.87, P = 2.5E-05). Furthermore, when only the 62 triple-negative cases were separated with these 9 genes, HRs for both RFS and OS were much higher than those obtained with separation of all 204 cases. All the relapsed cases in group A were separated into the high-risk group in both analyses (all 204 cases and the 62 triple-negative cases only), supporting that triple-negative adenocarcinoma cases with poor prognosis can be selected as a high-risk group from all the adenocarcinoma cases by expression profiling of these 9 genes (Fig. 3). This profiling further separated 162 stage I cases as well as 46 stage I triple-negative adenocarcinoma cases into high-risk and low-risk groups with significantly different RFS as well as OS (Supplementary Fig. S5 and Supplementary Table S3). Again, HRs for both RFS and OS were much higher in triple-negative adenocarcinoma cases than in all adenocarcinoma cases. Accordingly, high levels of expression of these 9 genes were concluded to be distinct characteristics of triple-negative adenocarcinomas with poor prognosis.
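As a reading aid, the relation between a hazard ratio, its 95% CI, and a two-sided P value can be checked with a short calculation. The sketch assumes a Wald test on the log hazard ratio with a symmetric CI on the log scale; the text does not state which test produced the reported P values, and the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def wald_p_from_ci(hr, lower, upper, level=0.95):
    """Two-sided P value implied by a hazard ratio and its confidence interval.

    Assumption: Wald test, with the CI symmetric around log(HR).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% CI
    se = (log(upper) - log(lower)) / (2 * z_crit)    # standard error of log(HR)
    z = log(hr) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_rfs = wald_p_from_ci(3.79, 2.19, 6.55)    # RFS values reported above
p_os = wald_p_from_ci(5.72, 2.53, 12.87)    # OS values reported above
```

Under this assumption the implied P values are on the order of 10^-6 for RFS and 10^-5 for OS, in line with the values reported above.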
Figure 3. Unsupervised hierarchical clustering based on the expression of a set of 9 genes. All 204 stage I–II adenocarcinomas and 62 triple-negative (TN) stage I–II adenocarcinomas of the National Cancer Center (NCC) data set subjected to survival analysis were analyzed, and a cluster with higher expression of these genes than the other cluster was recognized as a high-risk group (red bar). Results of 117 adenocarcinomas, including 57 double-negative (DN) adenocarcinomas, of the Aichi Cancer Center (ACC) data set are shown below.
Validation of associations using independent expression profiling data
To validate the present findings using the data of other cohorts, we searched for expression profiling data with mutation data of the EGFR, KRAS, and ALK genes in various databases. However, there has been no cohort in which expression profiles specifically in triple-negative adenocarcinomas were analyzed. Therefore, unsupervised hierarchical clustering using these 9 genes was done on a cohort of 117 Japanese lung adenocarcinoma cases because expression profile data as well as EGFR/KRAS mutation data were available only in this cohort (32). This study included 57 adenocarcinoma cases without EGFR and KRAS mutations. Although a different array platform was used, the data for all the 9 genes were available for clustering. These cases were separated into 2 groups of 33 cases and 24 cases (Fig. 3). OS of the 33 cases was significantly shorter than that of the 24 cases (HR = 3.17, 95% CI = 1.17–8.63, P = 2.4E-02; Supplementary Table S3). As with our cohort, the high-risk group showed a significantly higher HR of 2.73, even when all the 117 cases were analyzed together. Although ALK mutation data were not available for this cohort, the results strongly supported that expression profiling of the 9 genes would be highly informative for prediction of prognosis of lung adenocarcinoma patients, in particular patients with EGFR- and KRAS-negative adenocarcinomas.
Associations of DEPDC1 expression with prognosis of NSCLC patients
Associations of gene expression with prognosis in various cancers are available from the PrognoScan database (26). Therefore, associations of expression of these 9 genes with prognosis of NSCLC patients were examined in 7 other cohorts (Table 4). Notably, DEPDC1 expression was positively associated with poor prognosis in 4 of the 7 cohorts: MSK, Nagoya, Duke, and Seoul. The results strongly indicated that DEPDC1 expression can be a novel prognostic marker for patients with NSCLC. Representative data showing the association of DEPDC1 expression with prognosis in 204 adenocarcinoma patients obtained from the minimum P value approach are shown in Supplementary Fig. S6. Associations of DEPDC1 expression with RFS and OS were validated by quantitative RT-PCR analysis of 204 stage I and II cases and also of 162 stage I cases (Supplementary Fig. S3).
FOSL2 expression was associated with prognosis in 3 of the 7 cohorts, whereas MCM4, CD300A, and UBE2S expression was each associated with prognosis in 1 cohort (Table 4).
Discussion
In this study, we attempted to characterize ALK-positive adenocarcinomas and triple-negative adenocarcinomas by genome-wide expression profiling. For this purpose, we selected a set of genes that are not transcriptionally activated in any EGFR-positive and KRAS-positive adenocarcinomas, and obtained 2 pieces of unique evidence. One is that ALK-positive adenocarcinomas show unique expression profiles in comparison with any other types of adenocarcinomas. The other is that there is a group of patients with extremely poor prognosis among triple-negative adenocarcinomas. This group, herein designated as group A, of patients showed much worse prognoses than patients with EGFR, KRAS, or ALK mutations and also than the other group, group B, of patients with triple-negative adenocarcinomas.
ALK-positive adenocarcinomas are sensitive to ALK TKIs with an overall response rate of 55% (8). Therefore, for the clinical application of ALK-targeted therapy, it is indispensable to develop a simple and reliable method for detection of ALK rearrangements in lung adenocarcinomas. Here, we showed that ALK expression is exclusively high only in ALK-positive adenocarcinomas and that several other genes, including GRIN2A, are overexpressed together with ALK specifically in ALK-positive adenocarcinomas. Therefore, GRIN2A can be a biomarker for detection of ALK-positive adenocarcinomas. GRIN2A encodes an N-methyl-d-aspartate (NMDA) receptor, which is a neurotransmitter-gated ion channel involved in regulation of synaptic function in the central nervous system (33). It was noted that the GRIN2A gene was recently reported to be frequently mutated in melanoma (34). Therefore, although the biological significance of GRIN2A upregulation in ALK-positive adenocarcinomas remains unclear, GRIN2A expression may play some important role in the phenotype unique to ALK-positive adenocarcinomas. Expression profiles unique to ALK-positive adenocarcinomas, shown here, will be also informative to improve clinical detection of ALK rearrangements.
Group A cases were discriminated by expression profiling of 9 genes among stage I–II cases who received complete surgical resection of tumors. Therefore, this gene set will be applicable as biomarkers to select lung adenocarcinoma patients who will benefit from adjuvant therapy after surgery, in particular to select them among patients with triple-negative adenocarcinomas. For this reason, combined analyses of this expression profiling with mutational analyses of the EGFR, KRAS, and ALK genes will be appropriate to pick out triple-negative adenocarcinoma patients with poor prognosis from all the adenocarcinoma patients. Molecular targeting drugs against triple-negative adenocarcinomas are not available at present; therefore, genes upregulated in group A cases will also be applicable as targets for therapy. DEPDC1 was previously identified as being upregulated in bladder cancer and breast cancer (35–37). Because DEPDC1 expression was hardly detectable in any normal tissues except testis, it has been considered as a cancer/testis antigen and also as a promising target of therapeutic drugs (35, 36). This study showed that DEPDC1 is preferentially expressed in triple-negative adenocarcinomas with poor prognosis. In the PrognoScan database, DEPDC1 expression is shown to be positively associated with poor prognosis in bladder cancer, multiple myeloma, breast cancer, glioma, and melanoma. Therefore, DEPDC1 could be a novel target for diagnosis as well as therapy in various cancers, including lung adenocarcinoma.
Identification of genetic alterations that occur specifically in group A cases will also be of great importance for the development of targeted therapy for stage I and II lung adenocarcinoma patients with poor outcomes. Males and ever-smokers make up the majority of group A cases (Table 1); therefore, group A cases are likely to carry several genetic alterations induced by tobacco carcinogens that lead to poor outcomes. Identification of genetic alterations in group A adenocarcinomas will further facilitate the development of targeted therapies for lung adenocarcinomas with poor prognosis.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support
This work was supported in part by grants-in-aid from the Ministry of Health, Labor and Welfare for the 3rd-term Comprehensive 10-year Strategy for Cancer Control and from the Program for Promotion of Fundamental Studies in Health Sciences of the National Institute of Biomedical Innovation (NIBIO: 10–41). K. Shiraishi was an awardee of a Research Resident Fellowship from the Foundation for Promotion of Cancer Research in Japan.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Acknowledgments
The authors thank Dr. Teruhiko Yoshida and Ms. Sachiyo Mimaki for their efforts in expression profiling.
Footnotes
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
- Received April 23, 2011.
- Revision received October 19, 2011.
- Accepted October 21, 2011.
- ©2011 American Association for Cancer Research.