| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Molecular Biology, Pathobiology, and Genetics |
1 Translational Medicine and Genetics, GlaxoSmithKline and 2 Abramson Cancer Center, University of Pennsylvania, Philadelphia, Pennsylvania
Requests for reprints: Tal Z. Zaks, GlaxoSmithKline, 1250 South Collegeville Road, UP 4430, Collegeville, PA 19426. Phone: 610-917-5124; Fax: 610-917-4830; E-mail: tal.z.zaks{at}gsk.com.
| Abstract |
|---|
| Introduction |
|---|
Since they were first established, cell lines have been used to assay for drug sensitivity. Monolayers are generally poor at reflecting the in vivo sensitivity of the parent histology to classic chemotherapeutics (1), although a better correlation can be found with three-dimensional culture methods (2). In more recent years, novel drugs have been developed against specific oncogenic aberrations that are vital to the transformed phenotype. These include trastuzumab and lapatinib, which both target the ERBB2 subchromosomal amplification in breast cancer; imatinib, which targets the BCR-ABL translocation in chronic myelogenous leukemia as well as activating mutations in KIT; and gefitinib, which is preferentially active against tumors harboring mutant EGFR. In each of these cases, in vitro drug sensitivity closely parallels the in vivo situation; that is, sensitivity depends on the presence or absence of the relevant genetic aberration. Moreover, genetically determined in vivo secondary drug resistance arising from additional mutations [e.g., secondary KIT mutations associated with acquired imatinib resistance (3) or EGFR mutations with acquired gefitinib resistance (4, 5)] can be accurately modeled in cell lines. Thus, in these cases, cell lines that share the relevant genotype of the parent tumor accurately model the drug sensitivity phenotype.
With the advent of genome-wide technologies to assay nucleic acids and proteins, a more complete molecular phenotype of cancer has been emerging. Transcriptional profiling, for example, has led to a new understanding of distinct classes of breast cancer (as well as other histologies) that have significant prognostic and therapeutic implications (reviewed in ref. 6). While there have been some studies showing a transcriptional profiling similarity between cell lines and their histology of origin (most notably using the well-characterized NCI-60 panel; ref. 7), the overall concordance is not particularly high and these cell lines seem to lose the tissue-specific up-regulation of genes (8). This difference is not surprising given that transcriptional regulation of many genes is an immediate function of the cellular environment, which is markedly different in vitro than in vivo.
The study of subchromosomal copy number alterations, as in the case of ERBB2 amplification, has revealed mechanisms for tumorigenesis and progression. Array-based comparative genomic hybridization (aCGH) technology in the study of cancers has greatly expanded in recent years, furthered by assays with widespread availability (9, 10) and increasing resolution (11). aCGH-based studies of cancers have primarily used survey style approaches in which tissues recognized as being members of a histologically homogeneous panel are interrogated for recurring copy number alterations. However, a comprehensive knowledge of copy number alterations within and across histologies is far from complete. Although cell lines are commonly used in large-scale aCGH studies, they are normally separated from fresh-frozen tissue panels in subsequent analyses due to their suspected discordance with tumor populations (12). This is likely a result of a number of factors. Most notably, the cell line immortalization process has been implicated as a source of cytogenetic changes (13, 14). In addition, multiple growth passages, to which commercially available cell lines are routinely subjected, have been shown to be associated with random genomic instability (15). Finally, past studies have noted differences in gene expression patterns between cell lines and their fresh-frozen tissue counterparts (16–18). Despite these observations, more recent analyses of genetic aberrations from larger panels of cell lines and tumors indicate a close concordance of genetic changes within individual histologies (e.g., breast cancer; ref. 19).
We undertook the current analysis to evaluate the degree to which cell lines display an accurate genome-wide model of the DNA copy number aberrations found in human cancers. Specifically, we asked: (a) Do common aberrations occur in the same genomic regions in cell lines as in their primary tumor counterparts? (b) Are sporadic aberrations resulting from cell line immortalization and growth passages recurrent and predictable, which would make it easier to use cell line panels to model in vivo tumors? To answer these questions, we compiled and compared 19 aCGH data sets from 7 different cancer types with the purpose of estimating the similarity between cell line–tumor groups at the data set level. This required focusing on trends in genome-wide gain and loss frequencies between data sets stratified by histology and DNA origin (i.e., tumor or cell line). We report that, when taken as a group, (a) copy number aberrations in cell lines from a given histology reflect their cell of origin and (b) specific genomic regions are prone to more frequent gain or loss in cell lines as compared with those seen in vivo. Finally, we begin to identify the patterns of common alterations observed in both tumor and cell line populations that can serve as cancer type delineators.
| Materials and Methods |
|---|
1-Mb intervals across the entire human genome. Clones were mapped to the human genome build 34 (June 2003) using BAC end sequences (69%), a sequence tag site (28%), or a full clone sequence (3%). Array details and hybridization protocols are described in ref. 10. For quality control purposes, low-intensity and variable spots were removed from the set before averaging the Cy3/Cy5 ratios for all replicates. aCGH data were obtained from public data resources for fresh tumors of five cancer types (n = 445) as well as cancer cell line data for three cancer types (n = 112). For analyses, all samples were organized into separate aCGH data sets that were stratified by (a) cancer type, (b) fresh tumor versus cell line–derived DNA, and (c) data source (University of Pennsylvania generated data or public data). The resulting final data set consisted of 872 distinct samples from 19 separate data sets representing 7 different cancer types (Table 1). Notably, cell lines in common between two separate lung cancer data sets (20, 21) were removed from one set such that each cell line was represented in just one set.
For genome-wide copy number alteration frequency analysis, segmentation output was assigned to 1-Mb bins across the entire genome, where the log2-based metric assigned by the segmentation algorithm represented the relative copy number status for each bin in each sample. This procedure attained a common data format for all data sets with no probe dependence and allowed direct comparisons between data that were drawn from different assays with different probes. All raw and processed data are available in MIAME-compliant format (23; see footnote 3). Categorization of low-level copy number gains (≤5 copies) and heterozygous losses for each bin was made by applying log2 thresholds of >0.25 and <−0.25, respectively. Similarly, high-level amplifications (>5 copies) and homozygous deletions were identified with threshold scores of 0.81 and −1.0, respectively. For published studies, it was confirmed that our calculated genome-wide aberration frequencies reflected those measured by the study from which the data were obtained. Due to sex mismatching between test and reference in some samples, chromosomes X and Y were excluded from all analyses.
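The binning and thresholding steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names `bin_segments` and `classify`, the "largest overlap wins" rule for assigning a segment value to a bin, and the default of 0.0 for uncovered bins are all assumptions. Only the 1-Mb bin size and the log2 thresholds (0.25, −0.25, 0.81, −1.0) come from the text.

```python
from typing import List, Tuple

BIN_SIZE = 1_000_000        # 1-Mb bins, as stated in the text
GAIN, LOSS = 0.25, -0.25    # low-level gain / heterozygous loss thresholds
AMP, HOMDEL = 0.81, -1.0    # high-level amplification / homozygous deletion

# (start_bp, end_bp, log2_ratio); coordinates are half-open (hypothetical layout)
Segment = Tuple[int, int, float]


def bin_segments(segments: List[Segment], chrom_length: int) -> List[float]:
    """Give each 1-Mb bin the log2 ratio of the segment that overlaps it most.

    Assumption: the overlap rule and the 0.0 default for uncovered bins are
    illustrative choices, not taken from the paper.
    """
    n_bins = -(-chrom_length // BIN_SIZE)  # ceiling division
    values = []
    for b in range(n_bins):
        lo = b * BIN_SIZE
        hi = min(lo + BIN_SIZE, chrom_length)
        best_overlap, best_value = 0, 0.0
        for start, end, ratio in segments:
            overlap = min(hi, end) - max(lo, start)
            if overlap > best_overlap:
                best_overlap, best_value = overlap, ratio
        values.append(best_value)
    return values


def classify(log2_ratio: float) -> str:
    """Map a per-bin log2 ratio to a copy number category (strict inequalities)."""
    if log2_ratio > AMP:
        return "amplification"
    if log2_ratio > GAIN:
        return "gain"
    if log2_ratio < HOMDEL:
        return "homozygous deletion"
    if log2_ratio < LOSS:
        return "loss"
    return "neutral"
```

For example, `[classify(v) for v in bin_segments(segments, chrom_length)]` yields one category per bin, which gives every sample the same probe-independent representation regardless of the array platform it came from.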
Data analysis. Estimations of the fraction of the genome experiencing either DNA copy number gain or loss in each sample were made by compiling the total number of segments classified as gained or lost in each sample based on the threshold scores. Aberration rates across each data set were collated by determining the frequency of change for each 1-Mb bin. The gain and loss frequencies assigned to each bin were used to measure similarities in aberration trends between data sets. Specifically, using Pearson distance as a metric, unsupervised hierarchical clustering was performed, with distance scores calculated from the genome-wide aberration trends of each data set. Tumor- and cell line–specific aberrations were calculated by subtracting the mean cell line aberration rate of each bin from that of the tumors of a given histology. As a means of determining the specific alterations driving the similarities between data sets, all bins estimated to have a gain or loss frequency of >25% in any data set were mapped to a cytogenetic band and subjected to further clustering analysis. To analyze known cancer-related alterations, a set of 323 cancer genes (24) was mapped and assigned a copy number status based on the previously mentioned log2 thresholds.
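As a rough illustration of the data set-level comparison, the sketch below computes per-bin aberration frequencies, a Pearson distance between two frequency profiles, and a simple agglomerative clustering. Several details are assumptions rather than statements of the authors' method: Pearson distance is taken as 1 − r, average linkage is used because the linkage rule is not given in the text, and all function and data set names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def aberration_frequency(calls: List[List[int]]) -> List[float]:
    """calls[s][b] is 1 if sample s is gained (or lost) in bin b, else 0."""
    n_samples = len(calls)
    return [sum(column) / n_samples for column in zip(*calls)]


def pearson_distance(x: List[float], y: List[float]) -> float:
    """1 - Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles: Dict[str, List[float]]) -> List[Tuple[str, str]]:
    """Merge the two closest clusters repeatedly; return the merge order."""
    clusters = {name: [name] for name in profiles}
    merges = []

    def dist(a: str, b: str) -> float:
        # Mean pairwise distance between members of the two clusters
        pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
        total = sum(pearson_distance(profiles[p], profiles[q]) for p, q in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: dist(*ab))
        merges.append((a, b))
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return merges
```

Clustering on frequency profiles rather than on individual samples mirrors the data set-level framing of the analysis: each data set contributes a single genome-wide vector, so data sets of very different sizes can be compared directly.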
| Results |
|---|
σ = 14.4% and µ = 9.1%; σ = 11.6%, respectively), whereas sarcoma and ovary showed the smallest difference between cell lines and tumors across the entire genome (µ = 0.6%; σ = 7.5% and µ = 2.1%; σ = 8.5%, respectively). Specific regions that differed between tumors and cell lines were identified in each cancer type by querying for those with a mean difference in frequency that was 2 SD larger than the mean difference for that cancer type. This set of regions reflects those that are the most different between tumor and cell line data sets of matched histology. Several DNA copy number alterations seem to consistently occur at disproportionately higher frequencies in cell lines in at least three cancer groups (Supplementary Table S3). These include gains of large genomic regions such as 20q12-13.33 (~24 Mb) and 17q23.2-24.3 (~11 Mb), as well as more localized gains such as 5q35.1-35.3 (~5 Mb) and 11q13.2-13.4 (~6 Mb). Additionally, more frequent losses of 18q12.2-23 and 9p23-21.3 are seen in cell lines than in tumors (Supplementary Table S3).
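The 2 SD filter for discordant regions could be sketched as follows. This is a minimal illustration under stated assumptions: the per-bin difference is taken as tumor frequency minus cell line frequency (following the Methods), bins are flagged in either direction, and the population standard deviation is used. The paper does not specify these details, and the function name `discordant_bins` is hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def discordant_bins(
    tumor_freq: List[float],
    cell_line_freq: List[float],
    n_sd: float = 2.0,
) -> List[Tuple[int, float]]:
    """Return (bin index, difference) for bins that deviate from the mean
    tumor-minus-cell-line difference by more than n_sd standard deviations.

    Assumption: two-sided flagging with the population SD; the text only
    says "2 SD larger than the mean difference".
    """
    diffs = [t - c for t, c in zip(tumor_freq, cell_line_freq)]
    mu, sd = mean(diffs), pstdev(diffs)
    return [(i, d) for i, d in enumerate(diffs) if abs(d - mu) > n_sd * sd]
```

A negative difference in the output marks a bin that is altered more often in cell lines than in tumors, which is the direction reported for regions such as 20q12-13.33.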
Genomic rearrangements, including high-level DNA amplifications and homozygous deletions of known disease-related genes, are often defining features in cancer tissues. Consistent with genome-wide data, copy number gains of cancer genes (n = 323) are more frequent in cell lines than in tumors (P < 0.0001, t test). Similarly, cell lines exhibit more frequent high-level amplifications (log2 ratio, >0.81) of cancer-related genes than tumors (P < 0.0001, paired t test). Total homozygous losses follow a similar trend, with higher overall rates of occurrence in cell lines (P = 0.0139, paired t test). The disproportionately increased occurrence of amplification and homozygous deletion of several genes seems to be consistent with divergent frequencies of gain and loss between tumors and cell lines. For example, breast cell lines show a 27% (6 of 22) amplification rate of the SS18L1 locus (20q13.33), whereas only ~1% (1 of 90) of tumors are amplified at this locus (Table 2). This gene falls in a region that shows significantly higher overall gain frequencies in breast, melanoma, colon, and lung cell lines than in their respective tumors. Similarly, the frequency of homozygous loss (log2 ratio, <−1.0) of CDKN2A (9p21.3) seems to be higher in lung and melanoma cell lines than in tumors [7 of 40 (17.5%) cell lines versus 3 of 51 (5.9%) tumors and 5 of 42 (11.9%) cell lines versus 1 of 145 (0.8%) tumors, respectively]. These two histologies are two of four (breast and sarcoma are the others) that showed overall higher loss frequencies of this locus in cell lines. Interestingly, pancreatic samples also had a recurring loss of CDKN2A in cell lines whereas there were no occurrences of this aberration in tumors [3 of 24 cell lines (12.5%) versus 0 of 13 tumors (0%)]. Pancreatic tumors and cell lines showed no difference in overall loss frequency at this locus. Several other cancer-related genes also showed differences in amplification frequencies between data groups. Most notably, higher rates of cell line amplification of the MYC locus (8q24.21) appear in breast, ovary, lung, and colon cancers than in their respective tumor sets. A full list of cancer-related gene DNA copy number gains and losses can be seen in Supplementary Tables S4 and S5, respectively.
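For readers unfamiliar with the paired comparison used here, the statistic behind a paired t test can be sketched as below. The pairing unit (one cell line rate and one tumor rate per gene) is an assumption about how the test was set up, the function name is hypothetical, and the sketch stops at the t statistic because the Python standard library has no t distribution for the P value.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(
    cell_line_rates: Sequence[float],
    tumor_rates: Sequence[float],
) -> float:
    """t statistic for paired observations (e.g., one pair of rates per gene).

    Assumption: each position in the two sequences refers to the same gene.
    """
    if len(cell_line_rates) != len(tumor_rates):
        raise ValueError("paired test needs equal-length inputs")
    diffs = [c - t for c, t in zip(cell_line_rates, tumor_rates)]
    # Mean difference divided by its standard error (sample SD / sqrt(n))
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```

A positive statistic indicates higher rates in cell lines; converting it to a P value requires the t distribution with n − 1 degrees of freedom, which a statistics package would supply.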
| Discussion |
|---|
The general concordance of results from independent surveys of recurring copy number aberrations in cancers shows that traditional histology-based groupings are proper first-tier stratifiers. For example, two studies of colon cancers yield very similar trends in genome aberration frequencies despite their distinct tissue panels and array platforms (25, 26). Similarly, two separate surveys of non–small-cell lung cancer cell lines provided independent validation of important, previously described cytogenetic changes (20, 21).
Recent studies have used DNA copy number alterations for the molecular pathology of cancers. For example, common copy number gains of 1q have been described as distinctive of breast cancers within a pool of tumor types, whereas 13q gains are largely unique to colon cancers (27). Similar results were observed here, as these alterations were components driving tumor/cell line relationships (Fig. 2). Although few alterations were common (>25%) in only a single cancer type, gains of 8q24.21 (encompassing the MYC locus), gains of 20q13.31, and losses of 18q21.1 seem to occur frequently in this panel of cancers. Conversely, whereas gains of 14q13.1 are relatively common in lung and pancreatic cancers, they are largely absent in the others. Jointly, these and previous results (27) suggest that each histology broadly bears a range of unique DNA aberrations. Copy number alterations have also been used to discern several cancer subtypes. For example, more frequent gains of 11q and 17q have been seen in acral and mucosal melanomas compared with those originating from the skin (28). We would expect that cell line models should represent genotypic subtypes just as they do overall histology. For example, it has been shown that colon tumors exhibiting microsatellite instability (MSI+) harbor a different set of alterations than microsatellite-stable tumors (MSI−; ref. 26). Concordantly, when all colon cancers are stratified by MSI status and by tumor or cell line origin, the two MSI+ sample sets segregate from the MSI− cancers when subjected to the genome-wide clustering described previously (Supplementary Fig. S1).
Collectively, these results help quantify cancer cell lines as accurate, reflective models for investigating in vitro genomic alterations in human cancers. Cancer cell lines exist as appealing models for studying DNA copy number and, by extension, therapeutic response prediction for several reasons. First, cancer cell lines provide a more homogeneous cell population where cell-to-cell variation in copy number is thought to be reduced. Tumor heterogeneity in primary lesions can limit the ability to accurately describe copy number alterations due to the infusion of normal cells, causing a diluted signal (20), which is consistent with the observation that alterations occur at higher frequencies in cell lines than tumors. Most importantly, tumors cannot be analyzed for copy number alterations in vivo, whereas the use of cell lines for aCGH analysis opens the possibility of time course analysis (29) and drug treatments (30, 31). In parental cell lines and their derivatives, drug response can be engineered and studied in relationship to basal and evolved genomic changes (e.g., ref. 32). Our analyses substantiate the translation of these observations to primary tissues by suggesting that relatively large-scale copy number genetic aberrations seen in cell lines in vitro accurately reflect their parent histology. Specifically, these results support the notion that cell lines can serve as relevant in vitro models for developing specific therapies and imply that, by understanding the exact genetic determinants of a phenotype in a cell line, it is possible to accurately target similar genotypes in a patient population. In the case of well-known genetic aberrations, using cell lines as in vitro models for biomarker discovery and validation is already routinely done (i.e., modeling the effectiveness of kinase inhibitors on cells that have the relevant target amplification or deletion). 
More generally, the use of cell lines with defined genetic aberrations allows inferences to be made not only on which histologies but also the specific patients that may respond to a given therapy. This is an observation that could be deduced from previous meta-analysis of gene expression data (33) but has not yet been observed as a trend in microarray-based DNA copy number data. In addition, these analyses validate the nature of published CGH study designs, where the calculation of the most common alterations yields biomarkers that are most likely relevant to the oncogenic phenotype (34). By comparing data across a wide spectrum of cell line histologies, loci that are more commonly aberrant in cell lines than in tumors (perhaps as a result of artificial selection pressure) can be identified. Cell lines harboring these recurrent aberrations may be less faithful models of the primary tumor. Further, understanding which recurrent loci are more associated with cell lines will allow them to be accounted for. For example, models aimed at predicting phenotypes that use one or more of these discordant loci may be suspect.
There are several apparent limitations of these analyses. Most significantly, cell line panels may not provide true representation of the range of phenotypes of the parent histology. For example, DNA amplification of the N-Myc locus in human neuroblastomas, a biomarker for a malignant phenotype, occurs at significantly higher rates in cell lines than tumors (35). The overrepresentation of such genetic changes may reflect a bias in those tumors selected for or those capable of undergoing transformation. Furthermore, although all data sets in this study are derived from sub-megabase resolution microarrays, platform variation can confound analyses. Probe density (i.e., genome-wide spacing), noise levels, and mechanical factors (e.g., normalization process) can vary between platforms and laboratories. Each of these variables is capable of affecting the accurate description of copy number aberrations. Finally, CGH analyses reflect an average of genomic aberrations that have occurred in populations of cells. Due to the confounding dilution effect caused by noncancerous cell populations, all primary tissues in this study (University of Pennsylvania and published data) were subject to macrodissection under light microscopy to maximize the percent tumor. Resulting tumor cell proportions were variable for these studies, with reported minimums ranging from 50% (36) to >70% (21). Although it has been shown that aCGH assays can tolerate up to 50% infusion of normal cells (37), less pure tumors are likely to be prone to increased rates of false negatives. Further, heterogeneous clonal populations may be more common in tumors than in cancer cell lines and can lead to an uninformative profile when assayed by CGH. Ultimately, this possibility can confound the translation from cell line model to tumor genome and provide insight into why aberrations appear in uniformly lower frequencies in tumor data sets.
In the future, this may be remedied by techniques that effectively evaluate the genome of single cells (38). Several data sets showed discordance between tumor and cell line populations, including lung tumors and sarcomas. Whereas tumor heterogeneity and normal cell infusions can account for nonuniform frequencies of alterations, the pattern differences noted in both lung and sarcoma tumors from their cell line models could be the result of the diversity of pathologies within these respective populations. For example, sarcomas represent a host of distinct molecular subtypes (reviewed in ref. 39) that are each likely to be subject to a unique set of DNA copy number alterations. Tumor and cell line populations were represented disproportionately in several prominent sarcoma subtypes, such as leiomyosarcomas (5% of cell lines, 19% of tumors) or teratomas (0% of cell lines, 15% of tumors). Finally, few studies have been devoted to documenting specific changes to DNA copy number profiles associated with the cell line transformation process. These meta-analyses focus on trends across panels of cancers. Although they do not address directly whether individual cell lines maintain the copy number profile of the parental tumor, they do rediscover findings of previous studies. For example, meta-analysis has suggested disproportionately high rates of loss of 9p21.3, a specific acquired alteration that is described in the immortalization of normal epithelium (40). Similarly, the 20q instability associated with cell lines was also noted in several cancers (14, 41). This suggests that querying large sample sets is also a viable means of discriminating cell line–specific copy number instability.
Although genome-wide RNA profiles of cancer cell lines have often been used in recent years to model drug response and other characteristics in vitro (42), it may be advantageous to conduct these investigations with DNA-based profiles or with combined DNA/RNA profiles. First, although careful in vitro modeling of oncogenic activation can dissect RNA profiles that correlate with drug response and may correlate with in vivo tumor profiles (43), the relationship between cell lines used for in vitro studies and their fresh-frozen tissue counterparts is often questionable (16–18). In addition, it is unknown whether successful transcriptional meta-profiles (33) could be independently calculated by exclusively using cell lines. Moreover, the above-mentioned caveats of analyzing potentially heterogeneous cell populations still apply and may be further confounded by the obvious effect of in vitro culture conditions on the transcriptome. When a specific genetic aberration is known, its relevance to modeling drug activity is much more obvious [e.g., BRAF mutations and MEK inhibition (44), MET amplifications and response to MET inhibitors (45), and the above-mentioned studies of KIT and ERB inhibitors]. Moreover, the measurement of discrete genetic aberrations is more easily translated into clinical usefulness.
In summary, we have shown that aCGH-based surveys of tumor cell lines preferentially cluster with their parent histology when considering data set-wide trends in copy number aberrations. This finding supports their usefulness as faithful in vitro models of in vivo tumors across a wide range of solid tumors. Further, this observation enables an analysis of the discrete aberrations that may distinguish specific tumor types. Future work will be needed to elucidate these loci to help understand the histologic specificity of underlying oncogenic aberrations as well as to determine how they interact with known aberrations (i.e., oncogenic mutations).
| Acknowledgments |
|---|
| Footnotes |
|---|
3 Available at: http://acgh.afcri.upenn.edu/tvcl.
Received 10/3/06. Revised 1/31/07. Accepted 2/13/07.
| References |
|---|