Cancer Research 67, 10173, November 1, 2007. Published Online First October 29, 2007;
doi: 10.1158/0008-5472.CAN-07-2102
© 2007 American Association for Cancer Research

Molecular Biology, Pathobiology, and Genetics

A Comparison of DNA Copy Number Profiling Platforms

Joel Greshock1, Bin Feng2, Cristina Nogueira3, Elena Ivanova2, Ilana Perna2, Katherine Nathanson5, Alexei Protopopov2, Barbara L. Weber1 and Lynda Chin2,3,4

1 Translational Medicine, GlaxoSmithKline, King of Prussia, Pennsylvania; 2 Center for Applied Cancer Science, the Belfer Institute for Innovative Cancer Science and 3 Department of Medical Oncology, Dana-Farber Cancer Institute; 4 Department of Dermatology, Harvard Medical School, Boston, Massachusetts; and 5 Abramson Family Cancer Research Institute, University of Pennsylvania, Philadelphia, Pennsylvania

Requests for reprints: Lynda Chin, Department of Medical Oncology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115-6084. Phone: 617-632-6091; Fax: 617-632-6069; E-mail: lynda_chin@dfci.harvard.edu.


    Abstract
 
The accurate mapping of recurring DNA copy number aberrations (CNAs), a hallmark feature of the cancer genome, has facilitated the discovery of tumor suppressor genes and oncogenes. Microarray-based assays designed to detect these chromosomal copy number alterations on a genome-wide and high-resolution scale have emerged as a cornerstone technology in the genomic era. The diversity of commercially available platforms prompted a systematic comparison of five copy number profiling assays for their ability to detect 2-fold copy number gain and loss (4n or 1n, respectively) as well as focal high-amplitude CNAs. Here, using a collection of established human melanoma cell lines, we defined the reproducibility, absolute signals, signal to noise, and false-positive and false-negative rates for each of the five assays against ground truth defined by spectral karyotyping, in addition to comparing the concordance of CNA detection by two high-resolution Agilent and Affymetrix microarray platforms. Our analyses concluded that the Agilent's 60-mer oligonucleotide microarray with probe design optimized for genomic hybridization offers the highest sensitivity and specificity (area under receiver operator characteristic curve >0.99), whereas Affymetrix's single nucleotide polymorphism microarray seems to offer better detection of CNAs in gene-poor regions. Availability of these comparison results should guide study design decisions and facilitate further computational development. [Cancer Res 2007;67(21):10173–80]


    Introduction
 
DNA copy number aberration (CNA) is widely regarded as an important etiology in a range of human diseases (1). Complete and partial nondiploid genomes resulting from cytogenetic alterations have been implicated in the diagnosis of congenital disorders (2) and in prediction of clinical outcomes in many cancer types (3). Furthermore, investigating somatic changes in DNA copy number has proven to be a productive path toward identifying novel tumor suppressor genes and oncogenes. Microarray platforms are of particular utility as they enable the genome-wide detection of CNAs in a high-throughput manner, thus providing useful starting points for agnostic discovery of novel disease genes.

Recent years have witnessed major advancement in copy number profiling technologies beyond the traditional metaphase comparative genomic hybridization (CGH; ref. 4). Early efforts used large PCR-amplified sequences as probes, typically bacterial artificial chromosomes (BAC; ref. 5), or cDNAs (6) and provided a resolution of 1 to 2 Mb. The large probe sequences of these BACs, with stringent preselection, provided robust responses to copy number variations, where nonamplified complex targets exhibited sufficiently low noise levels for consistent detection of single copy aberrations (7). Continued development of BAC clone-based assays has produced arrays with complete sequence coverage of the human genome (8), increasing the effective resolution to ~80 kb. In parallel, a new generation of oligonucleotide-based platforms has taken advantage of synthesized sequences to achieve dense coverage, shedding the dependence on (and thus the limitation of) clone libraries. This diverse generation of array-based copy number assays has inherent flexibility in design and versatility in applications while permitting fabrication of high-density microarrays.

A major difference among these oligonucleotide microarray platforms and the gold-standard BAC-based platforms is probe length. Whereas BAC clones, typically ~150 kb in length, provide a high degree of specificity for the fragmented target sequences, the relatively small sizes of synthesized oligonucleotides offer lower signal to noise ratios for each probe. Optimization in labeling and hybridization protocols coupled with analytic development [e.g., circular binary segmentation (9) that avoids defining alterations associated with a single probe] has shown that sufficient signal to noise could be achieved with 60-mer oligonucleotide probes in full-complexity genomic hybridization (10). Single-channel single nucleotide polymorphism (SNP)-based microarrays designed for genotyping have also been adopted for copy number analysis (11, 12). Such SNP microarrays depend on a separate data set to establish a copy number reference against which to differentiate diploid from aberrant. Of note, genomic hybridization onto these short oligonucleotide probe arrays (typically 25 nucleotide long) typically uses an adaptor ligation PCR step before labeling to reduce target complexity.

Although a fast-evolving technological front, several 60-mer and SNP oligonucleotide microarray assays are now routinely used for copy number profiling. Thus, we reasoned that a systematic assessment of these established platforms with objectively defined variables will generate a well-controlled data set that not only informs investigators in their experimental design but also facilitates development of next generation of improved assays as well as new analytic tools for copy number analyses that address the common and unique computational challenges presented by each platform. To this end, we generated copy number profiles of a defined set of tumor cell lines on five oligonucleotide microarray-based assays of three platforms (Agilent, Affymetrix, and NimbleGen) and determined the reproducibility, signal and noise, as well as sensitivity and specificity of each in detecting 2-fold signals based on spectral karyotyping (SKY)-defined aberrations as ground truth for comparison. In addition, high-density microarray assays from Agilent and Affymetrix platforms were further compared for definition of CNAs in an independent data set using published analytic approaches. All primary and processed profiling data were deposited in the National Center for Biotechnology Information's Gene Expression Omnibus repository for public access, analyses, and tool development (series GSE7822).


    Materials and Methods
 
Melanoma cell lines. In phase 1, seven melanoma cell lines (Supplementary Table S2A) were grown from a single frozen stock in RPMI 1640 supplemented with 10% heat-inactivated fetal bovine serum. DNA was extracted from each line using Mini DNeasy kit (Qiagen, Inc.) and subsequently purified. For two-channel arrays, commercial pooled male DNA from Promega, Inc. was used as reference. For single-channel arrays, 12 unique lymphoblastic cell lines composing part of a human genomic diversity panel (obtained from Coriell Institute, Camden, NJ; ref. 13) were assayed and used as a reference panel (see Analysis below). In phase 2, genomic DNAs were extracted from 18 melanoma cell lines (Supplementary Table S2B) as above and used for labeling and hybridization onto Agilent 244K (AG244K) and Affymetrix 500K (AF500K) microarrays. Same reference DNA or panel was used as above.

Platforms. In phase 1, all seven cell lines were profiled in duplicate on five oligonucleotide-based microarray platforms (Table 1 ). These include Agilent 44K (AG44K) and 185K (AG185K) microarrays, NimbleGen 1500K (NG1500K) density microarray, and Affymetrix 100K (AF100K; Centurion) and AF500K (Mendel) chips. AG185K served as a prototype microarray for the currently commercialized AG244K CGH microarray. Both Agilent and NimbleGen platforms use dual-channel competitive hybridization protocol, whereas Affymetrix platforms use single-channel hybridization. CGH profiles on NimbleGen were generated by the manufacturer, whereas the AG185K array CGH profiles were generated by Agilent Laboratories. The AF100K and AF500K profiles were generated by the Children's Hospital of Philadelphia microarray facility and GlaxoSmithKline. Additionally, for the purposes of serving as a reference, these seven cell lines were also assayed on full tiling path Human reArray BAC clone array. Previously published data on these cell lines from a 1-Mb resolution BAC array were downloaded from public resources (14). In phase 2, 18 cell lines were profiled without duplicate on AF500K chip by Expression Analysis, Inc. and on AG244K by the Belfer Cancer Genomics Center at the Dana-Farber Cancer Institute. In all cases, factory-recommended hybridization protocols were followed as closely as possible for each platform.


 
Table 1. Reproducibility measured by correlation between duplicate hybridizations

 
Spectral karyotyping. SKY was done on four melanoma cell lines, WM88, WM1366, WM983C, and Lu1205. SKY was done using the SkyPaint kit for human samples (Applied Spectral Imaging) according to the manufacturer's protocols. Images were acquired using a Nikon Eclipse E6000 microscope equipped with the SD300 SpectraCube and Spectral Imaging acquisition software. Twenty-four metaphase spreads were analyzed per each sample.

Analysis. Probes for every assay were mapped to human genome build 36 (March 2006) using data provided by the University of California at Santa Cruz genome browser site6 or by the vendor. For each probe on every platform, a log2 copy number ratio was measured from raw data derived from the scanned image. For dual-channel arrays, this ratio was calculated by dividing the test channel image intensity by that of the reference channel for every probe. Probe-wise ratios were calculated for the single-channel Affymetrix chips by comparing the "perfect match" intensities with the range of intensities seen in the reference chip set using the dChip software package (15) and methods described in ref. 12. Ratios of duplicate clones were averaged for all assays. Subsequently, every assay was normalized under the assumption that median copy number was diploid such that the median log2 ratio is zero. Details of this process for each platform can be found as part of the supplement.
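The dual-channel ratio and the median normalization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the intensity values are hypothetical, and the single-channel (dChip-based) procedure is not reproduced.

```python
import math
from statistics import median

def log2_ratios(test, reference):
    """Probe-wise log2(test / reference) for a dual-channel array."""
    return [math.log2(t / r) for t, r in zip(test, reference)]

def median_center(ratios):
    """Shift log2 ratios so that their median is zero (median assumed diploid)."""
    m = median(ratios)
    return [x - m for x in ratios]

# Hypothetical intensities for five probes; the fourth probe lies in a gained region.
test_intensity = [1200.0, 1100.0, 1300.0, 2400.0, 1250.0]
ref_intensity = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0]

normalized = median_center(log2_ratios(test_intensity, ref_intensity))
```

After centering, the probe holding the median value sits at exactly zero, and the gained probe keeps its offset relative to that baseline.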

To quantify the probe-wise signal response of each platform, copy number alterations were identified at four loci in cell line SKY data. This consisted of three distinct subchromosomal gains (4n) ranging from approximately 18 to 103 Mb and one ~100 Mb loss (1n). Additionally, four regions of at least ~45 Mb in size representing diploidy (2n), one in each matching cell line, were identified to serve as a reference for comparison. A signal to noise ratio was calculated as follows.

$$\mathrm{S/N}_{i} = \frac{\mu_{ij}}{\sigma_{ik}}$$

where, for melanoma cell line i, µ represents the median log2 signal from 4n/1n aberrant region j, whereas σ represents the log2 SD from diploid region k. The degree to which individual probe measurements can accurately distinguish gains and losses was estimated by recalculating probe ratio scores for these regions with sliding windows of 1, 3, and 7 probes. This procedure also supplied a platform-specific ratio threshold by identifying the ratio with the minimal degree of overlap for every platform as the most appropriate boundary for discriminating gains and losses for that platform in subsequent analyses.
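A minimal sketch of the signal to noise calculation and the sliding-window recalculation, under stated assumptions: the magnitude of the median is taken so that gains and losses are comparable, the sample SD is used, and all probe values are hypothetical rather than data from this study.

```python
from statistics import median, stdev

def signal_to_noise(aberrant, diploid):
    """|median log2 ratio of the aberrant region| / SD of the diploid region."""
    return abs(median(aberrant)) / stdev(diploid)

def sliding_mean(values, window):
    """Average consecutive probes in a sliding window (window=1 returns the input)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical probe-level log2 ratios, not data from the study.
gain_4n = [0.9, 1.1, 1.0, 0.8, 1.2]
diploid_2n = [0.1, -0.1, 0.2, -0.2, 0.0]

raw_snr = signal_to_noise(gain_4n, diploid_2n)
smoothed_snr = signal_to_noise(sliding_mean(gain_4n, 3),
                               sliding_mean(diploid_2n, 3))
```

Averaging over three probes leaves the median signal essentially unchanged while shrinking the spread of the diploid probes, so the ratio rises.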

Platform specificity for individual probes was assessed from the proportion of probes that were diploid (reference region defined by SKY) that would be classified as aberrant given a sliding log2 ratio threshold, whereas sensitivity was assessed from the reverse scenario (aberrant classified as diploid). Given the optimal ratio thresholds defined by receiver operator characteristic (ROC) analysis, the false-positive rate (FPR) for a platform was calculated by querying the proportion of diploid probes that would be classified as aberrant. The false-negative rate (FNR) is the proportion of aberrant probes classified as 2n.
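The probe-wise error rates at a given threshold can be illustrated for a copy number gain as below. This is a sketch only: the function names, the candidate threshold grid, and the probe values are hypothetical, and choosing the threshold that minimizes FPR + FNR is one plausible reading of "optimal", not necessarily the criterion used in the study.

```python
def error_rates(diploid, aberrant, threshold):
    """Return (FPR, FNR) for calling a gain when log2 ratio > threshold."""
    fpr = sum(x > threshold for x in diploid) / len(diploid)
    fnr = sum(x <= threshold for x in aberrant) / len(aberrant)
    return fpr, fnr

def best_threshold(diploid, aberrant, candidates):
    """Candidate threshold with the smallest FPR + FNR (first one on ties)."""
    return min(candidates, key=lambda t: sum(error_rates(diploid, aberrant, t)))

# Hypothetical probe-level log2 ratios with some overlap between regions.
diploid_2n = [-0.2, -0.1, 0.0, 0.1, 0.3, 0.6]
gain_4n = [0.4, 0.7, 0.8, 0.9, 1.0, 1.1]

candidates = [i / 10 for i in range(0, 11)]
chosen = best_threshold(diploid_2n, gain_4n, candidates)
fpr, fnr = error_rates(diploid_2n, gain_4n, chosen)
```

For a loss, the same logic applies with the inequalities reversed.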

The area under the curve (AUC) of the ROC curve was calculated by dividing the ROC curve into 10,000 pieces of equal width along the X axis. Each piece was approximated as a rectangle, and the AUC was the sum of the areas of the 10,000 rectangles.
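The rectangle approximation can be sketched as follows. The text does not state where each rectangle's height is sampled, so evaluating the curve at the midpoint of each piece (with linear interpolation between ROC points) is an assumption here, as are the function names and the probe values.

```python
def roc_points(diploid, aberrant, thresholds):
    """(FPR, TPR) pairs for calling a gain when log2 ratio > threshold."""
    pts = {(0.0, 0.0), (1.0, 1.0)}
    for t in thresholds:
        fpr = sum(x > t for x in diploid) / len(diploid)
        tpr = sum(x > t for x in aberrant) / len(aberrant)
        pts.add((fpr, tpr))
    return sorted(pts)

def roc_height(points, x):
    """TPR at false-positive rate x, by linear interpolation between points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 > x0 and x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def auc_rectangles(points, pieces=10_000):
    """Sum of `pieces` equal-width rectangles under the ROC curve."""
    width = 1.0 / pieces
    return sum(roc_height(points, (i + 0.5) * width) * width
               for i in range(pieces))

# Hypothetical, fully separated probe values.
diploid_2n = [-0.2, -0.1, 0.0, 0.1, 0.2]
gain_4n = [0.7, 0.9, 1.0, 1.1]
thresholds = [i / 10 for i in range(-10, 21)]

auc = auc_rectangles(roc_points(diploid_2n, gain_4n, thresholds))
```

With fully separated regions the AUC approaches 1, whereas a classifier no better than chance (the diagonal from (0, 0) to (1, 1)) gives 0.5.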

Probe set–independent comparisons between platforms were calculated for each assay using standard circular binary segmentation (9) for both phase 1 and 2 analysis. Focal amplifications (gains more than approximately five copies) and homozygous losses were identified by querying regions ~1 Mb or smaller that were assigned a log2 copy number ratio two times the calibrated gain/loss threshold described above.
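The focal-event query described above might look like the following sketch. The segment representation as (start, end, mean log2 ratio) tuples, the function name, and the example thresholds are hypothetical; only the ~1 Mb size limit and the factor of two come from the text.

```python
def focal_events(segments, gain_threshold, loss_threshold,
                 max_width_bp=1_000_000):
    """Label short segments beyond twice the calibrated gain/loss threshold.

    segments: iterable of (start_bp, end_bp, mean_log2_ratio).
    loss_threshold is negative (e.g. -0.4).
    """
    calls = []
    for start, end, mean_log2 in segments:
        if end - start > max_width_bp:
            continue  # too wide to count as focal
        if mean_log2 >= 2 * gain_threshold:
            calls.append((start, end, "focal amplification"))
        elif mean_log2 <= 2 * loss_threshold:
            calls.append((start, end, "homozygous loss"))
    return calls

# Hypothetical segmented profile and thresholds.
segments = [
    (0, 5_000_000, 0.9),          # high amplitude but too wide
    (6_000_000, 6_400_000, 1.1),  # focal amplification
    (7_000_000, 7_100_000, -1.3), # focal homozygous loss
    (8_000_000, 8_500_000, 0.2),  # near baseline
]
calls = focal_events(segments, gain_threshold=0.4, loss_threshold=-0.4)
```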

In phase 2, CNAs are defined as described in previous studies. A "segmented" data set was generated by determining uniform copy number segment boundaries and then replacing raw log2 ratio for each probe by the mean log2 ratio of the segment containing the probe. Segments at 98th percentile and 2nd percentile were used as amplification and deletion threshold, respectively. All 18 samples in AG244K data set are mode centered based on segmented data before generating CNAs. All 18 samples in AF500K SNP data set are also baseline adjusted based on one chromosome for each sample in AG244K data set. Supplementary Table S5 shows that the median log2 ratio of listed chromosomes is the same in both AG244K and AF500K data sets after baseline adjustment.
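A minimal sketch of the "segmented" data set and the percentile thresholds, assuming segment boundaries are already known (for example, from circular binary segmentation, which is not reimplemented here). The nearest-rank percentile, the function names, and the example values are assumptions; mode centering and baseline adjustment are omitted.

```python
import math

def segment_means(ratios, boundaries):
    """Replace each probe's log2 ratio with the mean of its segment.

    boundaries: probe indices delimiting segments, e.g. [0, 2, 4].
    """
    out = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        seg = ratios[start:end]
        out.extend([sum(seg) / len(seg)] * len(seg))
    return out

def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def cna_thresholds(segment_values, upper_pct=98, lower_pct=2):
    """Amplification and deletion thresholds from segment values."""
    return percentile(segment_values, upper_pct), percentile(segment_values, lower_pct)

# Hypothetical probe ratios split into two segments of two probes each.
segmented = segment_means([0.0, 0.2, 1.0, 1.2], [0, 2, 4])
```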


    Results and Discussion
 
The systematic comparison was conducted in two phases. Phase 1 compared five oligonucleotide microarray-based copy number profiling assays from three established platforms (AG44K and AG185K, AF100K and AF500K SNP, and NG1500K; Supplementary Table S1). The rationale for including two versions of Agilent and Affymetrix platforms was to enable comparison and assessment of potential benefit and trade-off of increased density/resolution within a platform. Of note, AG185K microarray was a prototype to its current AG244K CGH array on the market. Whereas both Agilent and NimbleGen platforms use 60-mer oligonucleotide probes, Agilent probes have been computationally preselected and experimentally optimized for genomic hybridization with bias for gene-rich regions. NimbleGen probes are Tm normalized but not optimized for genomic hybridization; instead, they are designed to tile densely across the entire genome. Currently, their highest resolution assay consists of 3.8 million probes on a set of eight microarrays. NimbleGen's intermediate density assay consisting of 1.5 million probes on four microarrays was used for this study.

Seven melanoma cell lines (Supplementary Table S2A) were expanded in vitro and harvested for metaphase spreads and genomic DNA isolation. Cytogenetic profiles of each cell line were generated in house by SKY and copy number profiles on all platforms were generated by commercial vendors or expert core facility (see Materials and Methods) outside of the authors' laboratory to minimize bias due to technical familiarity. To control for effect of DNA quality on data, same genomic DNA preparations were used for all platforms. The resultant data set was analyzed for reproducibility, signal, noise, sensitivity, and specificity of 2-fold copy number alteration detection as well as identification of known focal CNAs.

Reproducibility. The first variable we determined was the reproducibility of replicate hybridizations for each assay. Here, reproducibility was measured by comparing correlation scores between replicates. As shown in Table 1, the Agilent-optimized CGH arrays offered the highest degree of reproducibility in replicate hybridizations, whereas the single-channel Affymetrix SNP arrays were intermediate in this respect. For both Agilent and Affymetrix platforms, the degree of reproducibility was higher for the higher-density assays (AG44K versus AG185K with P = 0.0436, paired t test; AF500K versus AF100K with P = 0.0223, paired t test), likely reflecting more consistent detection of focal aberrations with additional reporting probes as well as design/manufacturing advances (e.g., probe selection). Although it is conventional to do duplicate hybridizations for dual-channel assays (e.g., Agilent and NimbleGen) and single hybridization for one-channel assay (e.g., Affymetrix), to achieve the most parallel comparison possible, all subsequent analyses in this study are based on single hybridization for either dual- or single-channel assays.

Sensitivity and specificity in detection of regional gain and loss. Two-fold change in copy number translates into detection of 1n (heterozygous loss) or 4n (two copy gain) relative to baseline 2n (diploid) genome. The ability to detect these low-amplitude events depends on absolute signals and signal to noise ratios, thus serving as an ideal test to assess the robustness of a platform. Here, we first determined the ploidy and defined regions of "ground truth" for comparison based on the SKY profiles (16). In particular, four of the seven cell lines were selected, each determined to be predominantly diploid and harboring large contiguous genomic regions (>20 Mb) with 2-fold gain (4n; WM983C, WM88, and Lu1205) or 2-fold loss (1n; WM1366; Supplementary Figs. S1–4; Table 2). Within these defined genomic regions, individual probe values from each of the platforms were used for calculation of signal and noise. Absolute signal was calculated as the mean probe values reporting on regions of 4n or 1n, whereas noise was defined as SD of probes reporting on the defined 2n region. On a log2 scale, the theoretical maximum for 2-fold signal is 1.0. As shown in Table 2, the strongest absolute signal achieved was 0.93 by AG185K microarray in WM88 cell line. Among the three cell lines with 4n gain (WM983C, WM88, and Lu1205), signal was poorest for WM983C in all assays regardless of platform, consistent with the fact that this cell line consists of two major subpopulations as revealed by SKY (Supplementary Table S3). In other words, heterogeneity within a sample will result in lower signals of observed CNAs, a variable of importance when one considers analyses of primary tumor tissues consisting of both tumor and stromal populations.
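The theoretical 2-fold signal and the dampening effect of sample heterogeneity can be illustrated with a simple mixture model. This model (a fraction of cells carrying the aberration, the remainder diploid) is an illustrative assumption and is not taken from the study; the 0.5 fraction is hypothetical.

```python
import math

def expected_log2_ratio(copy_number, aberrant_fraction=1.0, baseline=2):
    """Expected log2 ratio when only a fraction of cells carries the aberration."""
    mean_copies = (aberrant_fraction * copy_number
                   + (1.0 - aberrant_fraction) * baseline)
    return math.log2(mean_copies / baseline)

pure_gain = expected_log2_ratio(4)         # 4n in every cell
pure_loss = expected_log2_ratio(1)         # 1n in every cell
mixed_gain = expected_log2_ratio(4, 0.5)   # 4n in half of the cells
```

Under this model a homogeneous 4n region gives a log2 ratio of 1.0 and a 1n region gives -1.0, whereas a 4n gain present in only half of the cells gives log2(1.5), or about 0.58.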


 
Table 2. Absolute signals and signal to noise for 2-fold copy number alteration

 
Consistent with absolute signal levels, signal to noise similarly revealed the most robust detection by the Agilent microarrays (Table 2). Another way to visualize this signal to noise metric is by density plot of probe values for the paired regions (2n versus 1n or 4n; Fig. 1A, two cell lines shown). A density plot with clear bimodal separation of probes reporting on region of 4n and 2n reflects an assay with high signal to noise, whereas a density distribution with significant overlap represents low signal to noise (Fig. 1A, compare middle column: AG185K to AF500K). By calculating probe-wise FPR and FNR based on the proportion misclassified at an optimal threshold set at the trough of the density plot, we found that the FPR and FNR were lowest for the AG185K microarray assay (Fig. 1A; data not shown). Lastly, we generated the ROC curves and computed the AUC (see Materials and Methods) to assess the trade-offs between sensitivity and specificity because more stringent thresholds are expected to decrease sensitivity while increasing specificity. Consistent with the other variables above, Agilent's optimized oligonucleotide array showed the best sensitivity and specificity, achieving AUC of >0.99 (Fig. 1B; Supplementary Fig. S5).


 
Figure 1. The sensitivity and specificity of each copy number platform seems related to both probe spacing and length. A, top, given an optimized ratio threshold for identifying aberrant probes, the AG185K classified the fewest probes incorrectly when comparing regions of 4n and 2n in the melanoma cell line Lu1205, whereas the NG1500K had the highest degree of overlap in probe values between these regions; bottom, the platform-wise trends were similar when comparing a heterozygous loss with a 2n region in WM1366. B, ROC analysis of a copy number gain (Lu1205) and loss (WM1366) across a range of ratio thresholds shows that the AG44K and AG185K chips most readily distinguished aberrant from nonaberrant probes. Both SNP chips were intermediate to the Agilent and NimbleGen assays.

 
Although the NimbleGen array did least well with these metrics, we reasoned that its high-density nature should allow data smoothing over a sliding window of multiple probes. This commonly used process pulls the probe values closer to the overall mean for the window, thereby reducing noise without significant sacrifice of resolution. To this end, we averaged the values of three or seven consecutive probes in a sliding window fashion across the regions and replotted the density distribution for each (Fig. 2A). As expected, the degree to which sliding window smoothing enhanced the separation between aberrant and diploid regions was most profound for the two assays with the lowest signal to noise ratios (AF100K and NG1500K platforms; Fig. 2A). In fact, with a smoothing window of seven probes, NG1500K platform was able to achieve an AUC of ≥0.95 on the resultant ROC curves (Fig. 2B), resulting in an "effective" resolution of 14 kb, down from 2 kb. Similarly, for both AF100K and AF500K platforms, a smoothing window of three consecutive probes enabled them to perform comparably with the Agilent platforms, with AUCs of ≥0.97, resulting in effective resolutions of 76.2 and 17.7 kb, respectively.
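The effective resolution figures are consistent with multiplying the probe spacing by the smoothing window, as in the sketch below. That relationship is inferred from the NG1500K numbers in the text (7 probes × 2 kb = 14 kb) rather than stated by the authors, and the AF100K spacing of 25.4 kb is back-calculated from the reported 76.2 kb, not a published specification.

```python
def effective_resolution_kb(probe_spacing_kb, window):
    """Approximate resolution after averaging `window` consecutive probes."""
    return probe_spacing_kb * window

ng1500k = effective_resolution_kb(2.0, 7)   # 14.0 kb, as reported
af100k = effective_resolution_kb(25.4, 3)   # spacing implied by 76.2 kb / 3
```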


 
Figure 2. Smoothing over consecutive probes reduced noise and improved performance but resulted in a lower effective resolution. A, improved bimodal separation of signal and noise with smoothing over three (red) or seven (blue) consecutive probes when compared with no smoothing (n = 1; black). Note the most dramatic improvement with the NG1500K platform. B, with a smoothing window of three for AF100K and AF500K, and a smoothing window of seven for the NG1500K platform, performance as measured by AUC on ROC curves was improved to levels comparable with those of AG44K and AG185K with no smoothing. Resultant effective resolutions for AF100K and AF500K with smoothing of three probes are 76.2 and 16.4 kb, respectively, and for NG1500K with smoothing of seven probes is 14.0 kb, all within ranges of resolution achievable by the Agilent platform despite lower probe density.

 
Detection of focal CNAs. The above comparison highlighted the advantages of optimized probe design and longer oligonucleotide probe length in detecting low-amplitude aberrations where robust signal to noise is required. Next, we examined the performance of these platforms to define focal and high-amplitude CNAs as such would be expected to rely heavily on density of probe coverage. In particular, we focused first on the highly recurrent 9p21 deletion centering on the CDKN2A locus. The CDKN2A locus, the so-called familial melanoma locus, encodes p16INK4A and p14ARF, two bona fide tumor suppressors of the pRB and p53 pathways that are implicated in multiple human cancer types, including melanomas. Of the seven melanoma cell lines, WM88 was reported in the literature to harbor homozygous deletion of CDKN2A, and three others heterozygous loss (14). For the purpose of our comparison, we considered a deletion true when detected by at least two assays, where a homozygous deletion was estimated by applying a ratio threshold that was two times the optimized single copy deletion threshold as calculated by FPR/FNR analysis (Supplementary Table S4). By this metric, homozygous deletion of CDKN2A was present in four of seven cell lines (Table 3, bold). Three platforms (AG44K, AG185K, and AF500K) were able to detect this event in all four cell lines, whereas the other two (NG1500K and AF100K) detected three of the four. The AF100K profile missed the CDKN2A deletion in Lu1205 likely due to poor probe coverage (Supplementary Fig. S7). The minimal common region of this deletion in Lu1205 was ~100 kb covered by only 6 probes on the AF100K SNP array, in contrast to 8 to 48 probes by the other microarrays. In the case of WM35, where all other platforms detected a CDKN2A deletion, the NimbleGen platform showed a hemizygous loss, suggesting that the probes surrounding this locus may not offer optimal signal.


 
Table 3. Detection of focal minimal common regions (trough or peak ratios indicated)

 
Copy number amplification in the genomic region surrounding KRAS (12p12.1), a commonly mutated oncogene and member of the RAS gene family, was detected by all assays in the Lu1205 cell line. The minimal common region (~1.5 Mb) was well covered on all of the microarrays, likely contributing to the high level of concordance. In contrast, amplification of SNAI2, a transcriptional repressor associated with neural crest cell development and migration required for metastasis of transformed melanoma cells (17), was identified only by the high-density microarrays (AG185K, AF500K, and NG1500K). The detection failure by the AG44K and AF100K platforms can be explained by poor probe coverage, where these platforms had only one and two probes, respectively, reporting within the 150 kb minimal common region identified by the higher-density platforms. Therefore, high-density coverage across the genome offers an important advantage in detection of focal events.

Genome-wide catalogues of CNAs. Phase 1 study above showed comparable detection of three known CNA events by high-density Agilent and Affymetrix platforms; thus, we next compared these two platforms in cataloguing known and unknown CNAs in a cohort of 18 melanoma cell lines. For this phase 2 comparison, the highest-density microarrays available at the time (i.e., AF500K and AG244K) were used. As before, same preparations of genomic DNA from all 18 cell lines were used for profiling. The Agilent profiles were generated by the authors' laboratory for this part of the study, whereas the Affymetrix profiles were generated by a commercial vendor. All 18 profiles from both platforms were processed by circular binary segmentation algorithm (9) and CNAs were defined as previously reported (see Materials and Methods; refs. 12, 18, 19). Unlike phase 1 of the study, it is not possible to generate a ground-truth CNA list against which to compare performance of these two platforms. Therefore, we limited our analyses here to concordance of high-amplitude CNAs between platforms.

First, we defined a list of CNAs with amplitudes in the top or bottom 2% of all segment values detected by each platform (see Materials and Methods); this translated to log2 thresholds approximating twice the optimal thresholds for 2-fold signal detection (e.g., log2 ratio >0.868 or <–0.858 versus >0.498 or <–0.471 for AG244K versus AF500K, respectively). As summarized in Table 4, AG244K platform defined a total of 485 unique CNAs (260 amplifications and 225 deletions) among these 18 cell lines, whereas the AF500K detected 476 CNAs (177 amplifications and 299 deletions). Collectively, 29% (215 of 749) of these unique CNAs were common between the two platforms. Concordance among the amplification events seemed higher (137 of 300, 46%) than that for estimated homozygous deletions (78 of 447, 17%).


 
Table 4. Summary of CNAs detected in 18 melanoma cell lines on AG244K and AF500K platforms

 
We reasoned that the overall low concordance between these platforms could be attributed to the fact that Agilent CGH probes favor coverage in gene-rich regions, whereas the Affymetrix SNP probes are intended to capture allelic difference across the genome. Therefore, nonconcordance in CNA identification within genomic regions that are not well represented on both microarray platforms would be expected. To circumvent this issue, we selected the subsets of CNAs that reside in regions with at least four reporting probes on both platforms, which included 361 CNAs by AG244K and 182 CNAs by AF500K. For the 361 CNAs defined by AG244K, 194 were similarly detected on the AF500K platform, a concordance rate of 54% (in contrast to only 17% concordance among the CNAs in regions covered by three or fewer probes; Table 4). The median width of these 194 CNAs delimited by the AG244K profiles was 0.255 Mb (range, 0.016–10.8 Mb), whereas the median of the same 194 CNAs delimited by AF500K was 0.513 Mb (range, 66 bp to 8.18 Mb). Similarly, of the 182 CNAs defined by AF500K, 123 were also detected on the AG244K platform, a concordance rate of 68% (in contrast to only 14% concordance in regions with poor coverage). The median size of these 123 CNAs as delimited by AF500K was 0.305 Mb (range, 0.029–8.18 Mb), whereas the median of the same 123 CNAs by AG244K was 0.288 Mb (range, 0.007–9.66 Mb). Therefore, within well-covered genomic regions, concordance of CNA detection by each platform was similar.
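The concordance rates quoted in this and the preceding paragraph follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the dictionary labels and function name are illustrative only.

```python
def concordance_percent(shared, total):
    """Percentage of CNAs detected on both platforms."""
    return 100.0 * shared / total

# (concordant CNAs, total CNAs) as reported in the text.
reported = {
    "all unique CNAs": (215, 749),
    "amplifications": (137, 300),
    "homozygous deletions": (78, 447),
    "AG244K CNAs, >=4 probes on both platforms": (194, 361),
    "AF500K CNAs, >=4 probes on both platforms": (123, 182),
}

rates = {label: round(concordance_percent(*counts))
         for label, counts in reported.items()}
```

Rounded to whole percentages, these reproduce the 29%, 46%, 17%, 54%, and 68% figures given in the text.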

Our phase 1 comparison on known CNA detection (Table 3) indicated that one important contributing factor to discordance between platforms is the absolute signal of detection by a platform. Thus, we next asked whether the peak log2 ratio among the concordant CNAs were higher than the nonconcordant events. Indeed, among the 182 CNAs defined by AF500K, the average peak log2 ratio for the 123 concordant events was 0.85 as opposed to 0.66 for the 59 nonconcordant CNAs (P = 0.00018, t test). Similarly, among the 361 CNAs defined by AG244K, the average peak log2 ratio for the 194 concordant CNAs was 1.76 as opposed to 1.30 for the 167 nonconcordant CNAs (P = 2.65E–6, t test). In other words, high-amplitude events are more likely to be of higher confidence.

Lastly, given the design differences between these two platforms, we expected that CNAs detectable only by AF500K profiles would be more likely to reside in genomic regions with fewer annotated genes. Indeed, among the 182 AF500K CNAs, the 59 nonconcordant events contained an average of 5.1 annotated genes, whereas the 123 concordant CNAs contained an average of 11.7 genes (P = 0.04, t test). On the other hand, the average number of resident genes in AG244K-defined CNAs did not differ significantly between those that were concordant and those that were nonconcordant with AF500K (7.9 versus 7.6; P = 0.823, t test). Therefore, CNAs targeting genetic elements other than annotated genes are more likely to be missed by the AG244K platform.

Conclusion. Taking advantage of SKY to define large regional events as ground truth for comparison, we were able to determine a set of variables, including reproducibility, absolute signal, and signal-to-noise ratio as well as ROC curves, with which to compare objectively the robustness of five oligonucleotide microarray-based genome-wide copy number assays from three different platforms. These comparisons convincingly showed that longer oligonucleotide probes optimized for genomic hybridization offer the most robust detection of CNAs. Furthermore, increased density of probe coverage not only improves resolution but also enhances confidence of detection by providing more data points reporting on a particular genomic event. An advantage of the Affymetrix platform over Agilent is its broader and more even coverage across the genome, which increases the probability of detecting CNAs targeting noncoding genetic elements. On the other hand, Agilent CGH microarrays offer more robust and focal detection of CNAs targeting gene-rich regions. Availability of these data sets should encourage computational algorithm development for improved copy number modeling.


    Acknowledgments
 
Grant support: NIH grants R01 CA99041, R01 CA93947, P50 CA93638, and U24 CA126554 (L. Chin).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.


    Footnotes
 
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

J. Greshock and B. Feng contributed equally as first author.

6 http://genome.ucsc.edu

Received 6/6/07. Revised 8/7/07. Accepted 8/21/07.



