The development and use of molecular-based therapy for breast cancer and other human malignancies will require a detailed molecular genetic analysis of patient tissues. The recent development of laser capture microdissection and high density cDNA arrays now provides a unique opportunity to generate gene expression profiles of cells from various stages of tumor progression as it occurs in the actual neoplastic tissue milieu. We report the combined use of laser capture microdissection and high-throughput cDNA microarrays to monitor in vivo gene expression levels in purified normal, invasive, and metastatic breast cell populations from a single patient. These in vivo gene expression profiles were verified by real-time quantitative PCR and immunohistochemistry. The combined use of laser capture microdissection and cDNA microarray analysis provides a powerful new approach to elucidate the in vivo molecular events surrounding the development and progression of breast cancer and is generally applicable to the study of malignancy.
The elucidation of the genetic events underlying the initiation and progression of human breast cancer has been hampered by limitations inherent to both in vitro and in vivo methods of study. The most significant limitation of in vitro-based systems is that genetic information derived from cell lines may not accurately reflect the molecular events taking place in the actual tissue milieu from which they were derived. On the other hand, in vivo genetic analysis of breast cancer has been limited by our inability to directly and specifically procure pure populations of cells from complex heterogeneous tissue (1). The recent development of LCM,³ a technique that allows for the rapid, reliable, and accurate procurement of cells from specific microscopic regions of tissue sections under direct visualization, now affords the opportunity to perform molecular genetic analysis of pure populations of malignant breast cells in their native tissue environment (2). This technical advance helps overcome the limitations associated with traditional in vivo and in vitro approaches.
The advent of high-density cDNA microarray technology (3) , with its capacity for simultaneous monitoring of thousands of genes, provides a unique opportunity for high-throughput genetic analysis of cancer. Although most current microarray studies have been performed with in vitro-derived genetic material from both mammalian and nonmammalian systems (4, 5, 6) , a major leap in functional genomic investigations would be the ability to perform array-based expression analysis with in vivo-derived genetic material originating from morphologically distinct cellular subpopulations within neoplastic tissue. Here we report the first application of combining LCM and cDNA microarray technologies to analyze gene expression in a clinical cancer specimen. Furthermore, we demonstrate that expression profiles of greater than 8000 genes can be successfully generated using nonamplified RNA derived from distinct cell populations within several different morphological stages of human breast cancer progression. Expression profile data were verified by real-time quantitative PCR and immunohistochemistry. Our results indicate that high-throughput in vivo gene expression analysis can be achieved and should be of value in elucidating the genetic events associated with breast cancer progression.
Materials and Methods
All tissue used for this study was obtained from a modified radical mastectomy specimen from a single patient. Tissue in excess of what was necessary for diagnostic purposes was obtained <15 min after removal from the patient, embedded in TissueTek OCT medium (VWR Scientific Products Corporation, San Diego, CA), and frozen in liquid nitrogen. The tissues were sectioned at 8 μm in a cryostat, mounted on uncoated glass slides, and immediately stored at −80°C. Slides containing frozen sections were immediately fixed in 70% ethanol for 30 s, stained with H&E, followed by 5-s dehydration steps in 70, 95, and 100% ethanol and a final 5-min dehydration step in xylene. Once air-dried, the sections were laser microdissected with a PixCell I and II LCM system from Arcturus Engineering (Mountain View, CA). Following the standard protocol of Emmert-Buck et al. (2), ∼0.5 × 10⁵ morphologically normal breast epithelial cells, malignant invasive breast carcinoma cells, and malignant metastatic (to an axillary lymph node) breast carcinoma cells were “laser captured.” Each population was estimated to be >98% “homogeneous” as determined by microscopic visualization of the captured cells.
RNA Extraction from Microdissected Samples.
The total RNA from each population of laser captured cells was independently extracted by means of a modification of the RNA microisolation protocol as described (2) . Briefly, the transfer film and adherent cells were incubated with guanidinium isothiocyanate buffer at room temperature, extracted with phenol/chloroform/isoamyl alcohol, and precipitated with sodium acetate and glycogen carrier (10 μg/μl) in isopropanol. After initial recovery and resuspension of the RNA pellet, a DNase step was performed for 2 h at 37°C using 10 units of DNase (GenHunter, Nashville, TN) in the presence of 10 units of RNase inhibitor (Life Technologies, Inc., Gaithersburg, MD), followed by reextraction and precipitation. The pellet was resuspended in 27 μl of RNase-free H2O; one-third (9 μl) of the total RNA from each sample was used for RTQ-PCR analysis, and the remaining two-thirds (18 μl) were used for high-density cDNA array analysis.
RNA Labeling and Hybridization.
For each labeling, total RNA corresponding to ∼1.7–2.0 × 10⁴ cells was reverse-transcribed in the presence of 50 μCi of [33P]dCTP, 50 μCi of [33P]dATP, 500 ng of Oligo-dT, and 200 units of SuperScript II RT (Life Technologies, Inc.). The second strand was synthesized in the presence of 50 μCi of [33P]dCTP, 50 μCi of [33P]dATP, 500 ng of random hexamers, and 2500 units of large fragment DNA polymerase I (Life Technologies, Inc.). The labeled, double-stranded cDNA was denatured and hybridized to the cDNA GeneFilter arrays as follows. The GeneFilters were prehybridized at 42°C in a roller oven (Hybaid; Midwest Scientific, St. Louis, MO) with 1.0 μg/ml poly-dA (Research Genetics, Inc., Huntsville, AL) and 1.0 μg/ml Cot1 DNA (Life Technologies, Inc.) in 5 ml of Microhyb solution (Research Genetics, Inc.) for at least 2 h. After an overnight hybridization with the radiolabeled probe, the filters were washed twice at 50°C in 2× SSC (1× SSC: 15 mM trisodium citrate and 150 mM NaCl), 1% SDS for 20 min and once at room temperature in 0.5× SSC, 1% SDS for 15 min. The filters were then exposed overnight to a Packard screen and scanned at 50-μm resolution in a phosphorimager instrument (Cyclone Instrument from Packard, Inc.). After each hybridization, the filters were stripped by boiling in 0.5% SDS solution and scanned for residual hybridization signal.
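The cell-equivalent input per labeling reaction can be checked against figures given elsewhere in this paper. The Python sketch below is only an illustration of that arithmetic: it assumes the ∼50,000 captured cells (0.5 × 10⁵) reported for each population, the two-thirds share of RNA reserved for array analysis, and the division of that share into two labeling reactions described under Results. The function name and defaults are hypothetical, not part of the published protocol.

```python
from fractions import Fraction


def cells_per_labeling(captured_cells, array_fraction=Fraction(2, 3), labelings=2):
    """Cell equivalents of total RNA entering a single labeling reaction."""
    return captured_cells * array_fraction / labelings


# ~0.5 x 10^5 cells captured per population (figure taken from the text)
per_reaction = cells_per_labeling(50_000)
print(f"{float(per_reaction):.2e} cell equivalents per labeling")
```

Under these assumptions the result is roughly 1.7 × 10⁴ cell equivalents, the lower end of the range quoted above.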
The tiff images resulting from the phosphorimager were directly imported into the image analysis software Pathways (Research Genetics, Inc.). The software uses control spots present throughout the filter to align the images and performs autocentering, which aligns and centers well-shaped spots and deforms the calculated grid around spots that have a high confidence factor. When comparing two images, the software normalizes the two different hybridizations on the basis of the average total intensity on each filter. The software locates, calculates, and stores each cDNA spot intensity from each tiff file and simultaneously compares two different normalized tiff images. The differential expression ratios represent the average of two independent experiments.
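The normalization and comparison steps described above can be sketched in Python. This is a minimal illustration and not the Pathways implementation: it assumes each filter is represented as a mapping from clone identifier to spot intensity, scales each filter by its mean intensity, and averages the ratios from two experiments arithmetically (the averaging method is not stated in the text). All names and intensity values are hypothetical.

```python
def normalize(filter_intensities):
    """Scale spot intensities so that the filter's mean intensity equals 1.0."""
    mean = sum(filter_intensities.values()) / len(filter_intensities)
    return {clone: value / mean for clone, value in filter_intensities.items()}


def expression_ratios(reference, sample):
    """Per-clone ratio sample/reference after normalizing each hybridization."""
    ref_n, smp_n = normalize(reference), normalize(sample)
    return {clone: smp_n[clone] / ref_n[clone] for clone in ref_n}


def average_ratios(first, second):
    """Arithmetic mean of two independent ratio measurements (an assumption)."""
    return {clone: (first[clone] + second[clone]) / 2 for clone in first}


# Hypothetical spot intensities for three clones in two independent experiments.
# The second experiment differs only in overall signal strength.
normal_1 = {"cloneA": 100.0, "cloneB": 200.0, "cloneC": 300.0}
tumor_1 = {"cloneA": 400.0, "cloneB": 400.0, "cloneC": 400.0}
normal_2 = {"cloneA": 50.0, "cloneB": 100.0, "cloneC": 150.0}
tumor_2 = {"cloneA": 800.0, "cloneB": 800.0, "cloneC": 800.0}

averaged = average_ratios(
    expression_ratios(normal_1, tumor_1),
    expression_ratios(normal_2, tumor_2),
)
print(averaged)
```

Because each hybridization is scaled by its own mean intensity, the two experiments yield the same ratios even though their raw signals differ by a constant factor, which is the purpose of the normalization step.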
Microarray cDNA Filters.
The clone selection was based on the criteria that the clones: (a) contain the 3′ untranslated region; (b) are of average size (∼1 kb); and (c) originated from oligo-dT primed libraries. These selected clones have been sequence verified at the sequencing facilities of Research Genetics. All of these clones are from the IMAGE libraries. After PCR amplification, 10 ng of insert cDNA was printed on a charged nylon membrane by a custom-made robot. Genes (n = 5184) were spotted on a 5 × 7-cm nylon membrane. Another 576 spots consisted of total genomic DNA, which served as reference points for the image analysis software, for normalization purposes, and for verifying the homogeneity of the hybridization. The GF211 GeneFilter contained 4000 named genes, and the CBGF contained 2800 ESTs and 2384 named genes. GF211 and CBGF shared 1100 cDNAs in common; thus, the total number of genes scanned was 8084.
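The total of 8084 cDNAs follows from the filter contents listed above. A short Python check of that arithmetic, using only the numbers given in the text (the variable names are illustrative):

```python
gf211_named = 4000   # named genes on GF211
cbgf_ests = 2800     # ESTs on the CBGF
cbgf_named = 2384    # named genes on the CBGF
shared = 1100        # cDNAs present on both filters

cbgf_total = cbgf_ests + cbgf_named
unique_cdnas = gf211_named + cbgf_total - shared

print(cbgf_total, unique_cdnas)  # 5184 8084
```

The CBGF total of 5184 agrees with the number of genes spotted on the membrane described above, and subtracting the shared cDNAs once gives the 8084 reported.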
One-third of the same total RNA pool used for the GeneFilter hybridizations was reverse-transcribed using 50 μg/ml oligo(dT), 500 μM deoxynucleotide triphosphate, and 200 units of SuperScript II reverse transcriptase (Life Technologies, Inc.) for 1 h at 37°C, and the resulting first-strand cDNA was diluted and used as template for the following RTQ-PCR analysis. Sequences for genes identified using array technology were determined by direct sequence analysis and confirmed using National Center for Biotechnology Information (NCBI) GenBank and Unigene databases. The specificity of amplicon sequence selection was determined using two methods: (a) primer and probe sequences that specifically detect the experimental gene sequence, as determined by means of the NCBI Blast module, were used; (b) amplicons generated during the PCR reaction were analyzed using the first derivative primer melting curve software supplied by Perkin-Elmer/Applied BioSystems. Analysis of gene expression was generated using an ABI Prism 7700 Sequence Detection System (TaqMan), which uses the 5′ nuclease activity of Taq DNA polymerase to generate a real-time quantitative DNA analysis assay (7, 8). A nonextendable oligonucleotide hybridization probe with 5′ fluorescent and 3′ rhodamine (quench) moieties is present during the extension phase of the PCR. Degradation and release of the fluorescent moiety attributable to the 5′ nuclease activity results in peak emission at 518 nm and is monitored every 8.5 s by a sequence detector. The increase in fluorescence is monitored during the complete amplification process (real-time). A relative standard curve representing four 4-fold dilutions of breast stock cDNA (1:2.5, 1:10, 1:40, and 1:160) was used for linear regression analysis of unknown samples. The expression of the housekeeping gene, cyclophilin 33A, was used to normalize for variances in input cDNA.
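The relative standard curve and housekeeping-gene normalization described above can be illustrated with a short Python sketch. It rests on several assumptions that are not stated in the text: the regression is taken to be threshold cycle (Ct) against log10 of the relative cDNA amount, the Ct values are invented and idealized (2 cycles per 4-fold dilution), and a single curve is used for both the target gene and cyclophilin for brevity, whereas in practice each amplicon would normally have its own curve. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_quantity(ct, slope, intercept):
    """Invert the standard curve: Ct -> amount relative to stock cDNA."""
    return 10 ** ((ct - intercept) / slope)


# Dilution series from the text; Ct values are hypothetical and idealized
dilutions = [1 / 2.5, 1 / 10, 1 / 40, 1 / 160]
standard_ct = [20.0, 22.0, 24.0, 26.0]
slope, intercept = fit_line([math.log10(d) for d in dilutions], standard_ct)


def normalized_expression(target_ct, reference_ct):
    """Target quantity divided by the housekeeping (cyclophilin) quantity."""
    target = relative_quantity(target_ct, slope, intercept)
    reference = relative_quantity(reference_ct, slope, intercept)
    return target / reference


print(round(slope, 3))
print(round(normalized_expression(24.0, 22.0), 3))
```

In this idealized example a target that crosses threshold 2 cycles later than the housekeeping gene is assigned one-quarter of its relative abundance.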
The sequences of the PCR primer pairs and fluorogenic probe (5′ to 3′), respectively, that were used for each gene are as follows: cyclophilin 33, GCTGCCTGTGCACTCATGAA, CAGTGCCATTGTGGTTTGTGA, and 6FAM-ATCACCGCCCTGGCACATGAACTG-TAMRA; apolipoprotein D, GAGAAGATCCCAACAACCTTTGA, TGATCTTTCCGTTTTCCATTAGTGA, and 6FAM-ATGGACGCTGCATCCAGGCCAACTA-TAMRA; heat shock factor 1, CCTGCAGGTTGTTCATAGTCAGAA, TCCGTCCATCCACTGTGTGTATA, and 6FAM-ACACAACTGTCCCGTTCCCCGCTC-TAMRA; BRCA-1, GGCTATGCAAGGGTCCCTTA, TGGTGGCGTTTAAATGGTTTT, and 6FAM-TCTCCCTTGGAAATCTGCCATGAGC-TAMRA; SWI/SNF, GGCTGGGAGGACTGGTGTT, TTTCCAAACCTGCCAGAAGTG, and 6FAM-AAGCCCTAGGCCCACCCTCCTCA-TAMRA; and β-adrenergic receptor kinase 1, GGCTCCTGTGCCCTTATTCAG, CTGCCAATGCCACTCTCTCA, and 6FAM-ACTCCCACTTCCCTGACACTGCGG-TAMRA. The fluorogenic probes carry FAM at the 5′ end and TAMRA at the 3′ end.
For immunohistochemical staining, frozen tissue sections (8 μm) adjacent to those used for LCM were mounted on slides and fixed in 10% neutral buffered formalin for 8 min. The slides were preincubated with mouse serum (1:50 dilution) for 20 min at room temperature to block nonspecific binding and incubated with the anti-apolipoprotein D antibody 8CD6 (Signet Laboratories, Dedham, MA) at a 1:40 dilution for 20 min at room temperature. The slides were washed three times in PBS, incubated with PBS/0.3% H2O2 for 30 min, and washed three times in PBS. Sections were incubated with biotinylated anti-mouse antibody (Vector Laboratories, Burlingame, CA), washed in PBS, incubated with the ABC reagent (Vector Laboratories) for 1 h, washed, and developed according to the manufacturer’s recommendations. The tissue was postfixed in 4% formalin and counterstained with hematoxylin.
Results and Discussion
Microdissection and cDNA Array Analysis.
Although the combined use of LCM and cDNA arrays provides a unique opportunity to study gene expression of subpopulations of cells in their native (in vivo) tissue environment, such technologies have not been applied to clinical cancer specimens. To demonstrate the feasibility of integrating and applying these technologies to such specimens, we examined the differential gene expression between normal and malignant breast epithelial cells in a single clinical breast cancer specimen. Normal breast epithelial cells, invasive cells, and metastatic breast cancer cells (∼1 × 10⁵ cells from each target population) were cleanly captured by LCM (Fig. 1), and total RNA was isolated. One-third of the isolated RNA was set aside for quantitative PCR validation studies (see below), and the remaining two-thirds were used for the generation of radiolabeled probe. This fraction of the RNA from each target population was divided in two, and a total of six (two for each target cell type) independent radiolabeling reactions were performed. To ensure that the cDNA products are proportional to the initial gene expression profile, we generated a 33P-labeled microarray probe by oligo(dT) direct reverse transcription of total RNA, followed by second-strand cDNA synthesis. Probe derived from each target cell was simultaneously hybridized to two different microarray nylon filters (designated GF211 and CBGF; see “Materials and Methods” for filter designs) containing a combined total of 8084 cDNAs. To avoid system variability that may be associated with the use of different filters, we performed sequential hybridizations on the same set of filters with each of the different cDNA probes. Furthermore, each hybridization to these two GeneFilters was performed in duplicate. Each hybridized filter was scanned with a phosphorimager, the resulting tiff file images were obtained, and comparative analysis was performed with Pathways custom image analysis software.
The quality of the microarray hybridizations is demonstrated with the CBGF in which the same GeneFilter was sequentially hybridized with probes derived from normal, invasive, and metastatic cells (Fig. 2A); comparable data were obtained with the GF211 filters. To determine whether a hybridization signal corresponding to a particular cDNA is reproducible between GeneFilters, we compared duplicate gene expression profiles for each of three different cell types (normal, invasive, and metastatic cells). With this approach, <0.15% variability in gene expression was observed between normalized duplicate hybridizations.
Comparative differential gene expression analysis of normal cells versus invasive cells revealed that 90 genes had significantly altered levels of expression by 2-fold or greater; 22 and 68 genes were found to be differentially expressed in the GF211 and CBGF, respectively. Identical analysis of normal cells versus metastatic cells demonstrated 23 genes differentially expressed in GF211 and 89 genes in CBGF. Of these genes, 4 from GF211 and 25 from CBGF are differentially expressed in both invasive and metastatic breast carcinoma cells, as compared with normal cells. Furthermore, of the 202 genes that are differentially expressed (in both invasive and metastatic cells), 83 are ESTs or ESTs that demonstrate varying degrees of similarity to known genes. Interestingly, the number of differentially expressed genes identified with the CBGF was approximately three times greater than that identified with the named GeneFilter (GF211), emphasizing the potential advantage in using tissue-specific arrays for expression profiling analysis. The original hybridization spots for a subset of differentially expressed genes, as well as three nondifferentially expressed cyclophilin genes, demonstrated a broad range of hybridization intensities (Fig. 2B). Overall, the alterations in gene expression ranged from approximately −40- to 8-fold; the differential expression data for highly expressed genes were readily assessed visually using the pseudo-color overlay image generated by the Pathways software as shown in Fig. 2A. A partial list of genes that are differentially expressed in invasive and metastatic cells is shown in Table 1. The differentially expressed genes demonstrate a broad range of functional activity.
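The gene counts and the 2-fold criterion reported above can be tallied with a short Python sketch. The counts are taken directly from the text; the signed fold-change convention (a ratio below 1 expressed as a negative fold) is an assumption inferred from the reported range of −40- to 8-fold, and the function names are hypothetical.

```python
# Counts of differentially expressed genes (2-fold or greater) as reported in the text
normal_vs_invasive = {"GF211": 22, "CBGF": 68}
normal_vs_metastatic = {"GF211": 23, "CBGF": 89}

invasive_total = sum(normal_vs_invasive.values())
metastatic_total = sum(normal_vs_metastatic.values())
combined_total = invasive_total + metastatic_total

# Differentially expressed genes found with each filter across both comparisons
per_filter = {
    name: normal_vs_invasive[name] + normal_vs_metastatic[name]
    for name in normal_vs_invasive
}
cbgf_to_gf211 = per_filter["CBGF"] / per_filter["GF211"]


def signed_fold(ratio):
    """Express a positive expression ratio as a signed fold change (assumed convention)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1 / ratio


def is_differential(ratio, threshold=2.0):
    """True when the change is at least `threshold`-fold in either direction."""
    return abs(signed_fold(ratio)) >= threshold


print(invasive_total, metastatic_total, combined_total)  # 90 112 202
print(round(cbgf_to_gf211, 2))  # 3.49
```

The total of 202 is the sum of the two comparison lists (90 and 112); the text does not state whether the 29 genes common to both comparisons are counted once or twice in that figure. The CBGF-to-GF211 ratio of about 3.5 is in line with the "approximately three times" stated above.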
Although many of these genes have been implicated in various aspects of tumor biology, only a few have been demonstrated to be associated with breast cancer; these include apolipoprotein D (9), annexin I (10, 11), tissue factor (12, 13), RANTES (14, 15), and BRCA1 (16, 17). Overall, with the exception of BRCA1, our in vivo transcript data generated through the combined use of LCM and high-density cDNA array analysis are consistent with those reported in the literature. It is tempting to speculate on the potential role of many of the other differentially expressed genes. For example, 53BP2, which is down-regulated in both invasive and metastatic cells as compared with normal breast epithelium (Fig. 2B), has been demonstrated to bind bcl-2 and p53 and to impede cell cycle progression at G2-M (18). Additionally, the protein BAF60, a component of the SWI/SNF complex, is up-regulated in both invasive and metastatic cells (Fig. 2B). Interestingly, the SWI/SNF complex has been demonstrated to enhance nuclear receptor-mediated transcriptional activation, including that associated with the estrogen receptor and retinoic acid receptor (19). However, what role, if any, these genes or any other gene in Table 1 may play in the pathogenesis of breast cancer remains to be seen and will be the subject of further investigation.
Validation of Array Data with RTQ-PCR and Immunohistochemistry.
Interestingly, analysis of the gene expression profile data with the Pathways image software revealed that the pattern of increased and decreased apolipoprotein D expression in the invasive and metastatic cells was consistently observed for three different apolipoprotein D cDNA spots (one in GF211 and two in CBGF). To further investigate the reliability of our array data, we measured the expression levels of 5 of the 202 differentially expressed genes using the TaqMan 5′ nuclease fluorogenic quantitative PCR assay (RTQ-PCR). To obtain truly comparable results, the third fraction of the original total RNA (from the same batch that was used for the array hybridizations) was used as a template in the RTQ-PCR reactions. Fig. 3A demonstrates that the differential expression pattern and the quantitative expression level of each of the five genes as determined by RTQ-PCR were similar to those observed with cDNA arrays in 8 of 10 expression data points, confirming the reliability of our array expression profile data. Our observed correlation between the cDNA array and RTQ-PCR data is consistent with that observed by others (4, 23, 24).
As an additional means to confirm our data at the protein level, we performed immunohistochemical analysis of apolipoprotein D using tissue sections that were adjacent to those used for laser microdissection. Paralleling the differential expression pattern observed with the cDNA microarray and RTQ-PCR analysis, the invasive cells demonstrated abundant and strong immunoreactivity for apolipoprotein D, whereas the metastatic cells demonstrated rare and weak immunoreactivity (Fig. 3B). This result further supports the reliability of our expression data and demonstrates the cellular specificity of apolipoprotein D gene expression. Overall, the RTQ-PCR and immunohistochemistry results support the feasibility of our microarray experimental protocol as a means to assess in vivo transcript expression profiles.
Although two studies, one of which also included the use of LCM, have reported the use of cDNA arrays to study gene expression in tissues, our approach has several novel features.
(a) A single microarray profile in our study reflects gene expression that corresponds to a specific population of epithelial cells independent of contaminating stromal cells. By contrast, previous studies used genetic material derived from (nondissected) bulk tissue specimens that are composed of both malignant and normal cells (25). Therefore, each individual microarray profile from bulk tissue reflects gene expression that corresponds to malignant cells as well as to many different types of contaminating normal cells in the cancer specimen.
(b) We generated probes directly without amplification to avoid possible representational bias that may be associated with amplification schemes, whereas Luo et al. (23) used a T7-based RNA amplification method to generate probes for their microarrays.
(c) By analyzing breast cancer progression, which reflects genetic alterations over time, we performed both spatial and temporal in vivo expression profiling. The previously mentioned studies performed expression profile analysis on tissues that were spatially but not temporally distinct (23, 24).
Using carefully controlled conditions, we demonstrated that in vivo subpopulations of malignant cells from multiple stages of breast cancer progression can be simultaneously screened for thousands of genes. We now report the feasibility of combining LCM and high-throughput cDNA arrays to study in vivo gene expression profiling, and we illustrate through the use of duplicate hybridizations, RTQ-PCR analysis, and immunohistochemistry that this approach produces reproducible and valid data. We believe that this in vivo functional genomic approach not only provides an evolving opportunity to rapidly and directly monitor in vivo gene expression in human breast cancer but also promises to provide novel insights into fundamental cancer biology. Furthermore, the application of this approach to clinical cancer specimens may provide a key step to rapid advances in cancer prevention, detection, diagnosis, and therapeutics.
We thank D. Haber, D. Krizman, L. Smith, and I. Stamenkovic for helpful discussions and Bruce Kaynor for technical assistance. Special thanks to B. Smith for providing tissue samples.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 This work was supported by a grant from the Massachusetts Department of Public Health Breast Cancer Program, a collaborative grant from the Dana-Farber/Partners Cancer Care Women’s Cancer Program, and Grant IRG-173H from the American Cancer Society (to D. C. S.).
2 To whom requests for reprints should be addressed, at Microarrays Department, Research Genetics, Inc., 2700 Memorial Parkway, Huntsville, AL 35801. E-mail:
3 The abbreviations used are: LCM, laser capture microdissection; RTQ-PCR, real-time quantitative PCR; CBGF, Custom Breast GeneFilter; EST, expressed sequence tag; FAM, 6-carboxyfluorescein; TAMRA, 6-carboxytetramethylrhodamine.
- Received July 8, 1999.
- Accepted October 4, 1999.
- ©1999 American Association for Cancer Research.