The gene expression patterns of desmoplasia are becoming exposed through the application of global gene expression technologies such as cDNA microarrays or serial analysis of gene expression (SAGE). These patterns represent the sum of the many cellular components of the host stromal response to an infiltrating carcinoma. In studies of human neoplasms, it would be useful to identify those prototypical genes that characteristically indicate the recognizable forms of the responses to individual tumor types. Such genes may offer clues to better understand the process of invasion itself, the interactions between tumor and host cells, and tumor-specific differences in invasion. We used SAGE-defined genes and in situ transcript labeling to characterize the desmoplastic stroma induced by infiltrating ductal carcinomas of the breast. Principal component analysis identified 103 SAGE tags as specific for invasive breast carcinomas, in comparison with in situ duct carcinomas or normal breast epithelium. Of these, 68 tags corresponded to known genes. Six of the 68 genes from this breast cancer “invasion-specific” cluster were further characterized by in situ hybridization to breast cancer tissues. Results of in situ hybridization demonstrated that each gene was expressed within one of five distinct regions of the invasive tumors (neoplastic epithelium; angioendothelium; inflammatory, panstromal, and juxtatumoral stroma), reflecting a defined architectural structure to the transcriptome of invasive breast cancers. Two of these 6 genes were specifically expressed by the stromal cells within the invasive carcinoma; however, 1 (collagen 1α1) was expressed throughout the stromal response (panstromal expression), whereas the second (osteonectin) was specifically expressed within the juxtatumoral stromal cells, indicating a critical “regionality” of gene expression within the stromal response itself. 
A comparison of the gene expression profiles of the juxtatumoral stroma in breast and pancreatic carcinomas indicated important differences between the two, suggesting tumor-specific or organ-specific differences in the desmoplastic responses. Some of the genes presented are novel markers of the invasive process, imply communication at the host/tumor interface, and suggest potential therapeutic targets.
The ability of malignant neoplasms to invade adjacent normal tissues is fundamental to the neoplastic process. For many epithelial neoplasms, this process includes the ability to induce a desmoplastic response of the host tissues at the site of primary invasion. This host stromal response is a result of a complex interaction between the host and invading neoplasm, comprising fibroblasts, various inflammatory cells, proliferating vascular structures, as well as normal parenchymal cells undergoing atrophy at the invasive edge.
Recent investigations into the host desmoplastic response to infiltrating pancreatic adenocarcinoma have identified or refined an architectural organization of gene expression within this host response (1, 2). Ryu et al. (1) identified an invasion-specific cluster of genes when comparing SAGE³ libraries of primary carcinomas to those derived from passaged cancer cell lines. Many of the genes identified were found to be markers of the exuberant host stromal response present in infiltrating pancreatic cancers, representing distinct classes of genes with differing cellular functions. In situ hybridization, using 12 of these invasion-specific genes as probes, illustrated how gene expression patterns are partitioned into spatial compartments within the desmoplastic response to the tumor cells, including a distinct “juxtatumoral stroma,” a region of the host response thought to be important for tumor-host interactions (2). These genes, and their associated architectural compartments of gene expression, represent potential new targets for diagnostic screening or for therapeutic development.
We proposed that a similar study of the desmoplastic response to an infiltrating carcinoma might be useful in understanding the molecular biology of other tumor types such as infiltrating duct carcinomas of the breast, which also characteristically can produce a prominent host stromal response (3). We applied principal component analysis to a set of SAGE libraries generated from normal and neoplastic breast tissues and cell lines and characterized the genes thus identified by in situ hybridization: (a) we determined whether the gene expression characteristic of the desmoplastic response to breast cancer is similar to that found in response to other epithelial neoplasms; (b) we examined whether spatially defined regions of gene expression exist among the desmoplastic responses to breast cancers, with specific attention to the juxtatumoral stroma; and (c) we sought to determine whether specific genes potentially important in the desmoplastic response to one tumor type may play a similar role in other epithelial neoplasms as well.
MATERIALS AND METHODS
SAGE data of 11 breast-derived libraries were acquired from the CGAP database available in the NCBI SAGEmap database.⁴ The SAGE libraries of the two normal tissues, two ductal carcinomas in situ, and two invasive cancers and their matched lymph node metastases were prepared and sequenced as described in detail by Porter et al. (4). Breast cancer cell line libraries LacZ, MCF70 h, and MDA453 were included in the analyses to aid in determination of invasion-specific gene expression. A total of 88,178 unique SAGE tags, which were identified among 467,742 total tags sequenced from 11 breast SAGE libraries, were used for all subsequent analyses.
The Cluster and TreeView computer programs were obtained from the online resource⁵ and used for PCA and visualization of tree diagrams (5). SAGE data were filtered as follows. Tags were excluded when fewer than two samples contained at least 5 tags in the raw data and when the minimum and maximum values among all samples differed by <4 tags. This produced a dataset of 2,575 tags from an original 88,178 unique tags. The data were imported into the Cluster program and log-transformed, and PCA was performed. The names of genes and ESTs that matched the tag sequences were obtained using an online resource from NCBI.⁶
Paraffin-embedded tissues of four samples of infiltrating duct carcinoma of the breast were obtained from the files of The Johns Hopkins Hospital. For each case, one representative section was chosen that contained invasive carcinoma and normal duct and lobule structures on the same slide. Three of the four cases also contained high-grade DCIS within the same paraffin section. All four carcinomas were Elston grade II/III (6).
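The filtering, log transformation, and PCA steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cluster program's implementation: the tag names and counts are hypothetical, the two filters are assumed to act jointly (a tag is retained only if it passes both), a pseudocount of 1 is assumed to avoid taking the logarithm of zero, and the first principal component is obtained by simple power iteration rather than by the method Cluster uses.

```python
import math

def keep_tag(counts, min_libs=2, min_count=5, min_range=4):
    """Retain a tag only if >= min_libs libraries hold >= min_count tags
    and the max-min spread across libraries is >= min_range (assumed joint rule)."""
    libs_with_enough = sum(1 for c in counts if c >= min_count)
    return libs_with_enough >= min_libs and (max(counts) - min(counts)) >= min_range

def log_transform(counts, pseudocount=1.0):
    # Assumption: the pseudocount is illustrative; the text does not say how zeros were handled.
    return [math.log2(c + pseudocount) for c in counts]

def first_principal_component(rows, iterations=200):
    """Power iteration on the covariance matrix of the columns (libraries).
    Returns the unit-length loading vector and one score per row (tag)."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    centered = [[r[j] - means[j] for j in range(n_cols)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (len(rows) - 1)
            for j in range(n_cols)] for i in range(n_cols)]
    vec = [1.0 + 0.1 * j for j in range(n_cols)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_cols)) for i in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(n_cols)) for c in centered]
    return vec, scores

# Hypothetical counts; columns: normal, DCIS, invasive 1, invasive 2, cell line.
toy_counts = {
    "TAG_A": [0, 1, 12, 15, 0],
    "TAG_B": [3, 4, 3, 4, 3],    # never reaches 5 tags: excluded
    "TAG_C": [6, 7, 6, 8, 7],    # spread of only 2 tags: excluded
    "TAG_D": [20, 9, 1, 0, 2],
    "TAG_E": [1, 0, 9, 2, 0],    # only one library with >= 5 tags: excluded
    "TAG_F": [5, 6, 10, 11, 5],
}

kept = [tag for tag, counts in toy_counts.items() if keep_tag(counts)]
log_rows = [log_transform(toy_counts[tag]) for tag in kept]
loadings, scores = first_principal_component(log_rows)

# Fraction of tags retained in the study itself (numbers taken from the text).
retained_fraction = 2575 / 88178
```

In the study, the same filter reduced 88,178 unique tags to 2,575, roughly 3% of the original set.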
Nonradioactive in Situ Hybridization of Paraffin Sections.
To generate riboprobes for use in in situ hybridization of genes of interest, DNA templates were generated by PCR with incorporation of a T7 promoter into the antisense or sense primer (7). After phenol:chloroform purification of amplified DNA, 200 ng of the DNA templates were used to generate either antisense or sense riboprobes by in vitro transcription with digoxigenin labeling reagents and T7 polymerase according to the manufacturer’s protocol (Roche Diagnostics, Indianapolis, IN).
In situ hybridization of paraffin-embedded tissues was performed following methods modified from Kadkol et al. (8). Five-μm-thick sections were cut from the paraffin blocks, deparaffinized in xylene, and hydrated in graded concentrations of ethanol for 5 min each. Sections were incubated with 1% hydrogen peroxide, followed by digestion in 10 μg/ml of proteinase K at 37°C for 30 min. Sections were hybridized overnight at 15–25°C below the Tm calculated for each individual riboprobe with a 200 ng/ml dilution of either antisense or sense riboprobes in mRNA hybridization buffer (DAKO, Carpinteria, CA). The following day, sections were washed in 2× SSC (0.3 M sodium chloride and 0.03 M sodium citrate) and incubated with a 1:35 dilution of RNase A cocktail (Ambion, Austin, TX) in 2× SSC at 37°C. Next, sections were stringently washed in 2× SSC/50% formamide twice, followed by one wash in 0.08× SSC at 5–8°C below the calculated Tm. For signal amplification, a horseradish peroxidase-conjugated rabbit anti-digoxigenin antibody (DAKO) was used to catalyze the deposition of biotinyl-tyramide, followed by secondary streptavidin complex (GenPoint kit; DAKO). The final signal was developed with 3,3′-diaminobenzidine chromogen (GenPoint kit; DAKO), and the tissue was counterstained in hematoxylin for 15 s.
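The buffer strengths and temperature windows in this protocol reduce to simple arithmetic, sketched below. The SSC molarities follow from the stated composition of 2× SSC (0.3 M sodium chloride, 0.03 M sodium citrate). The probe Tm of 85°C is a hypothetical value chosen only for illustration, since the text does not report individual Tm values or the formula used to calculate them.

```python
# 1x SSC molarities, derived from the 2x SSC composition given in the text.
NACL_M_PER_X = 0.3 / 2
CITRATE_M_PER_X = 0.03 / 2

def ssc_molarity(strength):
    """Return (NaCl M, sodium citrate M) for an SSC buffer of the given strength."""
    return (round(strength * NACL_M_PER_X, 6), round(strength * CITRATE_M_PER_X, 6))

def window_below_tm(tm_c, min_offset, max_offset):
    """Return the (lowest, highest) temperature lying min_offset..max_offset degrees C below Tm."""
    return (tm_c - max_offset, tm_c - min_offset)

hypothetical_tm = 85.0  # assumption: illustrative riboprobe Tm, not a value from the study

stringent_wash = ssc_molarity(0.08)                        # final wash buffer
hybridization = window_below_tm(hypothetical_tm, 15, 25)   # 15-25 degrees C below Tm
final_wash = window_below_tm(hypothetical_tm, 5, 8)        # 5-8 degrees C below Tm

print("0.08x SSC (NaCl M, citrate M):", stringent_wash)
print("Hybridization window (C):", hybridization)
print("Final wash window (C):", final_wash)
```

For a probe with this assumed Tm, hybridization would fall between 60 and 70°C and the final wash between 77 and 80°C; the 0.08× SSC wash corresponds to 0.012 M sodium chloride and 0.0012 M sodium citrate.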
Histological Evaluation of Tissue Sections.
In situ hybridization labeling of mRNA expression in samples of paraffin-embedded breast carcinoma was evaluated by three of the authors (C. A. I-D., P. A., and S. E. K.) with agreement in all cases examined. For each case, the labeling pattern obtained following in situ hybridization was evaluated for the presence or absence of gene expression individually within the normal duct epithelium, duct carcinoma in situ (if present), and infiltrating duct carcinoma. Expression was also evaluated within the desmoplastic stroma of the neoplasm and the vasculature within the normal tissues and tumor mass. In those cases having positive expression noted within the tumor stroma, gene expression was scored as occurring within the entire stromal region of the tumor or in the stroma immediately adjacent to tumor epithelium (juxtatumoral stroma).
RESULTS AND DISCUSSION
One-way hierarchical clustering was used to examine the relationships among the 11 SAGE library samples based on their global gene expression profiles (Fig. 1). A dendrogram created by this analysis indicated that breast cancer tissues and their matched metastases were more similar to each other than to other breast cancer tissues or cell lines. Samples of normal breast epithelium also clustered together on a terminal branch of the dendrogram, and samples of DCIS (DCIS and DCIS2) were arranged between these two groups. The hierarchical clustering analysis also confirmed the expected dissimilarity of gene expression between the breast tissue libraries and breast cancer cell lines. Two of the three cancer cell lines clustered together on an independent arm of the dendrogram. However, one cell line, MCF70 h, clustered more closely to the tissue-derived breast samples than to the other cell lines analyzed. These findings are in agreement with similar reports (4, 9) but also formed the basis for additional analyses to identify invasion-specific gene expression in breast cancers.
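One-way hierarchical clustering of libraries can be illustrated with a minimal agglomerative sketch. The text does not state the similarity metric or linkage rule used, so average linkage on a distance of 1 minus the Pearson correlation is an assumption here, and the library names and expression values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_linkage(profiles):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    def dist(a, b):
        return 1.0 - pearson(profiles[a], profiles[b])

    active = [(name,) for name in profiles]  # each cluster is a tuple of library names
    merges = []
    while len(active) > 1:
        best = None
        for c1, c2 in combinations(active, 2):
            # Average pairwise distance between the members of the two clusters.
            d = sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        active.remove(c1)
        active.remove(c2)
        active.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges

# Hypothetical log-scale expression of six tags in five libraries.
toy_profiles = {
    "invasive_primary":   [5.0, 4.8, 1.0, 0.9, 2.0, 2.1],
    "matched_metastasis": [4.9, 5.0, 1.1, 1.0, 2.2, 2.0],
    "normal_1":           [1.0, 1.2, 5.0, 4.7, 2.0, 1.9],
    "normal_2":           [1.1, 1.0, 4.8, 5.0, 2.1, 2.0],
    "cell_line":          [2.0, 2.1, 2.0, 1.9, 5.0, 5.2],
}

merges = average_linkage(toy_profiles)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

With these toy values, the primary tumor pairs with its metastasis and the two normal samples pair with each other before any other merge, which mirrors the grouping described for the dendrogram.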
Principal Component Analysis.
PCA can provide a global overview of the relatedness of gene expression profiles among samples while better avoiding the deterministic and rather arbitrary nature of hierarchical clustering. We therefore used PCA to delineate the gene cluster that distinguished the invasive breast cancer specimens from all others in this SAGE library dataset (Fig. 2). A cluster of genes specific for, and highly expressed in, invasive breast cancer tissues (primary carcinomas and matched metastases) was identified. This cluster thus identified the “invasion-specific genes” of breast carcinomas, as defined by comparison with breast cancer cell lines or samples of DCIS. This gene cluster is not to be confused with tumor-specific genes (present in both invasive breast cancer tissues and cell lines) but instead includes the gene expression associated with the presence of the host stromal response present within samples of infiltrating duct carcinoma. These genes may represent stromal gene expression or the expression of certain genes within the neoplastic epithelium as a function of tumor-stromal interactions (10, 11).
A smaller yet distinct cluster of genes was also identified that corresponded to normal duct epithelium, similar to that described previously in a normal breast duct epithelium gene cluster (4). Breast cancer cell lines and DCIS samples were less well delineated from the other samples and did not show distinct gene clusters by PCA. However, because our goal was to investigate the host response to breast cancer, we directed our attention solely to the invasion-specific gene cluster.
Genes Characteristic of Invasive Breast Cancer.
Table 1 contains the identities of the SAGE tags and their frequency of appearance in the invasion-specific gene cluster of breast duct carcinoma. Among 103 tags, 68 matched to known transcripts, and 35 might include novel genes. A comparison of the genes identified in our invasion-specific cluster to that of Porter et al. (4) revealed several similarities, with 15 of the genes identified in their analysis also being identified by our methods (Table 1). However, several genes were identified in our cluster that were not reported by Porter et al. (e.g., thymosin β4, apolipoprotein E, laminin receptor 1, and IGFBP7), suggesting that contrasting primary tumors with breast cancer cell lines and analyzing by PCA may be more appropriate for the identification of invasion-specific gene expression.
Infiltrating carcinomas are often associated with a dense fibrous host stromal reaction to the neoplasm, known as desmoplasia. At the advancing edge of the infiltrating carcinoma, there is often entrapment of normal duct and lobule structures, as well as foci of residual duct carcinoma in situ present within the original site of neoplastic formation. Inflammatory cells may also represent a proportion of the cellularity of the mass and are usually found at the advancing edges of the neoplasm as it invades through normal structures. The genes identified within this invasion-specific cluster are therefore best categorized with an understanding of the variety of cell types that constitute the primary site of invasion within breast carcinomas.
Genes identified within this cluster reflected the presence of various components of the host stromal response, including extracellular matrix remodeling (e.g., collagen 1α1; Ref. 12), angiogenesis (e.g., IGFBP7 and osteonectin; Ref. 13), the immune response (e.g., immunoglobulin heavy chain γ3; Refs. 9, 14), increased proliferation (cdk inhibitor 3 and SMC4-like 1; Refs. 9, 14), or elevated transcriptional demands (ribosomal proteins; Ref. 15). Relatively few genes identified within this breast invasion-specific gene cluster, however, were also present within the invasion-specific gene cluster characteristic of pancreatic cancer (1). Genes that were found in both invasion-specific gene clusters included apolipoprotein C-1, osteonectin, and collagen 1α1, suggesting that some genes may play a universal role in the host stromal response to infiltrating cancer. However, most genes identified in the breast invasion-specific gene cluster were not identified in the invasion-specific cluster of the pancreas, and vice versa. It was thus possible that invasion-specific gene expression might relate to the primary organ in which the host stromal response occurs. Alternatively, because the identification of invasion-specific genes by PCA predominantly reflects quantitative changes in expression, the differences in desmoplastic gene expression between these two tumor types might primarily reflect the relatively more exuberant host stromal response to pancreas cancer, with a more cellular host response perhaps being represented in SAGE libraries from those tumors. These possibilities were addressed by in situ studies.
In situ Hybridization of Selected Invasion-specific Genes.
Because invasive breast carcinomas represent an aggregate of diverse cell types, the precise cellular origin of these transcripts cannot be determined without additional study. To define the cellular origin and patterns of expression of these genes associated with the host stromal response to breast cancer, 6 genes were selected for further study of their expression in invasive breast carcinoma tissues by in situ hybridization (Fig. 3). These gene expression markers were selected for their presumed role in the host stromal response, such as new vessel formation (IGFBP7 and osteonectin; Ref. 13), fibroblastic proliferation (collagen 1α1 and apolipoprotein C-1; Ref. 1), extracellular matrix remodeling (collagen 1α1 and laminin receptor 1; Refs. 1, 16), or the inflammatory response (fusin; Ref. 17).
In situ hybridization was performed for each of the 6 invasion-specific genes on four paraffin-embedded tissue samples obtained from mastectomy specimens removed for infiltrating duct carcinoma of the breast. Detectable expression of all 6 genes was observed in all four neoplasms. For each gene, detectable expression was found to localize to one or more of five distinct architectural regions, or “gene expression compartments,” of the invasive tumors: (a) neoplastic epithelium; (b) angioendothelium and/or vascular smooth muscle; (c) juxtatumoral stroma (i.e., only those stromal cells immediately adjacent to the invasive neoplastic epithelium); (d) panstromal tissue (i.e., all areas of stromal tissue of the invasive tumor); or (e) inflammatory cells within the invasive focus.
Four of the 6 genes were expressed within a single architectural compartment in the four samples of invasive cancer (Fig. 3). Expression of laminin receptor 1 was localized to neoplastic epithelium, with no additional expression noted in the surrounding stromal or angioendothelial compartments. In contrast, collagen 1α1 gene expression was observed throughout the stromal response (panstromal), whereas the neoplastic epithelial and angioendothelial compartments were negative for expression of this gene. Finally, the gene expression of fusin and apolipoprotein C-1 was observed within leukocytic (inflammatory) cells infiltrating within the invasive carcinomas. Fusin gene expression was predominantly within small lymphocytes, whereas apolipoprotein C-1 gene expression was within macrophages infiltrating the tumor or within necrotic debris associated with DCIS. Osteonectin and IGFBP7 were expressed within two architectural compartments in all four carcinomas studied. Osteonectin gene expression was observed within angioendothelial cells and the juxtatumoral stroma in all four cases. IGFBP7 was predominantly expressed within the angioendothelium, although two of four cases also showed weak labeling of tumor epithelium. Thus, although the identification of IGFBP7 within the invasion-specific cluster of breast cancer can largely be attributed to endothelial expression, our results are fully consistent with the published reports of IGFBP7 gene expression within breast tumor epithelium (18, 19). Three genes (osteonectin, collagen 1α1, and IGFBP7) were specifically expressed within the invasive tumor as compared with adjacent normal breast tissue and serve as markers of the desmoplastic response in infiltrating breast carcinomas (20, 21). Osteonectin is a phosphorylated, acidic, glycine-rich glycoprotein of Mr 43,000 with multiple Ca2+-binding domains.
The function of osteonectin is not precisely known, but it is thought to be involved in angiogenesis and remodeling of the extracellular matrix in keeping with its elevated expression in the host stromal response (22, 23). Collagen 1α1 expression by the stroma likely reflects the transcriptional activity of proliferating fibroblastic tissue within the host response. IGFBP7, which was strongly expressed by endothelial cells, has not been described as an endothelial-specific marker in human tumors, although other members of this gene family have been so implicated (13). For each invasive carcinoma analyzed, samples of normal breast terminal duct lobular unit epithelium were present within the same tissue section. Expression of three genes, fusin, apolipoprotein C-1, and laminin receptor 1, was also noted in tissues of the normal breast. Fusin expression was noted within small lymphocytes within the intralobular stroma of normal lobules, and apolipoprotein C-1 was expressed in macrophages present in benign ducts. Laminin receptor 1, although most strongly expressed by the neoplastic epithelium, was also weakly expressed in atrophic ducts, as well as in areas of DCIS both within and outside the mass.
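The compartment assignments reported for the 6 genes can be collected into a small lookup table, which makes the counts in the text easy to check. The dictionary below is a restatement of the in situ results described above, not additional data; the variable names are illustrative, and the weak epithelial labeling of IGFBP7 (seen in two of four cases) is recorded as a second compartment.

```python
# Gene -> architectural compartments with detectable expression in invasive tumors,
# as described in the text.
COMPARTMENTS = {
    "laminin receptor 1": {"neoplastic epithelium"},
    "collagen 1α1": {"panstromal"},
    "fusin": {"inflammatory"},
    "apolipoprotein C-1": {"inflammatory"},
    "osteonectin": {"angioendothelium", "juxtatumoral stroma"},
    # Epithelial labeling of IGFBP7 was weak and present in two of four cases.
    "IGFBP7": {"angioendothelium", "neoplastic epithelium"},
}

# Invert the table: compartment -> genes expressed there.
by_compartment = {}
for gene, compartments in COMPARTMENTS.items():
    for compartment in compartments:
        by_compartment.setdefault(compartment, []).append(gene)

# Genes confined to a single compartment.
single_compartment = sorted(g for g, c in COMPARTMENTS.items() if len(c) == 1)

for compartment in sorted(by_compartment):
    print(compartment, "->", sorted(by_compartment[compartment]))
print("single-compartment genes:", single_compartment)
```

Inverting the table recovers the five compartments, with osteonectin as the only one of the 6 genes assigned to the juxtatumoral stroma.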
Invasion-specific genes were thus spatially localized to distinct compartments of gene expression in the host stromal response to breast cancer. This extends our prior observations in the pancreas and supports the existence of a highly structured organization of gene expression within the host desmoplastic response to infiltrating carcinoma. Specifically, the finding of osteonectin gene expression localized to the juxtatumoral stroma validates this newly defined architectural region of the host stromal transcriptional response. Expression of osteonectin by the juxtatumoral stroma may thus be intimately involved with the invasive process and highlights this region as a potential site of tumor-host interactions to be targeted for therapeutic intervention.
Comparison of Juxtatumoral Gene Expression in Breast and Pancreas Carcinomas.
We have noted previously that apolipoprotein C-1, apolipoprotein D, and MMP11 are each gene expression markers of the juxtatumoral stromal compartment in adenocarcinomas of the pancreas (2). In an effort to better discern the gene expression patterns of this distinct region of the host stromal response, we performed in situ hybridization of each of these 3 genes in the four samples of invasive breast carcinoma, with comparison to the gene expression patterns seen previously in samples of paraffin-embedded pancreas cancers for these genes (Fig. 4).
Gene expression of the juxtatumoral stroma was found to differ between these two tumor types. As noted previously, apolipoprotein C-1 was one of only 3 genes common to the invasion-specific clusters of both the breast and pancreas. Surprisingly, although this gene was expressed in both tumor types, its cellular distribution within breast or pancreas cancers was dissimilar. Apolipoprotein C-1 gene expression localized to tumor-infiltrating macrophages in the four samples of breast carcinoma (Fig. 4A) but was clearly expressed by stromal fibroblasts within the juxtatumoral stroma in the four samples of pancreatic carcinoma (Fig. 4B). Apolipoprotein D expression also differed between breast and pancreas cancers. Apolipoprotein D gene expression was localized to tumor epithelium in the four breast carcinomas (Fig. 4C), in contrast to the juxtatumoral stromal pattern seen in all four pancreas cancer tumor tissues (Fig. 4D). Only MMP11, of the 3 invasion-specific genes studied, was found to localize to the juxtatumoral stroma in all four breast cancers and in all four pancreatic cancers studied (Fig. 4, E and F). Thus, although the juxtatumoral stroma appears to be a defined component of the host stromal response, the gene expression profile of this compartment must depend in part upon the site of tumor origin.
Surprisingly, although these genes are associated with the process of tissue invasion in both breast and pancreas cancers, their role in tissue invasion appears to differ between these two tumor types. These observations in turn raise several other questions regarding the desmoplastic response to human tumors, e.g., is the gene expression of the host stromal response to primary tumors similar to or different from the stromal response present in metastatic tumors? Do histologically different tumors that are derived from the same organ type produce similar or different host stromal responses? Our current data indicate that, with respect to the desmoplastic response, the robust patterns of gene expression in one tumor type are different from the robust patterns seen in other tumor types. Clearly, additional work needs to be done to determine how static or variable these gene expression patterns are.
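The breast versus pancreas comparison for these 3 genes can be summarized as a simple concordance check. The table restates the localizations given above; the variable names are illustrative only.

```python
# Gene -> cellular localization by in situ hybridization in each tumor type,
# as described in the text.
LOCALIZATION = {
    "apolipoprotein C-1": {"breast": "tumor-infiltrating macrophages",
                           "pancreas": "juxtatumoral stroma"},
    "apolipoprotein D": {"breast": "tumor epithelium",
                         "pancreas": "juxtatumoral stroma"},
    "MMP11": {"breast": "juxtatumoral stroma",
              "pancreas": "juxtatumoral stroma"},
}

# Genes whose localization is the same in both tumor types.
concordant = [gene for gene, loc in LOCALIZATION.items()
              if loc["breast"] == loc["pancreas"]]
discordant = [gene for gene in LOCALIZATION if gene not in concordant]

print("concordant:", concordant)
print("discordant:", discordant)
```

Only MMP11 is concordant between the two tumor types, which is the observation underlying the conclusion that the juxtatumoral profile depends in part on the site of tumor origin.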
In summary, the patterns of spatially organized compartments of gene expression in the host response to breast cancer and the comparisons among various cancer types provide new insights into the biology of desmoplasia. These similarities in the host stromal response to different tumor types may suggest some universal targets for therapeutic intervention. Additional studies to understand the desmoplastic response to invasive neoplasms may aid in identifying new targets for clinical imaging, serological diagnosis, drug development, and delivery.
We thank Dr. Sandra Rempel for the generous gift of the osteonectin cDNA used for preparation of the osteonectin-specific riboprobes for in situ hybridization.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¹ Supported by the NIH Specialized Programs of Research Excellence in Gastrointestinal Cancer Grant CA 62924 (to S. E. K.) and the NIH Specialized Programs of Research Excellence in Breast Cancer Grant CA88843.
² To whom requests for reprints should be addressed, at Department of Oncology, Room 461, Cancer Research Building, 1650 Orleans Street, The Johns Hopkins University School of Medicine, Baltimore, MD 21231. Phone: (410) 614-3316; Fax: (410) 614-9705; E-mail:
³ The abbreviations used are: SAGE, serial analysis of gene expression; NCBI, National Center for Biotechnology Information; PCA, principal component analysis; EST, expressed sequence tag; DCIS, ductal carcinoma in situ; IGFBP, insulin-like growth factor binding protein.
⁴ Internet address: http://www.ncbi.nlm.nih.gov/SAGE/.
⁵ Internet address: http://www.microarrays.org/software.html.
⁶ Internet address: http://www.ncbi.nlm.nih.gov/SAGE/SAGEtag.cgi.
- Received April 17, 2002.
- Accepted July 19, 2002.
- ©2002 American Association for Cancer Research.