Abstract
A better understanding of the molecular circuitry in normal ovarian tissues and in ovarian cancer will likely provide new targets for diagnosis and therapy. Recently, much has been learned about the genes expressed in ovarian cancer through studies with cDNA arrays and serial analysis of gene expression. However, these methods do not allow highly quantitative analysis of gene expression on a large number of specimens. Here, we have used quantitative real-time RT-PCR in a panel of 39 microdissected ovarian carcinomas of various subtypes to systematically analyze the expression of 13 genes, many of which were previously identified as up-regulated in a subset of ovarian cancers by serial analysis of gene expression. The genes analyzed are glutathione peroxidase 3 (GPX3), apolipoprotein J/clusterin, insulin-like growth factor-binding protein 2, epithelial cell adhesion molecule/GA733-2, Kop protease inhibitor, matrix gla protein, tissue inhibitor of metalloproteinase 3, folate receptor 1, S100A2, signal transducer and activator of transcription 1, secretory leukocyte protease inhibitor, apolipoprotein E, and ceruloplasmin. All of the genes were found overexpressed, some at extremely high levels, in the vast majority of ovarian carcinomas irrespective of the subtype. Interestingly, GPX3 was found at much higher levels in tumors with clear cell histology and may represent a biomarker for this subtype. Some of the genes studied here may thus represent targets for early detection of ovarian cancer. The gene expression patterns were not associated with age at diagnosis, stage, or K-ras mutation status in ovarian cancer. We find that several genes are coordinately regulated in ovarian cancer, likely reflecting the fact that many genes are activated as part of common signaling pathways or that extensive cross-talk exists between several pathways in ovarian cancer.
A statistical analysis shows that genes commonly up-regulated in ovarian cancer may result from the aberrant activation of a limited number of pathways, providing promising targets for novel therapeutic strategies.
Introduction
Epithelial ovarian cancer (EOC) is the most lethal of gynecological malignancies, yet the molecular pathways involved in its pathogenesis are poorly understood. BRCA1 and BRCA2 have been implicated in familial cases but the significance of these genes in sporadic ovarian cancer is unclear (1, 2, 3). In fact, few genes have been found mutated consistently in sporadic ovarian cancer. Mutations in p53 and K-ras have been reported at various frequencies depending on the cohort and the subtype (reviewed in Ref. 4). Attempts to implicate known oncogenes or tumor suppressor genes in EOC have been mostly unsuccessful (5, 6, 7), which suggests that ovarian tumorigenesis may occur through novel or poorly characterized pathways. Recently, ARHI, a novel gene found by differential display PCR, was shown to frequently exhibit bi-allelic inactivation, making this gene an excellent tumor suppressor candidate in EOC (8, 9). Amplification of AKT2 (10, 11), ErbB2 (12, 13), and Myc (14) has been reported to occur relatively frequently in ovarian cancer, but, again, these events do not explain the majority of EOCs. Taken together, these results suggest that EOC tumorigenesis may occur through pathways favoring gross chromosomal aberrations and possibly extensive alterations in gene expression. Accordingly, using subtractive or candidate gene approaches, many genes have been reported as differentially expressed in EOC, such as LOT1 (15), CyclinD1 (16), FR (17), DOC-1 and DOC-2 (18), and mesothelin (19). The functional significance of these genes in EOC is unclear.
Recent advances in the field of functional genomics have made it possible to study gene expression in EOC on a large scale. cDNA array technology has been used advantageously in the identification of numerous genes differentially expressed in EOC (20, 21, 22). From these studies, many genes have emerged as promising biomarker candidates, including HE4, a secreted protease inhibitor. With a specialized array, many angiogenesis genes were found to be differentially regulated in ovarian cancer (23). In addition, we have used SAGE to identify genes differentially expressed in EOC (24). Interestingly, several of the most up-regulated genes encode surface or secreted proteins, such as Kop, SLPI, claudin-3, and claudin-4, making these products attractive candidate biomarkers.
Although advancing our knowledge of genes expressed in EOC and generating a myriad of candidate biomarkers, functional genomics approaches have done little to improve our understanding of the molecular pathways involved in EOC. In addition, quantitative large-scale analysis of gene expression in primary tumors is technically challenging because of the requirement for relatively high amounts of intact RNA. In this report, we have chosen 13 highly relevant genes for quantitative analysis of expression in a panel of 39 microdissected ovarian cancers. Importantly, the genes under study were not chosen based on their importance in other cancer types but rather based on their relevance to EOC. We report the finding of many genes that are highly up-regulated in the vast majority of ovarian carcinomas. In addition, we show that the genes overexpressed in serous ovarian cancer are also overexpressed in other subtypes, making the genes studied here candidates as general EOC biomarkers. Finally, we find that many genes are up-regulated coordinately in ovarian cancer, which suggests the existence of a few dominant molecular pathways, whose abnormal regulation is responsible for the overexpression of many genes.
Materials and Methods
Cell Lines and Tissue Samples.
The SV40-immortalized ovarian surface epithelial cell line IOSE29 (25) was a gift from Dr. Nelly Auersperg (University of British Columbia, Vancouver, British Columbia, Canada), and the SV40-immortalized cystadenoma line ML3 (26) was kindly provided by Dr. Louis Dubeau (University of Southern California, Los Angeles, CA). IOSE29 was cultivated in Medium 199 (Life Technologies, Inc., Gaithersburg, MD) supplemented with 5% newborn calf serum. ML3 was cultivated in MEM (Life Technologies, Inc.) supplemented with 10% fetal bovine serum and antibiotics as above. HOSE was a short-term culture of human ovarian surface epithelial cells grown in RPMI 1640 as described previously (24). Two high-grade serous ovarian cancer specimens, OVT6 and OVT8, composed of at least 80% epithelial tumor cells, as determined by histopathology, were chosen previously for a SAGE study (24) and were used here for RT-PCR. These two ovarian tumors were frozen immediately after surgical removal from the patient and were obtained from the Johns Hopkins Hospital gynecological tumor bank. Thirty-nine snap-frozen ovarian carcinomas of various histological types were obtained from the University of Michigan Department of Pathology. These tumors were manually microdissected using H&E-stained frozen sections as dissection guides. Tumor cell-enriched specimens were composed of at least 70% tumor cells. The characteristics of these specimens are indicated in Table 1.
Table 1. Characteristics of microdissected tumor specimens
Real-Time RT-PCR.
One μg of total RNA from each microdissected sample was used to generate cDNA using the Taqman reverse transcription reagents (PE Applied Biosystems, Foster City, CA). Similarly, cDNA was prepared using RNA from OVT6, OVT8, ML3, HOSE, and IOSE29 and was included for comparison. The SYBR Green I assay and the GeneAmp 5700 Sequence Detection system (PE Applied Biosystems) were used for detecting real-time PCR products from 2 μl of the reverse-transcribed RNA samples (from 200-μl total volume). Primers for 13 candidate genes and GAPDH as control were designed to cross intron-exon boundaries to distinguish PCR products generated from genomic versus cDNA template. The primer pairs used, as well as the product sizes expected for each gene under study, are available from the authors on request (footnote 1). Each PCR reaction was optimized to ensure that a single band of the appropriate length (66–226 bp) was amplified and that no bands corresponding to genomic DNA amplification or primer-dimer pairs were present. The PCR cycling conditions for all of the samples were as follows: 2 min at 50°C for AmpErase UNG incubation; 10 min at 95°C for AmpliTaq Gold activation; and 40 cycles of melting (95°C for 15 s) and annealing/extension (60°C for 1 min). PCR reactions for each template were done in duplicate in one 96-well plate per gene-specific primer pair tested, except for the GAPDH control, which was done in quadruplicate in two 96-well plates. All of the experiments were optimized such that the threshold cycle (CT) values from duplicate reactions did not span more than one cycle number.
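The template amount and the duplicate criterion described above can be illustrated with a minimal sketch. The function names are hypothetical, the CT values are invented for illustration, and the calculation assumes that the reverse-transcribed material is evenly distributed in the 200-μl volume.

```python
def rna_equivalent_ng(total_rna_ug, rt_volume_ul, template_ul):
    """RNA equivalent (in ng) carried into one PCR reaction.

    Assumes the cDNA is evenly distributed in the reverse-transcription volume.
    """
    return total_rna_ug * 1000.0 * template_ul / rt_volume_ul


def replicates_consistent(ct_values, max_span=1.0):
    """True when replicate CT values span no more than `max_span` cycles."""
    return max(ct_values) - min(ct_values) <= max_span


# 1 ug of RNA in 200 ul, 2 ul used per reaction
print(rna_equivalent_ng(1.0, 200.0, 2.0))     # 10.0 (ng of RNA equivalent)

# Invented duplicate CT values, for illustration only
print(replicates_consistent([21.3, 21.9]))    # True
print(replicates_consistent([21.3, 22.8]))    # False
```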
The comparative CT method (PE Applied Biosystems) was used to determine relative quantitation of gene expression for each gene compared with the GAPDH control. First, the CT values from the GAPDH reactions were averaged for each sample. Next, the relative difference between GAPDH and each duplicate was calculated as $2^{C_{T,\mathrm{GAPDH}} - C_{T,\mathrm{experimental}}}$. This value was then averaged for each duplicate set and divided by the value for HOSE, a short-term culture of ovarian surface epithelial cells, to determine the fold induction for each sample relative to these cells.
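The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the CT values are invented, and the formula assumes an amplification efficiency of 2 per cycle, as implied by the base of the exponent.

```python
from statistics import mean


def relative_expression(gapdh_cts, gene_cts):
    """Mean of 2^(CT_GAPDH - CT_gene) over the replicate reactions of one sample."""
    gapdh_mean = mean(gapdh_cts)  # average the GAPDH replicates first
    return mean([2.0 ** (gapdh_mean - ct) for ct in gene_cts])


def fold_vs_reference(sample, reference):
    """Fold induction of a sample relative to a reference sample (here HOSE).

    Each argument is a (gapdh_cts, gene_cts) pair.
    """
    return relative_expression(*sample) / relative_expression(*reference)


# Invented CT values, for illustration only
tumor = ([18.0, 18.0, 18.0, 18.0], [20.0, 20.0])
hose = ([18.0, 18.0, 18.0, 18.0], [25.0, 25.0])
print(fold_vs_reference(tumor, hose))  # 32.0
```

In this example the gene crosses the threshold five cycles earlier in the tumor than in HOSE at equal GAPDH levels, which corresponds to a 2^5 = 32-fold induction.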
Sequencing of K-ras.
Mutations in the K-ras gene were identified by sequencing a 233-bp fragment encompassing codons 12, 13, and 61, which are frequently mutated in cancer. The fragment was obtained by PCR amplification of ∼5 ng of cDNA used for the real-time RT-PCR analyses. The PCR reactions were performed for all of the samples as follows: 35 cycles at 95°C for 30 s; 58°C for 1 min; and 70°C for 1 min, followed by a 5-min extension step at 72°C. The PCR amplification primers (forward, 5′-CCAGGTGCGGGAGAGAG-3′; reverse, 5′-CCCTCATTGCACTGTACTCC-3′) were also used for sequencing the gel-purified fragments.
Multivariate Analyses.
Analyses were performed using S-PLUS 2000 for PC (Mathsoft 2000, Seattle, WA). We computed a matrix of Pearson product moment correlations to measure the strength of bivariate associations between pairs of genes. We used F-ratios to assess the likelihood that each bivariate correlation was different from zero. We considered F-ratios with P < 0.001 significantly different from zero because this probability level minimized the chance of declaring spurious correlations. We hypothesized that genes in the same pathway would correlate more highly among themselves than with other genes. If this were so, we further hypothesized that, if we found more than one cluster of associated genes, then this would imply the presence of multiple pathways with some genes in the same pathway and other genes in different pathways. Additionally, genes contributing to more than one cluster of associations might reflect connections between pathways.
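The pairwise analysis described above can be sketched as follows. This is not the S-PLUS procedure used in the study; the function names are hypothetical and the input values are invented. The sketch computes the Pearson coefficient r for a pair of genes and the corresponding F-ratio for the null hypothesis r = 0, which is F = r²(n − 2)/(1 − r²) with 1 and n − 2 degrees of freedom. Converting F to a P value requires the F distribution, which is not part of the Python standard library and is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def f_ratio(r, n):
    """F statistic (1 and n - 2 degrees of freedom) for testing r = 0."""
    if r * r >= 1.0:
        return math.inf  # perfect correlation
    return r * r * (n - 2) / (1.0 - r * r)


def correlation_matrix(expression):
    """Map each unordered gene pair to its Pearson r.

    `expression` maps a gene name to its per-sample expression values.
    """
    genes = list(expression)
    return {
        (g, h): pearson_r(expression[g], expression[h])
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


# Invented values for two hypothetical genes across four samples
example = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [1.0, 3.0, 2.0, 4.0]}
for pair, r in correlation_matrix(example).items():
    print(pair, round(r, 3), round(f_ratio(r, 4), 3))
```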
We used maximum-likelihood factor analysis (27) with Promax rotation to examine the matrix of correlations for clusters of associated genes. This technique provides a goodness-of-fit test to determine the number of “factors” that are regarded as hypothetical constructs with which the measures (gene expression) are related (28). Factor analysis is a technique that reduces the relationships among many measures to a smaller number of (hypothetical) constructs. These constructs are linear combinations of the measured variables, but are considered representations of underlying or common factors. The goodness-of-fit test expresses the extent to which the derived linear combinations reproduce the measured relationships. The appropriate number of factors is reached when a smaller number of derived factors adequately reproduces the relationships among a larger number of measured variables.
The factors are interpreted by examining the weights from the linear combinations of measured variables. We interpret variables with large weights or “loadings” on the same factor as sharing the same construct. Generally, the matrix of factor loadings is transformed to “simple structure,” the goal of which is to produce large associations between each measure and one (and only one) factor. We performed Promax rotation, a transformational technique that allows correlated factors if, indeed, the data do not support orthogonal factors (29). In the present analyses, correlated factors might arise when some genes are involved in more than one pathway. A factor loading of 0.4 or greater is typically accepted as significant.
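The interpretation step can be illustrated with a short sketch that applies the 0.4 cutoff to a matrix of rotated loadings. The factor extraction and Promax rotation themselves are not reproduced here. The function name, the gene labels, and the loading values are all hypothetical and do not correspond to the results of this study.

```python
def assign_to_factors(loadings, threshold=0.4):
    """Group variables by the factors on which they load at or above `threshold`.

    `loadings` maps a variable name to its list of loadings, one per factor.
    A variable may appear under more than one factor.
    """
    factors = {}
    for name, row in loadings.items():
        for index, weight in enumerate(row, start=1):
            if abs(weight) >= threshold:
                factors.setdefault(index, []).append(name)
    return factors


# Invented loadings on two factors, for illustration only
hypothetical_loadings = {
    "geneA": [0.80, 0.10],
    "geneB": [0.50, 0.45],   # loads on both factors
    "geneC": [0.20, 0.05],   # loads on neither factor
}
print(assign_to_factors(hypothetical_loadings))
# {1: ['geneA', 'geneB'], 2: ['geneB']}
```

A variable such as the hypothetical geneB, which passes the cutoff on two factors, is the kind of case that the text interprets as a possible connection between pathways.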
Results and Discussion
Real-Time RT-PCR Analysis.
To validate and extend our previous findings of genes differentially expressed in ovarian cancer (24) , we chose real-time RT-PCR, a highly sensitive and reproducible technique (30) . Because real-time RT-PCR does not require large amounts of starting RNA, we chose to analyze a panel of 39 microdissected primary EOCs of various histological subtypes. The use of microdissected specimens ensured a relatively pure population of tumor cells. We reasoned that this approach would allow an accurate determination of the frequency and extent of overexpression of many genes relevant to EOC. GPX3, clusterin/ApoJ, EpCAM/GA733-2, Kop, SLPI, ApoE, and ceruloplasmin all represented genes shown by SAGE to be consistently and highly up-regulated in a subset of ovarian cancer (24) . For this analysis, we also chose to include a number of genes that had previously been implicated in various human cancers and also exhibited up-regulation by SAGE but did not qualify as genuine up-regulated genes according to our strict set of criteria and, therefore, had not been reported as up-regulated (24) . These genes are TIMP-3, IGFBP-2, FR1, MGP, S100A2, and STAT1 (17 , 31, 32, 33, 34, 35, 36) .
The genes analyzed generally exhibited various levels of up-regulation in the majority of the microdissected ovarian tumors studied (Fig. 1). Fold up-regulation compared with HOSE was highly variable. For example, ceruloplasmin was found at 10,000-fold above the levels in HOSE, whereas TIMP-3 was typically found elevated 2- to 3-fold. Interestingly, the induction levels as determined by RT-PCR were typically much higher than the figures obtained with SAGE. This may be attributable to two factors. First, the noncancerous components of the bulk tissue may have diluted the true fold induction. However, expression levels in RNA from microdissected specimens and SAGE tumors OVT6 and OVT8 were typically very similar (Fig. 1), which suggests that the dilution factor may not be a major contributor to the observed differences. Second, some of the genes that were found expressed at very low levels in normal tissues did not show a statistically significant number of tags in the SAGE analysis, which would bias the fold induction calculated using SAGE data. For example, the tag corresponding to ceruloplasmin was found 79 times in the tumors but was not found in any of the three normal samples (24). Although this led to a calculated 79-fold increase, it is clear that the real difference could be much higher. Indeed, using real-time RT-PCR, we found that ceruloplasmin was up-regulated an average of 10,000-fold in serous samples and even higher in clear cell specimens (60,000-fold; Fig. 1). FR1 also exhibited up-regulation levels in the thousands of fold. In any event, real-time RT-PCR validated the use of bulk tissue as starting material for SAGE analysis. Importantly, the quantitative RT-PCR analysis allowed us to examine expression of genes identified by SAGE in various ovarian cancer subtypes in a highly quantitative manner. All of the genes overexpressed in serous ovarian cancers were also overexpressed at various levels in the other ovarian cancer subtypes.
Intriguingly, GPX3 appeared to be overexpressed at even higher levels in clear cell ovarian carcinomas. Indeed, GPX3 was found at levels 30-fold higher on average in clear cell cancer compared with the other ovarian cancer subtypes (Fig. 1). The high level of expression of GPX3 in clear cell ovarian cancer is particularly obvious when values are plotted on a linear scale (Fig. 1). Ceruloplasmin was also found at very high levels in at least two clear cell ovarian cancers. It is intriguing that GPX3 and ceruloplasmin, two genes implicated in oxidative stress response, are highly overexpressed in clear cell carcinomas, an EOC subtype notorious for its aggressiveness and poor prognosis. High levels of antioxidants likely result from high amounts of reactive oxygen species, which have been implicated in mitogenic signaling (37) and angiogenesis (38). In addition, high levels of antioxidant proteins may make cells more resistant to chemotherapy (39), which may explain the poor response of clear cell ovarian cancer to treatment. These findings suggest that antioxidant inhibitors in combination with chemotherapy may improve the response of clear cell ovarian cancers. In any event, GPX3 may represent the first molecular marker that is highly specific for clear cell carcinomas.
Fig. 1. Real-time RT-PCR analysis of selected genes in a panel of microdissected ovarian tumors. The X-axis is a log scale and represents the fold induction relative to HOSE expression. The Y-axis represents each sample tested. The first four sets of bars correspond to the microdissected specimen subtypes: serous; clear cell (▪); endometrioid (checkered bars); and mucinous (□). The last five bars correspond to samples used for SAGE analysis: the OVT6 and OVT8 tumors and the controls (HOSE, IOSE29, and ML3). Unless otherwise indicated, the graphs are plotted on a logarithmic scale.
Considering the high level of morphological heterogeneity in EOC, we did not expect to find genes elevated in all of the tumors observed. However, most of the genes showed extraordinary consistency between the different samples (Table 2). For example, all of the tumors exhibited at least a 10-fold overexpression of EpCAM/GA733-2, MGP, and ApoE. Except for IGFBP-2 and TIMP-3, all of the genes studied exhibited at least a 5-fold overexpression in 75% or more of the tumors. Interestingly, these genes were removed from our final list of highly up-regulated transcripts in ovarian cancer because of a lack of consistency between the three tumor samples examined by SAGE (24). When subdivided according to subtype, interesting patterns emerged (Table 2). For example, IGFBP-2 was never found elevated more than 10-fold in clear cell carcinomas, and S100A2 was up-regulated at least 10-fold in all of the mucinous specimens but in only about 65% of the other subtypes.
Table 2. Percentage of microdissected tumor specimens with 5-, 10-, and 100-fold gene expression differences
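The kind of summary reported in Table 2 can be sketched as follows: for one gene, the percentage of specimens whose fold induction relative to HOSE reaches each cutoff. The function name and the fold values are hypothetical and are not data from this study. "At least" is taken to mean greater than or equal to the cutoff.

```python
def percent_at_or_above(fold_values, thresholds=(5, 10, 100)):
    """Percentage of specimens with fold induction >= each threshold."""
    n = len(fold_values)
    return {
        t: 100.0 * sum(1 for v in fold_values if v >= t) / n
        for t in thresholds
    }


# Invented fold-induction values for one gene in four specimens
print(percent_at_or_above([3.0, 7.0, 12.0, 150.0]))
# {5: 75.0, 10: 50.0, 100: 25.0}
```

The same function can be applied per subtype by passing only the fold values of the specimens of that subtype.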
This report represents, to our knowledge, the first highly quantitative analysis of gene expression in ovarian cancers. Importantly, the genes analyzed were chosen on the basis of a large-scale study of gene expression and likely represent highly relevant targets in EOC. In addition, S100A2, STAT1, and MGP are shown here for the first time to be elevated in ovarian cancer. The genes analyzed here may represent, individually or in various combinations, novel markers for EOC diagnosis and treatment.
Association between the Expression of Various Genes.
We wondered whether overexpression of the 13 genes examined here was the result of the malfunction of numerous signaling pathways or of a restricted number of pathways. It has been shown that expression of genes that are part of a common pathway tends to be coordinately regulated and that this behavior is apparent if a sufficient number of tumors are examined (40, 41). On the other hand, an absence of coordinate regulation would result in random levels of overexpression for the different genes in a given tumor. On multivariate analysis of the 44 samples, we found that several gene expression patterns showed significant association (P < 0.001; Fig. 2A). For example, whereas the expression patterns of clusterin/ApoJ and IGFBP-2 clearly did not exhibit any association (Fig. 2B), the expression patterns of STAT1 and Kop were very similar (Fig. 2C). Overall, we found 26 pairs of genes with a significant (P < 0.001) correlation coefficient. This result might be surprising considering that genes overexpressed in tumors are often assumed to be the result of the malfunction of a large number of pathways interacting in complex manners. However, it is important to remember that these genes were chosen because of high level of expression and consistency of up-regulation. Thus, there appear to be few pathways that meet these criteria, but these pathways may be highly relevant to ovarian oncogenesis.
Fig. 2. Many genes are coordinately regulated in EOC. The strength of bivariate associations between pairs of genes was calculated using Pearson correlation coefficients. A, ▪, genes the expression levels of which demonstrate significant association (P < 0.001). Two representative graphs showing either (B) no correlation between expression patterns of the two genes clusterin (○) and IGFBP-2 (□) or (C) significant association between the two genes Kop (○) and STAT1 (□). The Y-axis represents gene expression levels (fold up-regulation compared with HOSE), and the points of the X-axis denote each microdissected ovarian tumor.
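To put the 26 significant pairs in context, the following sketch counts the possible gene pairs among 13 genes and the number of pairs expected to pass P < 0.001 by chance if no pair were truly associated. The function name is hypothetical, and the chance expectation is a back-of-the-envelope figure under that global null hypothesis rather than an analysis performed in the study.

```python
import math


def pair_summary(n_genes, n_significant, alpha):
    """Number of gene pairs, chance expectation at `alpha`, and observed fraction."""
    n_pairs = math.comb(n_genes, 2)          # unordered pairs of distinct genes
    expected_by_chance = n_pairs * alpha     # expectation if no pair is associated
    return n_pairs, expected_by_chance, n_significant / n_pairs


n_pairs, expected, fraction = pair_summary(13, 26, 0.001)
print(n_pairs)               # 78
print(round(expected, 3))    # 0.078
print(round(fraction, 3))    # 0.333
```

Under these assumptions, one third of all possible pairs are significantly correlated, whereas fewer than one pair would be expected by chance at this threshold.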
STAT1 expression was correlated with the expression of seven other genes, EpCAM/GA733-2, Kop, TIMP-3, FR1, SLPI, ApoE, and ceruloplasmin. STAT1 was recently found to be expressed as part of an IFN-regulated gene cluster in breast cancer (41) , but it is unclear whether the other genes coordinately regulated with STAT1 in our study are also part of this pathway. Moreover, STAT3 has been reported constitutively activated in ovarian cancer (42) . It is, thus, possible that activation of STATs is a major event in ovarian tumorigenesis. Similarly, TIMP-3 overexpression was associated with expression of six genes, STAT1, EpCAM/GA733-2, ApoE, SLPI, Kop, and FR1. Again, this suggests that these genes are all targets of a common signaling pathway and/or of pathways with extensive cross-talk. On the other hand, clusterin, IGFBP-2, MGP, and S100A2 did not correlate with any other expression patterns, which suggests that these genes belong to different, more restrictive molecular pathways. Although analysis of gene expression by RT-PCR does not allow knowledge of changes in protein activity, a hallmark of most molecular pathways, we believe our approach allows the identification of transcriptional targets of these pathways. It should thus be possible, through various molecular and biochemical techniques, to identify components of the pathways suggested by this study. To our knowledge, our study is the first to suggest several pathways relevant to ovarian cancer based purely on coordinately expressed genes.
To explore the possibility of the existence of multiple pathways leading to the expression patterns observed, we conducted further statistical analyses. We hypothesized that genes in the same pathway would correlate more highly among themselves than with other genes. If this were so, we further hypothesized that if we found more than one cluster of associated genes, then this would imply the presence of multiple pathways with some genes in the same pathway and other genes in different pathways. Additionally, genes contributing to more than one cluster of associations might reflect connections between pathways. We used maximum-likelihood factor analysis to examine the matrix of correlations for clusters of associated genes. This technique provides a goodness-of-fit test to determine the number of factors (pathways) that are regarded as hypothetical constructs with which the measures (gene expression) are related. We found that the gene expression patterns could be explained by the existence of four independent pathways (Table 3). Pathway 1 was associated with high expression levels of EpCAM/GA733-2, Kop, TIMP-3, FR1, SLPI, STAT1, and ApoE. Pathway 2 was associated with high expression of ApoJ/clusterin, EpCAM/GA733-2, Kop, TIMP-3, FR1, SLPI, STAT1, ApoE, and ceruloplasmin. There is significant overlap between pathways 1 and 2. Additional experiments will be necessary to determine whether these hypothetical constructs represent two different molecular pathways with extensive cross-talk or two aspects of the same pathway. Pathway 3 was associated with expression of GPX3, ApoJ/clusterin, SLPI, and ceruloplasmin. Interestingly, pathway 4 appeared to be associated with high levels of expression of S100A2 only, which suggests a more restricted expression pattern corresponding to the activation of this pathway. SLPI expression was associated with three different pathways, which suggests that it may represent a useful marker for ovarian cancer. This is consistent with the data showing high levels and consistent overexpression of SLPI in all of the ovarian tumor subtypes (Fig. 1; Table 2). Overall, our data suggest that the activation of a restricted number of pathways may underlie much of the aberrant gene expression profiles in ovarian cancer.
Table 3. Predicted molecular pathways based on expression patterns
Next, we wondered whether the different patterns of gene expression could be associated with a specific gene defect in ovarian cancer. Because K-ras mutations have been reported at varying frequencies in ovarian cancer, we sequenced K-ras in our panel and attempted to correlate the mutations with gene expression patterns. A total of six K-ras mutations were identified: 3 of 22 serous specimens and 3 of 4 mucinous specimens were mutated at codon 12 (Table 1). No K-ras mutations were identified in the clear cell or endometrioid ovarian carcinoma specimens. The mutant K-ras group of tumors was compared with the wild-type K-ras group, but no statistical differences were identified (data not shown). A larger panel of tumors will be necessary to determine whether K-ras mutations correspond to a specific pattern of gene expression in ovarian cancer. Finally, no specific pattern of gene expression could be significantly associated with age at diagnosis or with tumor grade (data not shown).
In this report, we have studied the expression of many genes up-regulated in EOC using real-time RT-PCR, a highly sensitive and reproducible technique. In addition, we have used microdissected specimens to maximize the proportion of tumor cells in the samples under study. The combination of real-time RT-PCR and microdissected specimens allowed a highly quantitative study of many genes in a relatively large number of ovarian tumors. We show that the genes that were suggested by SAGE to be relevant to EOC tumorigenesis are indeed highly elevated in the vast majority of ovarian carcinomas of various subtypes. Whereas most candidate biomarkers appear to be general markers of ovarian cancer, GPX3 appears to be specific for clear cell carcinoma. This represents the first systematic and highly quantitative study of gene expression in ovarian epithelial tumors of various subtypes. It is remarkable that several of the genes studied here are coordinately regulated in ovarian cancer. This finding suggests that a few pathways that are frequently activated in ovarian cancer are responsible for much of the aberrant gene expression observed consistently in EOC. Such an association is somewhat unexpected but suggests a certain amount of order in a disease that is known for its high level of heterogeneity. It will be interesting to identify the pathways up-regulated in ovarian cancer because they may provide new targets for effective therapeutic interventions.
Footnotes
- The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- 1 To whom requests for reprints should be addressed, at Laboratory of Cellular and Molecular Biology, Gerontology Research Center, NIA, NIH, 5600 Nathan Shock Drive, Baltimore, MD 21224. E-mail: morinp{at}grc.nia.nih.gov; Phone: (410) 558-8506; Fax: (410) 558-8386.
- 2 The abbreviations used are: EOC, epithelial ovarian cancer; SAGE, serial analysis of gene expression; HOSE, human ovarian surface epithelium; ApoJ, apolipoprotein J; GPX, glutathione peroxidase; SLPI, secretory leukocyte protease inhibitor; ApoE, apolipoprotein E; STAT, signal transducer and activator of transcription; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; RT-PCR, reverse transcription PCR; TIMP, tissue inhibitor of metalloproteinase; IGFBP, insulin-like growth factor binding protein; FR, folate receptor; MGP, matrix gla protein.
- Received January 4, 2001.
- Accepted March 26, 2001.
- ©2001 American Association for Cancer Research.