Early detection remains the most promising approach to improve long-term survival of patients with ovarian cancer. In a five-center case-control study, serum proteomic expressions were analyzed on 153 patients with invasive epithelial ovarian cancer, 42 with other ovarian cancers, 166 with benign pelvic masses, and 142 healthy women. Data from patients with early stage ovarian cancer and healthy women at two centers were analyzed independently and the results cross-validated to discover potential biomarkers. The results were validated using the samples from two of the remaining centers. After protein identification, biomarkers for which an immunoassay was available were tested on samples from the fifth center, which included 41 healthy women, 41 patients with ovarian cancer, and 20 each with breast, colon, and prostate cancers. Three biomarkers were identified as follows: (a) apolipoprotein A1 (down-regulated in cancer); (b) a truncated form of transthyretin (down-regulated); and (c) a cleavage fragment of inter-α-trypsin inhibitor heavy chain H4 (up-regulated). In independent validation to detect early stage invasive epithelial ovarian cancer from healthy controls, the sensitivity of a multivariate model combining the three biomarkers and CA125 [74% (95% CI, 52–90%)] was higher than that of CA125 alone [65% (95% CI, 43–84%)] at a matched specificity of 97% (95% CI, 89–100%). When compared at a fixed sensitivity of 83% (95% CI, 61–95%), the specificity of the model [94% (95% CI, 85–98%)] was significantly better than that of CA125 alone [52% (95% CI, 39–65%)]. These biomarkers demonstrated the potential to improve the detection of early stage ovarian cancer.
Despite progress in cancer therapy, ovarian cancer mortality has remained virtually unchanged over the past two decades (1). Annually in the United States alone, ∼23,000 women are diagnosed with the disease and almost 14,000 women die from it (1). Given the steep survival gradient relative to the stage at which the disease is diagnosed, it is reasonable to suggest that early detection remains the most promising approach to improve the long-term survival of ovarian cancer patients.
The relatively low prevalence (40 out of 100,000) of ovarian cancer among postmenopausal women in the general population, the lack of a clearly defined precursor lesion, and the high cost and possible complications associated with surgical confirmatory procedures have placed stringent requirements on any test intended for general population screening. Currently, none of the existing serum markers, such as CA125, CA 72–4, or macrophage colony-stimulating factor, can be used individually for screening (2). Longitudinal studies are under way in Europe, Japan, and the United States to evaluate screening strategies using CA125 and/or transvaginal sonography (3, 4, 5) and their impact on overall cancer mortality (6). Preliminary results have shown encouraging evidence of a survival benefit among patients diagnosed through a screening regimen (3).
Reports from retrospective studies have shown that multivariate predictive models combining existing tumor markers improve cancer detection (7, 8). Recent advances in genomic and proteomic profiling technology have made it possible to apply computational methods to detect changes in protein expression and their association with disease conditions, thereby hastening the identification of novel markers that may contribute to multimarker combinations with better diagnostic performance (9, 10, 11, 12, 13).
In this study, we hypothesized that comparison of protein expressions of serum specimens from patients with early stage ovarian cancer with those from healthy women could lead to the discovery of candidate biomarkers for the detection of early stage ovarian cancer. To ensure that the discovered biomarkers are truly associated with ovarian cancer rather than the result of biases in samples, profiling data of specimens from multiple institutions were used for cross-comparison and independent validation. We additionally determined the protein identities of the discovered biomarkers to allow for additional validation with independent methods and as a first step toward understanding the pathways in which they may function.
MATERIALS AND METHODS
The study involved a retrospective sample of 645 serum specimens. All were collected with institutional approval. Proteomic profiles were obtained from 503 specimens collected at four medical centers (M. D. Anderson Cancer Center, Duke University Medical Center, Groningen University Hospital, the Netherlands, and the Royal Hospital for Women, Australia). Among them, the cancer group consisted of 65 patients with stages I/II invasive epithelial ovarian cancer, 88 patients with stages III/IV invasive epithelial ovarian cancer, 28 patients with borderline tumors, and 14 patients with recurrent disease, all optimally staged by pathologists based on International Federation of Gynecology and Obstetrics (FIGO) criteria. Among the stages I/II invasive cases, 20 were serous, 17 were mucinous, 15 were endometrioid, 8 were clear cell, 1 was carcinosarcoma, and 4 were mixed epithelial carcinoma. The samples also included 166 patients with benign pelvic masses and 142 healthy donors as controls. All of the samples were collected before the day of surgery or treatment, stored at −70°C, and thawed immediately before assay. CA125 levels had been obtained previously using a CA125II radioimmunoassay (Centocor). The clinical characteristics and age distribution of the proteomic profiling study population are summarized in Table 1.
In addition to the 503 specimens for proteomic profiling, 142 independent, archived serum specimens collected for routine clinical laboratory testing at the Johns Hopkins Medical Institutions were tested for levels of the identified biomarkers for which an immunoassay test was available. The sample included 41 healthy women, 41 patients with late-stage ovarian cancer, and groups of 20 patients each with breast, colon, and prostate cancers. All of the samples were processed promptly after collection and stored at 2–8°C for a maximum of 48 h before freezing at −70°C. The CA125II assay was performed using a two-site immunoenzymometric assay on the Tosoh AIA-600 II analyzer (Tosoh Medics).
Proteomic Expression Profiling.
The ProteinChip Biomarker System (Ciphergen Biosystems), a platform for surface-enhanced laser desorption/ionization time-of-flight mass spectrometry, was used for protein expression profiling (14, 15). Samples from all of the centers were processed identically. To increase the total number of detectable peaks, serum samples (20 μl) were first fractionated via anion exchange chromatography by stepwise pH gradient elution (taken at pH 9/flow through, 7, 5, 4, 3, and organic solvent) using a Biomek 2000 equipped with the ProteinChip Biomarker Integration Package (Ciphergen Biosystems). A pooled human serum control sample (Intergen) was used for data calibration between experiments. Aliquots of each fraction were bound in triplicate with a randomized chip/spot allocation scheme to IMAC3-Cu, SAX2, H50, and WCX2 ProteinChip arrays. The energy absorbing molecule (crystallization matrix) for surface-enhanced laser desorption/ionization time-of-flight mass spectrometry, saturated sinapinic acid dissolved in 50% acetonitrile/0.5% trifluoroacetic acid, was applied promptly. The spotted arrays were read on PBS II ProteinChip readers. Instruments were monitored weekly for performance using insulin and immunoglobulin standards.
Spectra were externally calibrated, baseline subtracted, and normalized to total ion current within m/z (mass/charge) range of 1.5–150 kDa. Qualified mass peaks (signal/noise >5; cluster mass window at 0.3%) within the m/z range of 2–50 kDa were selected automatically. Logarithmic transformation was applied to the peak intensity before analysis for biomarker discovery. After biomarker discovery, the quality and intensity readings of the selected peaks were manually reconfirmed from raw spectra.
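The preprocessing steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ProteinChip software's implementation: the peak records, the field names (`mz`, `intensity`, `snr`), and the `floor` guard against log(0) are hypothetical, and clustering of peaks across spectra within the 0.3% mass window is not shown.

```python
import math

def normalize_tic(intensities):
    """Scale a spectrum so its intensities sum to 1 (total ion current normalization)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no positive signal")
    return [x / total for x in intensities]

def qualified_peaks(peaks, min_snr=5.0, mz_range=(2000.0, 50000.0)):
    """Keep peaks with signal/noise above min_snr inside the 2-50 kDa m/z window."""
    lo, hi = mz_range
    return [p for p in peaks if p["snr"] > min_snr and lo <= p["mz"] <= hi]

def log_intensity(peaks, floor=1e-6):
    """Log-transform peak intensities; `floor` is an assumed guard against log(0)."""
    return {p["mz"]: math.log(max(p["intensity"], floor)) for p in peaks}

# Hypothetical peak list for a single spectrum (values are made up).
peaks = [
    {"mz": 3272.0, "intensity": 4.2, "snr": 12.0},
    {"mz": 12828.0, "intensity": 1.5, "snr": 7.5},
    {"mz": 1800.0, "intensity": 9.0, "snr": 30.0},   # outside the 2-50 kDa window
    {"mz": 28043.0, "intensity": 0.8, "snr": 3.0},   # fails the signal/noise filter
]
features = log_intensity(qualified_peaks(peaks))
```

Here `features` holds the log intensities of the two peaks that pass both filters; in practice the same transformation would be applied to every spectrum before any statistical comparison.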
Multicenter Study Design.
The diagram in Fig. 1 describes the design of this multicenter study and the usage of samples for biomarker discovery, predictive model construction, and independent validation. To minimize the possibility of false discovery because of site-specific systematic biases from preanalytical variables, a key feature of the design is that, for biomarker discovery, the stage I/II ovarian cancer samples and healthy controls from Duke University Medical Center and Groningen University Hospital were analyzed separately by site. Only those potential biomarkers that were deemed statistically informative and showed the same direction of dysregulation (up or down) in analyses of data from both sites were additionally validated using the remaining samples at these two sites and all of the samples at the Royal Hospital for Women and M. D. Anderson Cancer Center. Samples from Johns Hopkins Medical Institutions were used only for additional independent confirmation by immunoassays. Because of insufficient specimen volume, the immunoassays were not tested on the samples from the first four institutions.
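The two-site consistency requirement can be illustrated as a simple filter. The sketch below is an assumption-laden stand-in: the study judged informativeness with its own analysis, whereas here a per-site P value threshold (`alpha`) is used for simplicity, and the per-site results are hypothetical.

```python
def consistent_peaks(site_a, site_b, alpha=0.05):
    """Return m/z values that are significant at both sites with the same direction.

    Each argument maps m/z -> (p_value, direction), where direction is
    +1 (up in cancer) or -1 (down in cancer).
    """
    shared = []
    for mz in sorted(site_a.keys() & site_b.keys()):
        p_a, dir_a = site_a[mz]
        p_b, dir_b = site_b[mz]
        if p_a < alpha and p_b < alpha and dir_a == dir_b:
            shared.append(mz)
    return shared

# Hypothetical per-site results (made-up P values).
site_1 = {3272: (0.001, +1), 12828: (0.0004, -1), 5000: (0.01, +1), 7000: (0.30, -1)}
site_2 = {3272: (0.003, +1), 12828: (0.0020, -1), 5000: (0.02, -1), 7000: (0.01, -1)}
candidates = consistent_peaks(site_1, site_2)
```

In this toy input, the peak at m/z 5000 is dropped because its direction flips between sites, and the peak at m/z 7000 is dropped because it is informative at only one site. Both are the kinds of site-specific artifacts the design is meant to exclude.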
The unified maximum separability analysis algorithm implemented in the software ProPeak (3Z Informatics) was used to analyze the peak intensity data and to select a subset of informative peaks. The unified maximum separability analysis incorporates information from the traditional multivariate statistical classification methods into the support vector machine algorithm (16) to provide a robust approach to analyzing high-dimensional expression data (12 , 17, 18, 19) . Bootstrap was also applied to help to identify peaks with a consistently high discriminatory power over multiple resampled subpopulations.
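The role of the bootstrap in peak selection can be sketched as follows. This is not the unified maximum separability analysis algorithm: as a plainly swapped-in substitute, each peak is scored by how far its area under the receiver operating characteristic curve lies from 0.5, and the sketch records how often each peak ranks among the top `k` across resamples. All data, names, and parameters are hypothetical.

```python
import random

def auc(cases, controls):
    """Probability that a random case value exceeds a random control value (ties count 1/2)."""
    wins = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

def bootstrap_top_k(cancer, healthy, k=1, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each peak ranks among the top k.

    `cancer` and `healthy` map peak label -> list of log intensities, with
    samples in the same order for every peak.
    """
    rng = random.Random(seed)
    labels = list(cancer)
    n_c = len(next(iter(cancer.values())))
    n_h = len(next(iter(healthy.values())))
    counts = dict.fromkeys(labels, 0)
    for _ in range(n_boot):
        # Resample patients (not peaks) with replacement within each group.
        idx_c = [rng.randrange(n_c) for _ in range(n_c)]
        idx_h = [rng.randrange(n_h) for _ in range(n_h)]
        score = {}
        for lab in labels:
            a = auc([cancer[lab][i] for i in idx_c], [healthy[lab][i] for i in idx_h])
            score[lab] = abs(a - 0.5)
        for lab in sorted(labels, key=score.get, reverse=True)[:k]:
            counts[lab] += 1
    return {lab: counts[lab] / n_boot for lab in labels}

# Hypothetical data: peak "A" is lower in every cancer sample, peak "B" is noise.
cancer = {"A": [1.0, 1.2, 0.9, 1.1, 1.3], "B": [2.0, 2.5, 1.9, 2.2, 2.4]}
healthy = {"A": [2.0, 2.1, 1.9, 2.3, 2.2], "B": [2.1, 2.4, 2.0, 2.3, 1.8]}
freq = bootstrap_top_k(cancer, healthy, k=1)
```

A peak whose selection frequency stays high across resamples is less likely to owe its ranking to a handful of unusual samples.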
The discovered biomarkers were purified according to individual biochemical properties using a series of protein separation procedures including anion exchange, size exclusion, and reverse-phase chromatography followed by SDS-PAGE separation. To monitor the purification process, healthy control samples were processed in parallel with the ovarian cancer samples. During each of the iterations, the new fractions were profiled on the same type of ProteinChip arrays used for discovery to monitor the presence or absence of the biomarkers of interest. The purified biomarkers were excised and digested with trypsin before being spotted onto NP20 ProteinChip arrays and read in a PBSII ProteinChip reader. The masses of the proteolytic fragments were used for database searching with the ProFound algorithm. For confirmation, the NP20 arrays containing the proteolytic fragments were analyzed by collision-induced dissociation using a Q-STAR MS/MS instrument (Applied Biosystems/MDS Sciex) equipped with a PCI 1000 ProteinChip Interface (Ciphergen Biosystems).
Multivariate Predictive Models.
The profiling data from the two sites used for biomarker discovery were combined and then divided randomly into a training set and a test set for the derivation and testing of nonlinear multivariate predictive models using nonlinear unified maximum separability analysis. The derived models were then additionally evaluated, in the same way as the individual biomarkers, on the independent validation data that were not involved in biomarker discovery or model construction.
The performance of the identified biomarkers in detecting early stage ovarian cancer from healthy controls was evaluated by descriptive statistics, Mann-Whitney U test or Kruskal-Wallis test, and receiver operating characteristic curves (ROCKIT; University of Chicago, Chicago, IL; Ref. 20 ). Performance of the nonlinear multivariate predictive models was compared with performance of the CA125 assay alone by receiver operating characteristic analysis and estimated sensitivity and specificity. All of the statistical analyses were performed using Statistica 6.1 (Statsoft).
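For readers less familiar with the link between the Mann-Whitney U test and receiver operating characteristic analysis, the sketch below computes an empirical curve and shows that its trapezoidal area equals U divided by the product of the group sizes. It is only an illustration: ROCKIT fits a smooth curve rather than the empirical one used here, and the marker values are hypothetical.

```python
def roc_points(cases, controls):
    """Empirical ROC points (1 - specificity, sensitivity), calling 'positive' when value >= threshold."""
    thresholds = sorted(set(cases) | set(controls), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(v >= t for v in cases) / len(cases)
        fpr = sum(v >= t for v in controls) / len(controls)
        points.append((fpr, tpr))
    return points

def trapezoid_auc(points):
    """Area under a piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

def mann_whitney_auc(cases, controls):
    """U statistic divided by n_cases * n_controls (ties count 1/2)."""
    u = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return u / (len(cases) * len(controls))

# Hypothetical marker values in which higher values suggest cancer.
cases = [12, 40, 95, 300, 35]
controls = [8, 10, 15, 22, 35, 9]
area_from_curve = trapezoid_auc(roc_points(cases, controls))
area_from_u = mann_whitney_auc(cases, controls)
```

For a down-regulated marker, the same functions apply after negating the values so that larger numbers again indicate disease.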
RESULTS
Biomarker Discovery and Identification.
Three potential biomarkers were discovered. Two of them were peaks from fraction pH 4 at m/z 12,828 and 28,043, both down-regulated in the cancer group, and the third was from fraction pH 9/flow through at m/z 3,272, up-regulated in the cancer group. All bound to the IMAC3-Cu (immobilized metal affinity chromatography array charged with copper ions) ProteinChip array (representative spectra in Fig. 2).
The m/z 28,043 peak was purified and identified as apolipoprotein A1 (z score = 2.38 using the ProFound algorithm; 59% coverage). Three peptides were sequenced by tandem mass spectrometry and confirmed this finding. The m/z 12,828 peak was purified and identified as a form of transthyretin (prealbumin; z score = 2.19; 91% coverage). One of the peptides was sequenced by tandem mass spectrometry and confirmed this finding. The m/z 12,828 peak copurified with a much more abundant m/z 13,900 peak, which was also identified as transthyretin. Immunoprecipitation and tandem mass spectrometry additionally determined that a truncated form of transthyretin lacking the NH2-terminal 10 amino acids corresponded to the m/z 12,828 peak. The m/z 3,272 peak was purified from the pooled serum of ovarian cancer patients and its sequence was determined to be MNFRPGVLSSRQLGLPGPPDVPDHAAYHPF, a fragment spanning amino acids 660–689 of human inter-α trypsin inhibitor, heavy chain H4 (PK-120). This result was confirmed by the analysis of pepsin digestion products of the marker.
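As a plausibility check on the last assignment, the average mass of the reported 30-residue sequence can be computed from standard average amino acid residue masses. The sketch assumes an unmodified peptide observed as a singly protonated ion; neither assumption is stated in the text, and the function name is illustrative.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153   # average mass of H2O added for the free termini
PROTON = 1.00728  # mass of the charge-carrying proton

def average_mh(sequence):
    """Average m/z of the singly protonated, unmodified peptide (assumed charge +1)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON

fragment = "MNFRPGVLSSRQLGLPGPPDVPDHAAYHPF"
predicted = average_mh(fragment)
observed = 3272.0
relative_error = abs(predicted - observed) / observed
```

Under these assumptions the predicted value falls within the 0.3% mass window used for peak clustering, which is consistent with the sequence assignment.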
Discriminatory Power of Individual Biomarkers.
Table 2 provides the descriptive statistics (mean ± SD, median) and results from statistical tests of these three biomarkers between healthy controls and ovarian cancer patients of different stage groups in the combined discovery sets and in the independent validation set. Within the discovery sets, the expression levels of the three biomarkers were statistically significantly different between the healthy controls and the early stage ovarian cancer patients (P < 0.000001 for all three markers). In the independent validation set, the two biomarkers at m/z 12,828 and m/z 28,043 retained their statistically discriminatory power in detecting stage I/II ovarian cancer (P < 0.000001 and P < 0.000008, respectively). However, the biomarker at m/z 3,272, with its large within-group variances, had only a marginal effect in separating healthy controls from epithelial ovarian cancer. In Table 3, the distributions of the biomarkers are compared among subgroups of the cancer samples based on stages, histological subtypes, and age.
Fig. 3, A–D, compares the discriminatory power of the individual biomarkers with that of CA125, using receiver operating characteristic analysis on data from patients with early stage ovarian cancer and healthy controls. CA125 and m/z 12,828 performed comparably on both the discovery and independent validation sets, whereas the other two markers had a lower area-under-curve than CA125 in one or both data sets. However, the estimated correlations among the three biomarkers and CA125 were low (data not shown), suggesting the possibility that they were complementary to each other and that a multivariate approach might outperform the single assay of CA125.
Because 27% of the samples in the healthy controls were from women age 50 or older compared with 61% of those in the early stage ovarian cancer group, we were concerned that these markers might reflect age-related changes. However, the differences between the early stage cancer samples and the healthy controls in the discovery data sets remained statistically significant in a multiple regression after adjusting for difference in age (P = 0.000001 for m/z 12,828 and m/z 3,272; and P = 0.0174 for m/z 28,043). In a stratified analysis, there were no appreciable differences in biomarker values between cancer patients of age ≥50 years and those below 50 years (Table 3). Previous population-based studies have shown that levels of apolipoprotein A1 actually increase slightly with age (21, 22).
Multivariate Predictive Models.
The two data sets used for biomarker discovery were merged and then randomly divided into a training set and a test set. The training set had 28 ovarian cancer cases and 33 healthy controls, whereas the test set consisted of 29 cancer cases and 46 controls. The training and test data sets retained only the peak intensity values of the three discovered biomarkers and CA125 test results.
Two multivariate predictive models were constructed. The first used only the three biomarkers as its input, and the second used the three biomarkers along with the CA125 level. Panels E–H in Fig. 3 compare the overall diagnostic performance of the two models with that of CA125 using receiver operating characteristic analysis. Panels G and H superimpose the empirical and fitted receiver operating characteristic curves estimated in the independent validation data set using the healthy control samples against all of the stage I/II epithelial ovarian cancer samples (Fig. 3G) or against only the invasive stage I/II epithelial ovarian cancer samples (Fig. 3H).
Using cutoffs that maximized the sum of sensitivity and specificity on the training data, these models were applied to the test data and the independent validation data (Table 4). For discrimination between healthy controls and stages I/II invasive ovarian cancer in the independent validation set, the multivariate model using the three biomarkers and CA125, at a sensitivity of 83% (95% CI, 61–95%), had a specificity of 94% (95% CI, 85–98%). This is statistically significantly better than CA125 at the same sensitivity of 83% (with a cutoff of 11 units/ml), which yielded a specificity of only 52% (95% CI, 39–65%). On the other hand, CA125 at the cutoff of 35 units/ml had a specificity of 97% (95% CI, 89–100%) and a corresponding sensitivity of 65% (95% CI, 43–84%). At the same fixed specificity, the multivariate model using the three biomarkers and CA125 resulted in a sensitivity of 74% (95% CI, 52–90%). The difference, however, is not statistically significant, partially because of the few stage I/II invasive cases in the independent validation set. Table 4 also lists in detail the estimated sensitivities or specificities for individual diagnostic groups in the training set, test set, and independent validation set.
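The cutoff rule and the width of the reported confidence intervals can be illustrated with a short sketch. Two assumptions are made loudly here: the text does not state which interval method was used, so the exact (Clopper-Pearson) binomial interval is shown as one common choice; and the counts in the example (19 of 23) are hypothetical values chosen only because they give a proportion near 83%. The model scores are also made up.

```python
from math import comb

def best_cutoff(cases, controls):
    """Cutoff maximizing sensitivity + specificity, calling 'positive' when score >= cutoff."""
    best = None
    for t in sorted(set(cases) | set(controls)):
        sens = sum(v >= t for v in cases) / len(cases)
        spec = sum(v < t for v in controls) / len(controls)
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    def solve(f, target):
        lo, hi = 0.0, 1.0
        for _ in range(60):          # bisection; f is decreasing in p
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# Hypothetical training scores from a model in which higher means more cancer-like.
cutoff, sens, spec = best_cutoff([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.5])

# Hypothetical counts: 19 detected out of 23 early stage cases.
ci_low, ci_high = clopper_pearson(19, 23)
```

With so few cases the interval spans more than 30 percentage points, which helps explain why a nine-point difference in sensitivity at matched specificity did not reach statistical significance.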
It should be noted that the biomarkers, with the exception of m/z 3,272, as well as the two predictive models were moderately capable of differentiating stages I/II invasive cancer from benign cases in the independent validation set (P = 0.002 and 0.09 for m/z 12,828 and 28,043, respectively; and P = 0.02 and 0.03 for models without CA125 and with CA125, respectively).
Evaluation Using Immunoassays.
The 142 archived specimens from the Johns Hopkins Medical Institutions were analyzed for apolipoprotein A1 using a turbidimetric immunoassay performed in a microtiter plate format (Wako Chemical USA) and for transthyretin using a particle enhanced turbidimetric immunoassay performed on the Dimension RxL Instrument (Dade-Behring; Table 5). The serum levels of CA125 were up-regulated among the 41 patients with late-stage ovarian cancer compared with the 41 healthy controls (mean ± SD in units/ml, 2388 ± 4723 versus 18 ± 23, P < 0.000001), whereas levels of apolipoprotein A1 and transthyretin were down-regulated (mean ± SD in mg/dl, 122 ± 42 versus 153 ± 28, P = 0.0004, and 20 ± 7 versus 27 ± 6, P = 0.00005, respectively). The mean serum apolipoprotein A1 level among the healthy controls was not significantly different from that of patients with breast or colorectal cancer (P = 0.77 and P = 0.69, respectively) and only marginally different from that of patients with prostate cancer (P = 0.02). The mean serum transthyretin level was down-regulated among patients with colorectal cancer (P = 0.01), albeit to a lesser degree than in patients with ovarian cancer. The differences in mean serum transthyretin levels between the healthy controls and patients with breast or prostate cancer were not significant (P = 0.51 and P = 0.22, respectively).
DISCUSSION
Differential analysis of serum protein profiles from patients with early stage ovarian cancer and healthy women revealed three biomarkers whose discriminatory power was confirmed with samples from multiple institutions through cross-validation and independent validation. Combined with CA125 in a multivariate predictive model, these biomarkers improve significantly on the specificity of CA125 alone while maintaining a relatively high sensitivity. Two of the markers were also evaluated using immunoassays. Although the immunoassay for transthyretin was not specific for the particular truncated form that corresponds to the m/z 12,828 peak, the results corroborated the findings from the surface-enhanced laser desorption/ionization mass spectrum data and provided preliminary analysis of tumor site specificity of these two markers.
Results from receiver operating characteristic curve analysis (Fig. 3, G and H) show that at a fixed high sensitivity, the predictive models, with or without CA125 as one of their inputs, had a much-improved specificity over that of CA125 alone. However, when the models were compared with CA125 at a fixed high specificity, the improvement in sensitivity was more moderate and tended to diminish as specificity approached 98% and above. The level of analytical variability in peak intensity measurement relative to the few early stage ovarian cancer cases in the independent validation set made it difficult to evaluate in a statistically meaningful way the diagnostic performance of the individual biomarkers or the multivariate models at extreme specificity or sensitivity values.
The differences in area-under-curve in detecting stage I/II invasive cancer from healthy controls were not statistically significant for the independent validation set. Given the fairly large absolute differences in area-under-curve, a likely reason would be the relatively few stage I/II invasive cases available.
Serum levels of CA125 have been used widely for distinguishing benign from malignant pelvic masses and for monitoring the clinical course of patients with ovarian cancer. However, CA125 is elevated in only about half of stage I/II ovarian cancer patients (23) . Lowering the cutoff of CA125 would increase its sensitivity in detecting stage I/II cancer but result in many false positives in patients with benign conditions as well as in healthy women. The improvement in sensitivity by the multivariate models to detect early stage cancer also increased the false-positive rate among patients with benign pelvic masses, but unlike with CA125, it did not lead to appreciable concomitant loss of specificity among healthy women. At a sensitivity of 83% and specificity of 94%, the current multivariate model by itself is not suitable for general population screening. However, in a two-stage approach with a combination of a serum test followed by sonography, a high specificity might be reached at an acceptable total cost.
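The screening limitation noted above can be made concrete with a short positive predictive value calculation using the figures quoted in the text: a sensitivity of 83%, a specificity of 94%, and a prevalence of 40 per 100,000. This is a back-of-the-envelope sketch; it assumes that performance estimated in a case-control comparison would carry over unchanged to a screening population, which is unlikely to hold exactly.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

ppv = positive_predictive_value(0.83, 0.94, 40 / 100_000)
false_alarms_per_true_case = (1 - ppv) / ppv
```

Under these assumptions the predictive value is well under 1%, so nearly all positive results would be false alarms. This is why a second-stage test such as sonography would be needed to reach an acceptable overall specificity.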
The identified biomarkers have been characterized generally as acute-phase reactants. The down-regulation of transthyretin and apolipoprotein A1 in ovarian cancer patients suggests the possibility that they are byproducts of the host response to the tumor. Although recent data have supported the concept that inflammation is a critical component of tumor progression (24), such acute-phase reactants may still represent only epiphenomena because of the presence of tumor and may not be specific to a particular type of cancer (25, 26). In this study, with preliminary data from immunoassay analysis, we were able to verify that the levels of apolipoprotein A1 were not altered in the serum of breast or colon cancer patients and the levels of transthyretin were not altered in the serum of breast or prostate cancer patients.
Transthyretin and lipoprotein(a) have been separately reported to be decreased in epithelial ovarian cancer (27, 28). Transthyretin is the major carrier for serum thyroxine and tri-iodothyronine and facilitates the transport of retinol via its interaction with retinol binding protein. Transgenic mice lacking transthyretin expression have dramatically lower levels of retinol and retinol binding protein, and decreased levels of retinol binding protein as well as cellular retinol binding protein have been shown to be associated with an increased rate of malignant transformation of ovarian epithelium (29, 30). In addition, levels of cellular retinol binding proteins have been reported to be changed in ovarian cancer by oligonucleotide array analysis (31) and have been shown to be decreased in approximately one-third of ovarian cancers by immunohistochemistry (32).
The carboxyl portion of inter-α trypsin inhibitor, heavy chain H4, from which the m/z 3,272 biomarker is derived, has been shown to be a substrate for plasma kallikrein (33 , 34) . The fragment that we identified as a biomarker differs from the postulated fragment derived from cleavage by plasma kallikrein, suggesting a different protease is responsible for the generation of this biomarker. Kallikrein proteases consist of plasma kallikrein and tissue kallikreins, which have overlapping substrate specificity (35) . The tissue kallikreins are products of a large multigene family that includes prostate specific antigen (hK3), a tumor marker for prostate cancer. Gene expression of seven of the tissue kallikreins has been found to be up-regulated in ovarian cancer patients, and expression of hK6, hK10, and hK11 has been found to be increased in >50% of patients with ovarian cancer (36 , 37) .
The identified proteins associated with the three biomarkers are all of high abundance in serum and are unlikely to be released by tumor cells. This limitation has been pointed out previously as being common among studies using mass spectrometry based proteomic profiling (25 , 26 , 38) . However, this is not purely a limitation of the mass spectrometry technological platforms, but rather reflects the limitation of high throughput sample preprocessing technologies that would allow us to examine subsets of the serum proteome without introducing additional significant analytical variability. The use of sample prefractionation in addition to multiple array surfaces in this study is a first attempt in surface-enhanced laser desorption/ionization mass spectrometry based clinical proteomics for biomarker discovery.
Two of the markers found in this study are cleavage products of precursor proteins and present in the serum at a fraction of the concentrations of their corresponding full-length proteins. Neither of these truncation products has been reported previously, indicating that they are new candidate markers. These markers may be the product of cleavage by one or more proteases, including plasma kallikrein, tissue kallikreins, matrix metalloproteases, or prostasin, a trypsin-like serine protease that was reported recently to be increased in cases of ovarian cancer (39) . The proteases that generate these markers may themselves act as markers that can be combined with those discovered here in a predictive model. More generally, these results support the increasing body of evidence indicating alterations in the balance of protease and protease inhibitor activity in the serum and tissue of patients with cancer (40 , 41) . Preliminary results from additional immunoprecipitation pull-down analysis suggest that the forms and relative frequencies of posttranslational modifications of inter-α trypsin inhibitor, heavy chain H4 and transthyretin may be associated with the ovarian cancer disease status (data not shown).
The use of proteomics to identify potential biomarkers has been explored previously for the detection of a number of cancers including prostate (13, 42, 43, 44), colon (45), bladder (46), breast (12), and ovarian (9, 17). For example, Petricoin et al. reported the use of a self-organizing map coupled with a genetic algorithm to search through raw mass spectrum data for informative variables and to form clusters of training samples as the basis of a predictive model. More recently, Kozak et al. (47) reported several panels of selected surface-enhanced laser desorption/ionization peaks for the detection of ovarian cancer. A number of these peaks and their expression patterns are consistent with those from our current or previous results (17).
Biomarker discovery using clinical proteomics involves the simultaneous analysis of expression levels of many proteins measured on relatively few clinical samples. The statistical analysis is further complicated by the possibility of nondisease-related biases and variability in data from preanalytical and analytical variables and by within-group heterogeneity associated with clinical samples. A major concern has always been whether the discovered biomarkers and the derived multivariate models are truly associated with the disease process. Recent reports examined and highlighted the danger of such issues (26, 38, 48). The current study had a number of features that were designed to alleviate the impact of these factors. First, only the intensities of detectable peaks were used in analysis, which in general are less sensitive to mass shift in raw spectrum data, and the discovered biomarkers are more likely to have an identifiable biological identity. Second, biomarker discovery and construction of predictive models were done in two separate stages. During the discovery stage, only linear models were used to evaluate the informativeness of individual peaks. Bootstrap resampling was used to additionally ensure that the peak selection results were robust among multiple subpopulations. It was only after a small group of biomarkers had been selected and individually validated across the multiple data sets that the more complicated nonlinear unified maximum separability analysis predictive models were derived to combine the multiple markers. It would have been difficult to explain and validate the exact roles of individual inputs if we had started the peak selection directly with a complex nonlinear model using raw spectrum data.
Third, in this study, the data sets from multiple sites were used separately to cross-validate discoveries from other data sets and to detect systematic biases in data, such as those arising from differences in preanalytical sample processing procedures. This differs significantly from the approaches in which data from multiple sites are pooled together and then divided through randomization into (artificially made) identically distributed training and test data sets. Considering the high cost associated with postdiscovery validation, however, we believe that the conservative approach that we used is closer to the real clinical environment and appropriate for biomarker discovery through expression profiling. Finally, the identification of the discovered biomarkers allowed additional confirmation through immunoassay tests.
Previous studies have illustrated the benefits of combining multiple markers and the longitudinal use of tumor markers in the detection of ovarian cancers (8 , 49 , 50) . New biomarkers derived from proteomic analysis of clinical samples, once validated, may provide additional choices in the selection of an optimal panel of markers that, through a multivariate approach, would be capable of detecting early stage ovarian cancer in a more general population. Studies remain to be performed to evaluate these markers individually and in combination in larger populations and, most importantly, in preclinical samples from women with ovarian cancer obtained in screening trials. It should be pointed out, however, that intensity data from mass spectra measure the relative abundance of proteins. In this study, all of the surface-enhanced laser desorption/ionization data were generated under well-controlled conditions, with which we were able to demonstrate the performance of the combined three markers and their complementary value to CA125, using samples across multiple sites. The successful application of the multivariate predictive models to new clinical samples, however, will require additional work to assure the consistency of measured peak intensities between multiple runs and across different instruments and sites. Knowledge of the identities of the biomarkers will certainly help to expedite future development of assays.
Grant support: National Cancer Institute Grant 1P50 CA83639, UTMDACC Specialized Programs of Research Excellence in Ovarian Cancer (R. Bast, Jr., and Z. Zhang), and funding from Ciphergen Biosystems, Inc. (Z. Zhang, J. Li, A. Rai, J. Rosenzweig, B. Cameron, Y. Wang, and D. Chan).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Requests for reprints: Zhen Zhang, Center for Biomarker Discovery, Johns Hopkins Medical Institutions, 419 N. Caroline Street, Room 200, Baltimore, MD 21231. E-mail:
- Received March 1, 2004.
- Revision received May 17, 2004.
- Accepted May 26, 2004.
- ©2004 American Association for Cancer Research.