There is considerable evidence that the presence of cancer can elicit a humoral immune response to specific proteins in the host, and the resulting autoantibodies may have potential as noninvasive biomarkers. To characterize the autoantibody repertoire present in the sera of patients with lung adenocarcinoma, we developed a high-density peptide microarray derived from biopanning a lung cancer phage display library. Using a 2,304-element microarray, we interrogated a total of 250 sera from Michigan lung cancer patients and noncancer controls to develop an “autoantibody profile” of lung adenocarcinoma. A set of 22 discriminating peptides derived from a training set of 125 serum samples from lung adenocarcinoma patients and control subjects was found to predict cancer status with 85% sensitivity and 86% specificity in an independent test set of 125 sera. Sequencing of the immunoreactive phage-peptide clones identified candidate humoral immune response targets in lung adenocarcinoma, including ubiquilin 1, a protein that regulates the degradation of several ubiquitin-dependent proteasome substrates. An independent validation set of 122 serum samples from Pittsburgh was examined using two overlapping clones of ubiquilin 1, which yielded areas under the receiver operating characteristics curve of 0.79 and 0.74, respectively. Significantly increased levels of both ubiquilin 1 mRNA and protein, as well as reduced levels of the phosphorylated form of this protein, were detected in lung tumors. Immunofluorescence using anti–ubiquilin 1 antibodies confirmed intracellular expression within tumor cells. These studies indicate that autoantibody profiles, as well as individual candidates, may be useful for the noninvasive detection of lung adenocarcinoma. [Cancer Res 2007;67(7):3461–7]
- Lung cancer
- Tumor markers and detection of metastasis
- Immune responses to cancer
Lung cancer is the leading cause of cancer-related deaths for both men and women in industrialized countries. The current 5-year overall survival rate for patients is 15.2%, which has improved only marginally in the past decade. Although patients diagnosed early with localized stage have a higher 5-year survival rate (48.8%) compared with those with late/distant stage disease (3.3%; ref. 1), nearly 35% of stage I patients will relapse after surgical resection, thus portending a poor prognosis (2). Detection of early-stage lung cancer and the identification of particularly high-risk patients would allow the opportunity to provide adjuvant therapy and possibly increase survival.
The current methods for the diagnosis of lung cancer require a biopsy and pathologic examination of the tissue, usually after discovery of the lesion on chest X-ray or computerized tomography. There is currently no blood test available for lung cancer. A number of groups, including our own, have characterized mRNA expression profiles of lung cancer (3–5). In addition to transcript levels, lung tumors have also been profiled using comparative genomic hybridization and proteomic approaches to analyze DNA and protein alterations, respectively (6–10). Whereas these approaches may be useful for molecular subtyping of resected or biopsied tumors, noninvasive methods to detect these lesions would provide substantial clinical value. Several groups are applying proteomic approaches to serum to identify biomarkers for the early detection of lung cancer (11–14). This is a daunting challenge, as it requires identification of relatively low-abundance proteins in a complex mixture of highly abundant serum proteins, such as albumin (15).
One approach to circumvent the need to detect low-abundance cancer biomarkers is to take advantage of the body's endogenous immune response to the tumor. There is considerable evidence that the immune system produces an autoantibody response to neoplastic cells (16–18). The detection of such autoantibodies has been shown to have diagnostic and prognostic value (16, 17, 19–21). For example, somatic alterations in the p53 gene elicit a humoral response in 30% to 40% of patients affected with various types of cancers (22). Our group has found that autoantibodies are generated against annexin I/II (21) and α-methylacyl-CoA racemase (23) in lung cancer and prostate cancer patients, respectively. Interestingly, a recent study suggests that B cells and their associated antibodies promote de novo carcinogenesis, suggesting that the humoral immune response may play a direct role in cancer progression (24, 25).
Using approaches such as serologic analysis of recombinant cDNA expression libraries of human tumors with autologous serum (SEREX), it has been shown that the humoral immune response of cancer patients can be used to isolate novel tumor antigens (16). Although the SEREX technique is elegant, it relies upon a one-step screening technique without affinity selection steps and requires a large volume of sera to screen phage clones blotted onto membrane filters. This approach has limited clinical utility as patient sera are usually available in small quantities. Furthermore, the SEREX approach is not conducive to the study of sera from hundreds of patients.
Although there are many examples of gene expression profiles that can be used for molecular classification of cancer (3–5, 26–28), a global perspective to analyze and identify autoantibody repertoires in response to tumor antigens has only recently been developed (29–31). In this study, we combine phage display technology with protein microarrays to develop a powerful platform to identify and characterize an autoantibody signature for lung adenocarcinoma patients that can be evaluated, in multiplex, to develop diagnostic biomarkers. In contrast to other strategies, display of peptides on the surface of phage particles is well suited for the enrichment of serum antibody-binding ligands through iterative affinity selection steps (32). This emerging area, termed “cancer immunomics” (14), represents the global analysis of the host humoral immune response to neoplastic transformation.
Materials and Methods
Patient population and samples. This study was approved by the Institutional Review Boards of the University of Michigan Medical School and the University of Pittsburgh. Sera from 150 patients with lung adenocarcinoma were collected at the time of surgery from January 1995 to January 2003 at the University of Michigan Hospital. All the primary tumor sections were evaluated by a study pathologist, and clinical information was collected (Supplementary Tables S1-S3). All patient identifiers were coded to protect confidentiality. As noncancer controls, 100 serum samples from subjects with no known history of cancer were collected from the University of Michigan Clinical Pathology laboratories (Supplementary Table S3). These samples were collected between 2001 and 2004 in three independent collection periods. No patients in this cohort received chemotherapy or radiation treatment before the sera were collected. All sera were stored in aliquots at −80°C until use.
An independent cohort of sera, including 62 lung adenocarcinomas and 60 controls (Supplementary Table S4), matched for both age and smoking status and collected between 2000 and 2005 was provided by the University of Pittsburgh Cancer Institute/Hillman Cancer Center.
Autoantibody profiling. By iterative biopanning of a phage display library derived from lung cancer tissue pools, we constructed phage-peptide microarrays and used them to profile and define an autoantibody signature of lung adenocarcinoma. Details regarding the construction of phage display libraries and construction of the phage-peptide microarrays are described in the supplementary data.
Normalization and analysis of the microarray data. Slides were scanned and quantified using the GenePix 400B scanner (Axon Laboratories, Providence, RI). According to the experimental design, the median of the Cy5-Cy3 ratio was used to control for small variations in the amount of spotted phage epitope. Spots were treated as missing values if the Cy3 signal alone was 50% less than the average value across slides. Each slide was then scaled to have the same median across slides. Clones that had >20% missing values across slides were excluded from further analyses. The entire dataset was quantile normalized (33) and base 2 log transformed. The missing values were then imputed using the Sequential KNN imputation method (34).
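The preprocessing steps above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the "average value across slides" is assumed to be computed per clone, the quantile step assumes a matrix without missing values and ignores ties, and Sequential KNN imputation is omitted.

```python
import math
from statistics import mean, median

def ratios_with_missing(cy5, cy3, low_fraction=0.5):
    """Per-spot Cy5/Cy3 ratio; None where Cy3 is below low_fraction of that
    clone's mean Cy3 across slides (per-clone averaging is an assumption)."""
    n_clones = len(cy3[0])
    clone_mean = [mean(slide[c] for slide in cy3) for c in range(n_clones)]
    return [
        [None if g < low_fraction * clone_mean[c] else r / g
         for c, (r, g) in enumerate(zip(red, green))]
        for red, green in zip(cy5, cy3)
    ]

def scale_to_common_median(slides, target=1.0):
    """Rescale each slide so its median observed ratio equals target."""
    scaled = []
    for slide in slides:
        m = median(v for v in slide if v is not None)
        scaled.append([None if v is None else v * target / m for v in slide])
    return scaled

def keep_clones(slides, max_missing=0.2):
    """Indices of clones with at most max_missing fraction of missing spots."""
    n_slides = len(slides)
    return [c for c in range(len(slides[0]))
            if sum(slide[c] is None for slide in slides) / n_slides <= max_missing]

def quantile_normalize(slides):
    """Give every slide the same empirical distribution (complete data only)."""
    n_clones = len(slides[0])
    sorted_slides = [sorted(slide) for slide in slides]
    reference = [mean(s[k] for s in sorted_slides) for k in range(n_clones)]
    result = []
    for slide in slides:
        order = sorted(range(n_clones), key=lambda c: slide[c])
        normalized = [0.0] * n_clones
        for rank, c in enumerate(order):
            normalized[c] = reference[rank]
        result.append(normalized)
    return result

def log2_transform(slides):
    """Base 2 log of every (positive, non-missing) value."""
    return [[math.log2(v) for v in slide] for slide in slides]
```

Each input is a list of slides, and each slide is a list of per-clone intensities in a fixed clone order. In a full workflow the filtered, normalized matrix would then be passed to an imputation routine.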
Statistical analysis. To determine whether autoantibody signatures can be used for the noninvasive detection of lung adenocarcinoma, we did class prediction using the BRB Array Tools software. A “greedy pairs” method (35) was used to select “informative” feature clones for the predictors. Briefly, all phage-peptide clones were ranked based on their individual t scores on the training set, and the top-ranked clone xi was determined. Then the procedure searched for another clone, xj, that together with xi provided the best discrimination, using the distance between centroids of the two classes as a measure with regard to the two clones when projected to the diagonal linear discriminant axis. These two clones were then removed from the clone set, and the procedure was repeated on the remaining set until the specified number of pairs had been selected. This process was repeated for all training sets created during the leave-one-out cross-validation (LOOCV), and the k-nearest neighbor (k = 3) prediction was used to predict the left-out test sets during LOOCV. We tested the number of pairs from 2 to 20 in a stepwise fashion, and the desired number of pairs was selected to minimize the error rate of LOOCV. After the phage-peptide pairs were determined, we applied the predictor signature to an independent test set.
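The greedy pairs selection and the 3-nearest-neighbor LOOCV described above can be sketched as follows. This is an illustrative approximation, not the BRB Array Tools implementation: the pair score is assumed to be the absolute pooled-variance t statistic of the samples projected onto the diagonal linear discriminant axis, all function names are hypothetical, and every clone is assumed to have nonzero variance within the training data.

```python
import math
from statistics import mean, variance

def pooled_variance(a, b):
    return ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (len(a) + len(b) - 2)

def t_score(a, b):
    """Pooled-variance two-sample t statistic."""
    return (mean(a) - mean(b)) / math.sqrt(pooled_variance(a, b) * (1 / len(a) + 1 / len(b)))

def column(X, rows, c):
    return [X[r][c] for r in rows]

def pair_score(X, pos, neg, i, j):
    """|t| of samples projected onto the diagonal linear discriminant axis of clones i and j."""
    w = [(mean(column(X, pos, c)) - mean(column(X, neg, c)))
         / pooled_variance(column(X, pos, c), column(X, neg, c)) for c in (i, j)]
    project = lambda r: w[0] * X[r][i] + w[1] * X[r][j]
    return abs(t_score([project(r) for r in pos], [project(r) for r in neg]))

def greedy_pairs(X, y, n_pairs):
    """Pick the top clone by |t|, then its best partner; remove both and repeat."""
    pos = [r for r, label in enumerate(y) if label == 1]
    neg = [r for r, label in enumerate(y) if label == 0]
    remaining = list(range(len(X[0])))
    single = {c: abs(t_score(column(X, pos, c), column(X, neg, c))) for c in remaining}
    pairs = []
    while len(pairs) < n_pairs and len(remaining) >= 2:
        i = max(remaining, key=lambda c: single[c])
        remaining.remove(i)
        j = max(remaining, key=lambda c: pair_score(X, pos, neg, i, c))
        remaining.remove(j)
        pairs.append((i, j))
    return pairs

def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    ranked = sorted(
        (math.dist([row[f] for f in features], [x[f] for f in features]), label)
        for row, label in zip(train_X, train_y))
    return 1 if 2 * sum(label for _, label in ranked[:k]) > k else 0

def loocv_error(X, y, n_pairs, k=3):
    """Leave-one-out error; feature selection is repeated inside every fold."""
    errors = 0
    for held in range(len(X)):
        train_X, train_y = X[:held] + X[held + 1:], y[:held] + y[held + 1:]
        features = [c for pair in greedy_pairs(train_X, train_y, n_pairs) for c in pair]
        errors += knn_predict(train_X, train_y, X[held], features, k) != y[held]
    return errors / len(X)
```

Here `X` is a list of samples (rows) by clones (columns) and `y` holds 1 for cancer and 0 for control. Choosing the number of pairs then amounts to evaluating `loocv_error` for each candidate value and keeping the one with the lowest error. Repeating the selection inside each fold keeps the left-out sample from influencing which clones are chosen.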
Supervised clustering analysis was done using Cluster and TreeView. All other statistical analyses were done with R or SPSS 11.5 (SPSS, Inc., Chicago, Illinois). The receiver operating characteristics (ROC) analysis was done to assess the sensitivity and specificity of the autoantibody profile for discriminating lung cancer patient sera from control sera in the test set and for each individual autoantibody. ROC curves have been widely used in clinical research to assess the accuracy of a diagnostic test that yields continuous test results. Briefly, a ROC plot is obtained by calculating the sensitivity and specificity of every test result value and plotting sensitivity against 1 − specificity. A perfect diagnostic test would yield a “curve” that coincides with the left and top sides of the plot, and a test that is completely useless would give a straight line from the bottom-left corner to the top-right corner. As a summary statistic, the area under the ROC curve (AUC) and the associated P values are usually used to assess the performance of a test.
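The construction of a ROC plot and its AUC can be illustrated with a short Python sketch. It is a generic illustration rather than the R or SPSS routines used in the study; the function names are hypothetical, higher scores are assumed to indicate cancer, and the area is computed with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) at every observed threshold, from strictest to loosest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        false_pos = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A test that ranks every case above every control gives an area of 1.0, whereas a test that assigns the same score to all samples yields the diagonal and an area of 0.5.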
Meta-analysis of gene expression of humoral response targets. The gene expression level of ubiquilin 1 was studied using ONCOMINE (36, 37). Briefly, the ubiquilin 1 gene was queried in the database, and the results were filtered by selecting lung adenocarcinoma. The data from study classes of benign versus cancer were used for box plots. P values for each group were calculated using the Student's t test.
Two-dimensional PAGE and Western blot analysis. Analytic two-dimensional PAGE protein quantification was done as previously described (38). In this study, we selected two protein spots that represent the native and phosphorylated forms of ubiquilin 1 on two-dimensional PAGE gels for further analysis. Protein separation and two-dimensional Western blotting were done as described previously (7). Individual membranes were incubated with mouse antihuman UBQLN1 antibody (Zymed Laboratories, Inc., Carlsbad, CA) at 1 μg/mL concentration. After additional washes, membranes were incubated with a secondary antibody conjugated to horseradish peroxidase (Amersham, Piscataway, NJ) at a 1:5,000 dilution for 1 h, then washed, and incubated for 1 min with enhanced chemiluminescence detection system (Amersham) and autoradiography.
Results and Discussion
Construction and analysis of the T7 phage-peptide microarrays. A schematic overview of our approach to identify autoantibody signatures of lung cancer is shown in Supplementary Fig. S1 and in a previous study (30). To develop a T7 phage display library for lung cancer, we isolated and then pooled total RNA from seven lung cancer tissues, each of which was composed of at least 80% tumor cells (Supplementary Table S1). Once packaged into the T7 phage system, peptides from the library were expressed as a fusion protein with the capsid 10B protein on the surface of the phage. This fusion protein serves as “bait” to capture autoantibodies present in serum. To enrich for T7 phage-peptides recognized by cancer or noncancer control sera, we did separate biopanning selections using 10 lung cancer and 3 noncancer control sera (Supplementary Table S2). Protein A/G beads, bound with antibodies from sera, were used to isolate phage-peptide particles that could bind these serum antibodies. The bound phage were eluted and amplified in bacteria, thus completing one round of biopanning (Supplementary Fig. S1). After four rounds of biopanning, phage particles expressing peptides that specifically elicit a humoral immune response in lung cancer patients or controls were enriched. A total of 2,304 phage-peptide clones were randomly selected from the biopanned phage libraries to generate phage-peptide microarrays. Once in a microarray format, these enriched phage-peptide clones can be used to interrogate serum samples for humoral immune response markers.
Using this 2.3K phage-peptide microarray, we evaluated sera from 150 lung adenocarcinoma patients and 100 noncancer control subjects (Supplementary Table S3). A two-color system was used, in which a green fluorescent dye (Cy3) was used to measure levels of the capsid 10B fusion protein spotted as a control and a red fluorescent dye (Cy5) was used to measure levels of bound IgG (Supplementary Fig. S1). Therefore, higher Cy5-Cy3 ratios represented higher levels of immune reactivity. Interestingly, most of the sera from lung adenocarcinoma patients exhibited antibody repertoires that displayed distinct reactivity relative to controls. Representative images of phage-peptide microarrays incubated with serum are depicted in Supplementary Fig. S2. The correlation coefficients of 20 replicate experiments ranged between 0.78 and 0.96, suggesting excellent reproducibility (Supplementary Fig. S3). After data normalization and imputation of missing values (see Materials and Methods for details), 2,304 clones were used for subsequent statistical analyses.
Autoantibody profiles for the diagnosis of lung adenocarcinoma. We next determined whether autoantibody signatures can be used for the noninvasive detection of lung adenocarcinoma. First, we divided the 250 lung cancer patients and noncancer controls into a training set and a validation set with equal numbers of samples (75 cancer sera and 50 control sera in each set). Cases and controls were separately matched based on age and sex; the training and validation samples were generated by randomly assigning one sample from each pair to each set (Supplementary Table S3). In the training set, the greedy pairs method (35) was adopted to select informative autoantibodies, and k-nearest neighbor analysis (k = 3) was used to build a class prediction model. We tested different numbers of autoantibody pairs, ranging from 2 to 20, in a stepwise fashion and observed that the top-ranked 22 autoantibodies (or 11 autoantibody pairs) had the best classification accuracy (85.6%, 107 of 125) in the training set according to LOOCV, with a sensitivity of 82.7% (62 of 75) and a specificity of 90.0% (45 of 50; Fig. 1A; Table 1). These 22 autoantibodies were then used as a class predictor on an independent validation set, resulting in 85.3% (64 of 75) sensitivity and 86.0% (43 of 50) specificity (Fig. 1B; Table 1).
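As a worked check of how the validation-set percentages follow from the reported counts (64 of 75 cancer sera and 43 of 50 control sera classified correctly), the arithmetic can be written out in Python. The helper name is hypothetical, and the overall validation accuracy printed by the usage line is derived here from those counts rather than quoted from the text.

```python
def classification_summary(true_pos, n_cases, true_neg, n_controls):
    """Sensitivity, specificity, and overall accuracy as percentages (one decimal)."""
    sensitivity = 100 * true_pos / n_cases
    specificity = 100 * true_neg / n_controls
    accuracy = 100 * (true_pos + true_neg) / (n_cases + n_controls)
    return round(sensitivity, 1), round(specificity, 1), round(accuracy, 1)

# Validation set: 64 of 75 cancer sera and 43 of 50 control sera correct.
print(classification_summary(64, 75, 43, 50))  # (85.3, 86.0, 85.6)
```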
To evaluate the performance of this 22-autoantibody signature on a continuous scale, we next used a compound covariate predictor approach to create an index score for each validation sample, as described previously (39, 40). Each sample's value for each of those 22 autoantibodies was multiplied by the corresponding coefficient derived from univariate logistic regressions on the training set with cancer/control as a binary response variable, and then the values were totaled. The created index scores were then assessed by the ROC curve, which provided a pure index of a test's accuracy by plotting the sensitivity against 1 − specificity for each result value of the test. Notably, the ROC analysis yielded an AUC of 0.92 [P < 0.0001, 95% confidence interval (95% CI), 0.88–0.97] for the validation set (Fig. 1C), demonstrating the strong discriminative power of this 22-autoantibody signature.
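The index score described above is a weighted sum: each autoantibody value is multiplied by its training-set logistic regression coefficient and the products are added. A minimal Python sketch follows. It is an assumption-laden illustration, not the study's code: the function names are hypothetical, the univariate logistic fit uses a plain Newton-Raphson update with a fixed number of iterations, and it assumes the two classes overlap for each autoantibody (a perfectly separating feature has no finite maximum-likelihood slope).

```python
import math

def univariate_logistic_slope(x, y, iterations=25):
    """Fit logit P(y = 1) = a + b * x by Newton-Raphson and return the slope b."""
    a = b = 0.0
    for _ in range(iterations):
        p = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        w = [pi * (1 - pi) for pi in p]
        # Gradient of the log-likelihood with respect to (a, b)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the 2 x 2 information matrix
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

def fit_coefficients(train_X, train_y):
    """One univariate slope per autoantibody (column) of the training matrix."""
    n_features = len(train_X[0])
    return [univariate_logistic_slope([row[c] for row in train_X], train_y)
            for c in range(n_features)]

def index_score(sample, coefficients):
    """Compound covariate index: sum of coefficient times value over all autoantibodies."""
    return sum(c * v for c, v in zip(coefficients, sample))
```

The coefficients are fit once on the training samples and then applied unchanged to each validation sample, so the resulting scores can be ranked and passed to a ROC analysis.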
Humoral immune response targets and identification of ubiquilin 1. The phage-peptide microarray strategy facilitated identification of autoantibody targets by sequencing the respective phage cDNA clone. Supplementary Table S5 lists the identity of the peptide sequences of the 22 diagnosis-related humoral immune response targets that were represented in Fig. 1. Of these 22 diagnosis-related targets, peptides encoding ubiquilin 1 were found in nine independent phage-peptide cDNA clones based on the top 100 lung adenocarcinoma–associated phage-peptide sequences (data not shown). Seven immunoreactive phage-peptide clones of ubiquilin 1 spanned 112 amino acids (aa), from aa478 to aa589, and two clones spanned 125aa, from aa465 to aa589 (Fig. 2A). Autoantibodies against both peptide stretches of ubiquilin 1 were elevated in lung adenocarcinoma patients relative to control subjects (P < 0.0001; Fig. 2B). For lung cancer diagnosis, a single autoantibody against the phage-peptide clone encoding 112aa or 125aa of ubiquilin 1 exhibited AUCs of 0.84 (95% CI, 0.78–0.89) and 0.71 (95% CI, 0.65–0.77), respectively (Fig. 2C).
Importantly, we then examined, using a more focused phage array with 1,129 clones, an independent, but clinically and demographically similar, case-control cohort of sera from the University of Pittsburgh. These included 62 lung adenocarcinomas and 60 controls (Supplementary Table S4) matched for both age and smoking status. The autoantibodies for both peptide clones of ubiquilin 1 (112aa and 125aa) were also significantly higher in case sera compared with controls (P < 0.0001; Supplementary Fig. S4), exhibiting AUCs of 0.79 (95% CI, 0.71–0.87) and 0.74 (95% CI, 0.65–0.83), respectively (Fig. 2D).
Ubiquilin 1, also called PLIC, contains a ubiquitin-like domain (UBL) in the NH2 terminus and a ubiquitin-associated domain (UBA) in the C-terminal region, which are essential for its ability to inhibit the degradation of several ubiquitin-dependent proteasome substrates, including p53, IκB, and γ-aminobutyric acid (A) receptor (41, 42). Ubiquilin 1 is also involved in the proteasome-mediated degradation of various proteins, including presenilins, cyclin A, hepatitis C virus RNA–dependent RNA polymerase, and amyloid precursor proteins (43, 44). In addition, it has been suggested that splice variants of the ubiquilin 1 gene are associated with an increased risk of developing Alzheimer's disease (45).
Ubiquilin 1 mRNA and protein are increased in lung tumors. An independent gene expression profiling study of lung cancer showed that the mRNA for ubiquilin 1 was significantly increased in lung adenocarcinomas relative to normal lung (ref. 4; Fig. 3A). To assess ubiquilin 1 protein levels, we did a Western blot analysis using a ubiquilin 1–specific antibody and nine pairs of lung tumor and associated normal lung tissue. Ubiquilin 1 protein levels were significantly higher in lung cancer compared with normal lung tissues (Fig. 3B, C). Using the same antibody with two-dimensional Western blot analysis of lung adenocarcinoma tissues, we detected two isoforms (1, a native isoform; 2, phosphorylated isoform) of the ubiquilin 1 protein (Fig. 3D). These two spots were matched to a compendium of two-dimensional PAGE gels (8) and quantified, showing that the unphosphorylated form was more abundant in the 93 lung adenocarcinomas than in the 10 normal lung tissues. Interestingly, the phosphorylated form of ubiquilin 1 was decreased in tumors, and an additional phosphorylated isoform (3, second phosphorylated isoform) of ubiquilin 1 was present exclusively in normal lung (Fig. 3D). To assess the cellular localization and expression of ubiquilin 1 antigen in situ, lung adenocarcinoma and normal lung tissues were examined using immunofluorescence (Fig. 3E) and immunohistochemical analysis. Using both experimental approaches, strong cytoplasmic staining of ubiquilin 1 was observed in lung adenocarcinomas, and weak cytoplasmic staining was found in type 1 and type 2 epithelial cells, as well as macrophages, in normal lung tissues (Supplementary Fig. S5).
In addition to ubiquilin 1, we also found two independent, overlapping clones of heat shock protein 70 in the top 100 lung adenocarcinoma–associated phage-peptide sequences (data not shown). Autoantibodies against heat shock protein 70 were not selected among the 22 diagnosis-related autoantibody targets based on the supervised models we used (listed in Supplementary Table S5). Importantly, however, both phage-peptide clones encoding heat shock protein 70 showed identical increased immune response patterns in lung cancer relative to controls, with an AUC of 0.75 (Supplementary Fig. S6). This protein has been previously reported to elicit an immune response in lung cancer patients (46, 47), thus providing support for our experimental strategy.
Most of the phage peptides identified in Supplementary Table S5 were either in untranslated regions of expressed genes or out of frame in the coding sequence of known genes. These peptides may be weakly homologous to known proteins or may have no distinct homology to the primary sequences of known proteins and thus may be “mimotopes” (i.e., stretches of amino acids that mimic an antigen but are not homologous at the sequence level; ref. 30).
In the present study, we describe a robust approach combining phage display with protein microarrays to detect lung cancer based on the endogenous humoral immune response signature. As this approach relies on a multiplex set of markers, it may be less likely to suffer from the drawbacks of monitoring any single biomarker (48). Our study has led to the detection of a number of novel peptide targets that elicit a humoral immune response in lung cancer patients. Interestingly, several of the peptides identified represent known proteins, including ubiquilin 1 and heat shock protein 70. The potential role of these proteins in regulating tumor development and progression warrants further investigation. Ubiquilin 1 dysregulation in lung cancer is especially interesting, as this protein plays a role in the ubiquitination pathway, which has been implicated in various cancer progression models (41, 49, 50).
In summary, our studies suggest that autoantibody signatures of lung cancer may have utility for the screening and early diagnosis of lung cancer, given the >80% sensitivity and specificity of the assay. As lung cancer lacks an accepted biomarker for screening, such as PSA for prostate cancer, this approach has the potential to have an effect clinically, as well as in the screening of high-risk populations. Unlike gene expression studies of tumor tissues, autoantibody profiling is done in serum, which can be obtained much less invasively and is easily monitored over time. Likewise, whereas there has been intensive activity in the use of proteomic approaches to identify biomarkers in sera (15), monitoring the immune response takes advantage of the inherent biological amplification provided by autoantibodies, which can be more easily detected than low-abundance proteins in a complex biological milieu, such as serum. Whereas our study suggests that the humoral immune response may be useful in the diagnosis and classification of tumors, it will also be important to investigate the role of these autoantibodies in promoting tumor development, especially in light of the growing evidence linking inflammation and cancer (24, 25).
Grant support: Burroughs Wellcome Foundation Clinical Translational Research award (A.M. Chinnaiyan); National Cancer Institute (NCI) Early Detection Research Network (EDRN) Biomarker Developmental Laboratory grants UO1 CA111275 (A.M. Chinnaiyan, D. Giacherio), EDRN 051717 (D.G. Beer), and R01GM72007-01 (D. Ghosh), Department of Defense Post-Doctoral Training grant W81XWH-04-1-0886 (X. Wang), Department of Defense grant PC060266 (J. Yu), and NCI Specialized Programs of Research Excellence in Lung Cancer grant P50 CA90440 (T. El-Hefnawy, W.L. Bigbee) at the University of Pittsburgh Cancer Institute/Hillman Cancer Center (Jill M. Siegfried, Ph.D., PI); and Cancer Center Bioinformatics Core grant 5 P30 CA46592.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
G. Chen, X. Wang, and J. Yu contributed equally to this work.
- Received December 5, 2006.
- Revision received January 11, 2007.
- Accepted January 26, 2007.
- ©2007 American Association for Cancer Research.