[Cancer Research 65, 3081-3091, April 15, 2005]
© 2005 American Association for Cancer Research


Molecular Biology, Pathobiology, and Genetics

Evidence for the Presence of Disease-Perturbed Networks in Prostate Cancer Cells by Genomic and Proteomic Analyses: A Systems Approach to Disease

Biaoyang Lin^1, James T. White^1, Wei Lu^1, Tao Xie^1, Angelita G. Utleg^1, Xiaowei Yan^1, Eugene C. Yi^1, Paul Shannon^1, Irina Khrebtukova^3, Paul H. Lange^2, David R. Goodlett^1, Daixing Zhou^3, Thomas J. Vasicek^3 and Leroy Hood^1

^1 Institute for Systems Biology; ^2 Department of Urology, University of Washington, Seattle, Washington; and ^3 Lynx Therapeutics, Inc., Hayward, California

Requests for reprints: Biaoyang Lin, Institute for Systems Biology, 1441 North 34th Street, Seattle, WA 98103. Phone: 206-732-1297; Fax: 206-732-1299; E-mail: blin@systemsbiology.org.


    Abstract
Prostate cancer is initially responsive to androgen ablation therapy and progresses to androgen-unresponsive states that are refractory to treatment. The mechanism of this transition is unknown. A systems approach to disease begins with the quantitative delineation of the informational elements (mRNAs and proteins) in various disease states. We employed two recently developed high-throughput technologies, massively parallel signature sequencing (MPSS) and isotope-coded affinity tag, to gain a comprehensive picture of the changes in mRNA levels and more restricted analysis of protein levels, respectively, during the transition from androgen-dependent LNCaP (model for early-stage prostate cancer) to androgen-independent CL1 cells (model for late-stage prostate cancer). We sequenced >5 million MPSS signatures, obtained >142,000 tandem mass spectra, and built comprehensive MPSS and proteomic databases. The integrated mRNA and protein expression data revealed underlying functional differences between androgen-dependent and androgen-independent prostate cancer cells. The high sensitivity of MPSS enabled us to identify virtually all of the expressed transcripts and to quantify the changes in gene expression between these two cell states, including functionally important low-abundance mRNAs, such as those encoding transcription factors and signal transduction molecules. These data enable us to map the differences onto extant physiologic networks, creating perturbation networks that reflect prostate cancer progression. We found 37 BioCarta and 14 Kyoto Encyclopedia of Genes and Genomes pathways that are up-regulated and 23 BioCarta and 22 Kyoto Encyclopedia of Genes and Genomes pathways that are down-regulated in LNCaP cells versus CL1 cells. Our efforts represent a significant step toward a systems approach to understanding prostate cancer progression.


    Introduction
Prostate cancer is the most common nondermatologic cancer in the United States (1). Initially, its growth is androgen dependent; early-stage therapies, including chemical and surgical castration, kill cancerous cells by androgen deprivation. Although such therapies produce tumor regression, they eventually fail because most prostate carcinomas become androgen independent (2). To improve the efficacy of prostate cancer therapy, it is necessary to understand the molecular mechanisms underlying the transition from androgen dependence to androgen independence.

The transition from androgen-dependent to androgen-independent status likely results from multiple processes, including activation of oncogenes, inactivation of tumor suppressor genes, and changes in key components of signal transduction pathways and gene regulatory networks. Systems approaches to biology and disease are predicated on the identification of the elements of the systems, the delineation of their interactions, and their changes in distinct disease states. Biological information is of two types: the digital information of the genome (e.g., genes and cis-control elements) and environmental cues. Normal protein and gene regulatory networks may be perturbed by disease, through genetic and/or environmental perturbations, and understanding these differences lies at the heart of systems approaches to disease. Disease-perturbed networks initiate altered responses that bring about pathologic phenotypes, such as the invasiveness of cancer cells.

To map network perturbations in cancer initiation and progression, one must measure changes in expression levels of virtually all transcripts. Certain low-abundance transcripts, such as those encoding transcription factors and signal transducers, wield significant regulatory influence even though they may be present in the cell at very low copy numbers. Differential display (3) and cDNA microarrays (4, 5) have been used to profile changes in gene expression during the androgen-dependent to androgen-independent transition; however, those technologies can identify only a limited number of the more abundant mRNAs, and they miss many low-abundance mRNAs because of their low detection sensitivities. Massively parallel signature sequencing (MPSS), a recently introduced method, allows 20-nucleotide signature sequences to be determined in parallel for >1,000,000 DNA sequences from an individual cDNA library or cell state (6). The frequency of each MPSS signature is calculated for each sample and expressed in transcripts per million (tpm). MPSS technology allows identification and cataloging of almost all mRNAs, even those with one or a few transcripts per cell. Differentially expressed genes thus identified can be mapped onto cellular networks to provide a systemic understanding of changes in cellular state.

Although transcriptome (mRNA levels) differences are easier to study than proteome (protein levels) differences, cellular functions are usually performed by proteins. RNA expression profiling studies do not address how the encoded proteins function biologically, and transcript abundance levels do not always correlate with protein abundance levels (7). We therefore complemented our mRNA expression profiling with a more limited protein profiling by using isotope-coded affinity tags (ICAT) coupled with tandem mass spectrometry (MS/MS; ref. 8).

The LNCaP cell line is a widely used androgen-sensitive model for early-stage prostate cancer from which androgen-independent sublines have been generated (4, 5, 9). The cells of one such variant, CL1, in contrast to their LNCaP progenitors, are highly tumorigenic and exhibit invasive and metastatic characteristics in intact and castrated mice (9, 10). Thus, CL1 cells model late-stage prostate cancer. MPSS and ICAT data extracted from these model cell lines can be validated by real-time reverse transcription-PCR (RT-PCR) or Western blot analysis in more relevant biological models (tumor xenografts) and in tumor biopsies.

We conducted an MPSS analysis of ~5 million signatures for the androgen-dependent LNCaP cell line and its androgen-independent derivative CL1. Our database offers the first comprehensive view of the digital transcriptomes of two states of prostate cancer cells and allows us to explore the cellular pathways perturbed during the transition from androgen-dependent to androgen-independent growth. We additionally compared protein expression profiles between LNCaP and CL1 cells using ICAT-MS/MS technology. These are the first steps toward a systems approach to disease through an integrative, systemic understanding of prostate cancer progression at the mRNA, protein, and network levels.


    Materials and Methods
Massively parallel signature sequencing analysis. LNCaP and CL1 cells were grown as described by Tso et al. (10). MPSS cDNA libraries were constructed, and individual cDNA sequences were amplified, attached to individual beads, and sequenced as described elsewhere (6). The resulting signatures, generally 20 bases long, were annotated using the most recent human genome annotation then available (Human Genome Release hg16, released in November 2003) and the human Unigene database (Unigene Build 171, released in July 2004) according to a previously published method (11). We considered only 100% matches between an MPSS signature and a genome signature. We also excluded signatures that were expressed at <3 tpm in both the LNCaP and CL1 libraries, as they might not be reliably detected (12). Additionally, we classified cDNA signatures by their positions relative to polyadenylation signals and polyadenylic acid [poly(A)] tails and by their orientation relative to the 5'-3' orientation of the source mRNA. The Z-test (13, 14) was used to calculate P values for comparing gene expression levels between the cell lines.
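The quantities in this paragraph can be illustrated with a minimal sketch. It assumes a pooled two-proportion Z-test on raw signature counts; the exact form of the Z-test in refs. 13 and 14 is not given here, so this is an illustration rather than the authors' procedure. The function names and the example counts are hypothetical.

```python
import math

def tpm(count, library_size):
    """Scale a raw signature count to transcripts per million (tpm)."""
    return count * 1_000_000 / library_size

def passes_abundance_filter(tpm_a, tpm_b, minimum=3.0):
    """Keep a signature unless it is below `minimum` tpm in both libraries."""
    return tpm_a >= minimum or tpm_b >= minimum

def two_proportion_z(count_a, size_a, count_b, size_b):
    """Pooled two-proportion Z statistic and two-sided P value.

    ASSUMPTION: a textbook pooled test, not necessarily the published one.
    """
    p_a = count_a / size_a
    p_b = count_b / size_b
    pooled = (count_a + count_b) / (size_a + size_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    if se == 0:
        return 0.0, 1.0  # no signatures observed in either library
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical counts in libraries of 2.22 and 2.96 million signatures.
z, p = two_proportion_z(200, 2_220_000, 100, 2_960_000)
print(f"z = {z:.2f}, significant at P < 0.001: {p < 0.001}")
```

Working on counts rather than tpm keeps the library sizes in the standard error, which is why the same fold change can be significant for an abundant transcript and not for a rare one.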

Isotope-coded affinity tag analysis. ICAT reagents were purchased from Applied Biosystems, Inc. (Foster City, CA). Cell fractionation into cytosolic, microsomal, and nuclear fractions, ICAT labeling, MS/MS, and data analyses were done as described by Han et al. (15). In addition, probability score analysis (16) and Automated Statistical Analysis on Protein Ratio (17) were used to assess the quality of MS spectra and to calculate protein ratios from multiple peptide ratios. Descriptions of these software tools are available at http://regis.systemsbiology.net/software. To compare protein and mRNA expression levels, the Unigene numbers of the differentially expressed proteins were used to find the corresponding MPSS signatures and their expression levels in tpm. If one Unigene entry had more than one MPSS signature, likely because of alternative transcript termination, the average tpm of all its signatures was used.
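Collapsing several signatures to one value per Unigene entry can be sketched as follows. The mean is used for the protein comparison described above; the pathway analysis described later "combines" the counts, which this sketch assumes to mean summing. The function name and the Unigene IDs are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate_tpm(signatures, how="mean"):
    """Collapse (unigene_id, tpm) pairs to one tpm value per Unigene ID.

    how="mean": average over signatures (protein/mRNA comparison).
    how="sum":  add signatures together (ASSUMED meaning of "combined").
    """
    grouped = defaultdict(list)
    for unigene_id, value in signatures:
        grouped[unigene_id].append(value)
    if how == "mean":
        return {uid: mean(vals) for uid, vals in grouped.items()}
    if how == "sum":
        return {uid: sum(vals) for uid, vals in grouped.items()}
    raise ValueError(f"unknown aggregation: {how}")

# Hypothetical signatures: two alternatively terminated isoforms of one gene.
rows = [("Hs.1", 10.0), ("Hs.1", 30.0), ("Hs.2", 5.0)]
print(aggregate_tpm(rows, "mean"))
print(aggregate_tpm(rows, "sum"))
```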

Real-time reverse transcription-PCR. All primers were designed with the PRIMER3 program (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) and BLAST searched against the human cDNA and expressed sequence tag (EST) database for uniqueness. Primer sequences and PCR conditions are available on request. Real-time PCR was done on an ABI 7700 machine (Applied Biosystems), and SYBR Green dye (Molecular Probes, Inc., Eugene, OR) was used as a reporter. PCR conditions were designed to give bands of the expected size with minimal primer dimer bands.

Identification of perturbed networks. Genes in the 314 BioCarta and 155 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways or networks (http://cgap.nci.nih.gov/Pathways/) were downloaded and compared with the MPSS data using Unigene IDs as identifiers. If a Unigene ID or an Enzyme Commission (EC) number corresponded to multiple signatures, potentially because of multiple alternatively terminated isoforms, the tpm counts of the isoforms were combined and then subjected to the Z-test (13, 14). Genes with P values of ≤0.001 were considered to be significantly differentially expressed. The following criteria were used to identify perturbed networks: a perturbed network must have more than three genes represented on our differentially expressed gene list (P < 0.001), and at least 50% of those genes must be up-regulated (an up-regulated pathway) or down-regulated (a down-regulated pathway).
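The two criteria can be written as a small classifier. This is a sketch under stated assumptions: the function name and return labels are hypothetical, and the handling of an exact 50/50 split (which satisfies both conditions as worded) is our own choice, since the text does not say how such ties were resolved.

```python
def classify_pathway(directions, min_genes=4, min_fraction=0.5):
    """Classify one pathway from its significantly changed genes.

    directions: list of "up" or "down", one entry per pathway gene that is
    on the differentially expressed list.
    """
    if any(d not in ("up", "down") for d in directions):
        raise ValueError("directions must contain only 'up' or 'down'")
    n = len(directions)
    if n < min_genes:  # "more than three genes" means at least four
        return "not perturbed"
    up_fraction = directions.count("up") / n
    down_fraction = 1.0 - up_fraction
    if up_fraction >= min_fraction and down_fraction >= min_fraction:
        return "ambiguous"  # ASSUMPTION: exact ties are reported separately
    if up_fraction >= min_fraction:
        return "up-regulated"
    return "down-regulated"

print(classify_pathway(["up", "up", "up", "down"]))
print(classify_pathway(["up", "up", "down"]))
```

Because the two fractions sum to one, any pathway with four or more changed genes meets at least one of the 50% conditions, so the gene-count requirement is the criterion that actually excludes pathways.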

Prediction of secreted proteins. Proteins with signal peptides (classic secretory proteins) were predicted using the criteria described by Chen et al. (18) with the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP-3.0/) and the TMHMM2.0 server. Putative nonclassically secreted proteins (without signal peptides) were predicted with the SecretomeP 1.0 server (http://www.cbs.dtu.dk/services/SecretomeP-1.0/) and were required to have an odds ratio score of >3.0.


    Results
Massively parallel signature sequencing analyses of the androgen-dependent LNCaP cell line and its androgen-independent variant CL1. Using MPSS technology, we sequenced 2.22 million signature sequences for LNCaP cells and 2.96 million for CL1 cells. We identified a total of 19,595 unique transcript signatures expressed at levels >3 tpm in at least one of the samples. The signatures were classified into three major categories: 1,093 signatures matched repeat sequences, 15,541 signatures matched unique cDNAs or ESTs, and 2,961 signatures had no matches to any cDNA or EST sequences (but did match genomic sequences). Signatures in the last category may represent new transcripts yet to be defined, polymorphisms in cDNA sequences (a match of an MPSS sequence to cDNA or EST sequences requires 100% sequence identity), or errors in the MPSS reads. Transcript tags with matches to a cDNA or EST sequence were further classified based on the signatures' relative orientation to transcription direction and their position relative to a polyadenylation site and/or poly(A) tail. We also built a searchable MySQL database (http://www.mysql.com) containing the expression levels (tpm), the genomic locations of the MPSS sequences, the cDNAs or EST matches, and the classification of each signature. A detailed description of the schema for classification is available in Supplementary Table S1. A snapshot of a representative data query is shown in Supplementary Fig. S1.

We first restricted our analysis to those MPSS signatures corresponding to cDNAs with poly(A) tails and/or polyadenylation sites, so that corresponding genes could be conclusively identified. We used the Z-test (13, 14) to compare differential gene expression between LNCaP and CL1 cells. Using a stringent P value cutoff (<0.001), we identified 2,088 MPSS signatures (corresponding to 1,987 unique genes, as some genes have two or more MPSS signatures due to alternative uses of polyadenylation sites) with significant differential expression. Of these, 1,011 signatures (965 genes) were overexpressed in CL1 cells and 1,077 signatures (1,022 genes) were overexpressed in LNCaP cells (Supplementary Table S2). The Z-score depends on mRNA abundance in the library. For example, at a cutoff of P < 0.001 in our data set, the expression level changed from 0 to 26 tpm for the least abundant significant transcript (>26-fold) but from 7,591 to 11,206 tpm for the most abundant one (1.48-fold).

We randomly selected nine genes from the 1,987 differentially expressed genes identified by our MPSS analysis and compared their changes in expression levels with those obtained by quantitative real-time RT-PCR techniques. We showed that the expression levels of these nine genes changed in the same direction (Table 1). The MPSS expression profiling data were also consistent with the available published data. For example, using RT-PCR, Patel et al. (9) showed that CL1 tumors express barely detectable prostate-specific antigen (PSA) and androgen receptor mRNAs compared with LNCaP cells. Our MPSS results indicated that LNCaP cells expressed 584 tpm of androgen receptor and 841 tpm of PSA; CL1 cells did not express either androgen receptor or PSA (0 tpm in both cases). Freedland et al. found that CD10 expression was lost in CL1 cells compared with LNCaP cells (19); likewise, we found that CD10 was expressed at 0 tpm in CL1 cells but at 56 tpm in LNCaP cells. Using cDNA microarrays, Vaarala et al. (4) compared LNCaP cells and another androgen-independent variant, non-PSA-producing LNCaP line, which is similar to CL1, and identified a total of 56 differentially expressed genes. We found that the expression levels of these 56 genes changed in the same direction (concordant) between LNCaP and CL1 cells and between LNCaP and non-PSA-producing LNCaP cells (data not shown). This identification of 1,987 versus 56 differentially expressed genes, respectively, underscores the striking differences in sensitivity between MPSS and cDNA microarray techniques.


Table 1. Comparison of MPSS and real-time RT-PCR results

 
To compare the sensitivity of the MPSS and cDNA microarray procedures, we hybridized cDNA microarrays containing 40,000 human cDNAs to the same LNCaP and CL1 RNAs that we used for MPSS. Three replicate array hybridizations were done. MPSS signatures and array clone IDs were mapped to Unigene IDs for data extraction and comparisons. We found that only those genes expressed at >40 tpm by MPSS could be reliably detected as changing levels by cDNA microarray hybridizations [judged by an expression level twice the SD of the background, a standard cutoff value for microarray data analysis (data not shown)]. This observation is consistent with the 33 to 60 tpm sensitivity of microarrays estimated from the experiment of Hill et al. (20), in which known concentrations of synthetic transcripts were added. In LNCaP and CL1 cells, ~68.75% (13,471 of 19,595) of MPSS signatures (>3 tpm) were expressed at a level below 40 tpm; changes in the levels of these genes will be missed by microarray methods. Many attempts have been made to increase the sensitivity of DNA array technology (21, 22). We have not compared these new improvements against MPSS, but it is clear that there will still be significant differences in the levels of change that can be detected.
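The share of signatures that fall below the microarray detection limit is a simple proportion. The sketch below reproduces the reported figure from the two counts given above and shows how the same fraction could be computed from a list of tpm values; the function name and the example values are hypothetical.

```python
def fraction_below(tpm_values, threshold=40.0):
    """Fraction of signatures whose tpm is below a detection threshold."""
    values = list(tpm_values)
    if not values:
        raise ValueError("no tpm values supplied")
    return sum(1 for v in values if v < threshold) / len(values)

# Check of the reported percentage from the counts in the text.
print(round(13_471 / 19_595 * 100, 2))  # → 68.75

# Hypothetical tpm values for five signatures.
print(fraction_below([5.0, 12.0, 39.9, 40.0, 850.0]))
```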

Serial analysis of gene expression (SAGE; ref. 23) is another technology for gene expression profiling; like MPSS, it is digital and can generate a large number of signature sequences. However, MPSS (~1 million signatures per sample) can achieve a much deeper coverage than SAGE (typically ~10,000-100,000 signatures sequenced per sample) at reasonable cost. We compared our MPSS data on LNCaP cells against publicly available SAGE data on LNCaP cells (National Center for Biotechnology Information SAGE database) through common Unigene IDs. The SAGE library GSM724 (total SAGE tags sequenced: 22,721; ref. 24) is derived from LNCaP cells with an inactivated PTEN gene; it is the SAGE library most similar to our LNCaP cells. Only 400 (~20%) of our 1,987 significantly differentially expressed genes (P < 0.001) had any SAGE tag entry in GSM724. These data illustrate the importance of deep sequence coverage in identifying state changes in transcripts expressed at low-abundance levels.

Functional classifications of genes differentially expressed between LNCaP and CL1 cells. Examination of the Gene Ontology classification of our 1,987 genes revealed that multiple cellular processes have changed during the transition from LNCaP to CL1 cells. The complete list, including Gene Ontology annotations, is shown in Supplementary Table S2. The most interesting groups, categorized by function, are shown in Table 2.


Table 2. Examples of differentially expressed genes and their functional classifications

 
Nineteen differentially expressed proteins are related to apoptosis. Twelve of these are up-regulated in CL1 cells, including the apoptosis inhibitors human T-cell leukemia virus type I binding protein 1 and CASP8 and FADD-like apoptosis regulator. Seven are down-regulated in CL1, including programmed cell death 8 and 5 (apoptosis-inducing factors) and BCL2-like 13 (an apoptosis facilitator). Because CL1 cells have increased expression of apoptosis inhibitors and decreased expression of apoptosis inducers, net inhibition of apoptosis may contribute to their greater tumorigenicity. Matrix metalloproteinases (MMP), which degrade extracellular matrix components that physically impede cell migration, are implicated in tumor cell growth, invasion, and metastasis. We found that MMPs 1, 2, 10, and 13 are significantly overexpressed in CL1 cells (Table 2), which may partially explain these cells' aggressive and metastatic behavior.

CD markers are generally localized at the cell surface; some may be associated with prostate cancer (25). We converted all currently identified CD markers (CD1-CD247) from the PROW CD index database (http://www.ncbi.nlm.nih.gov/prow/guide/45277084.htm) to Unigene numbers and used these numbers to identify their signatures and their expression levels. We identified 15 CD markers that are differentially expressed between LNCaP and CL1 cells (Z-test P < 0.001; Table 2). Eleven CD markers, including CD213a1 and CD213a2, which encode interleukin (IL)-13 receptors α1 and α2, are up-regulated in CL1 cells; three CD markers, CD9, CD10, and CD107, are down-regulated in these cells (Table 2). Six CD markers went from 0 or 1 to >35 tpm (Table 2), making them good digital or absolute markers or therapeutic targets. These data suggest that carefully selected CD markers may be useful in following the progression of prostate cancer and could serve as potential targets for antibody-mediated therapies (25). Additional functional categories can be seen in Supplementary Table S2.

Delineation of disease-perturbed networks in prostate cancer cells. Genes and proteins rarely act alone but rather generally operate in networks of interactions. Identifying key nodes (proteins) in the disease-perturbed networks may provide insights into effective drug targets. Comparing the genes (proteins) currently available in the 314 BioCarta and 155 KEGG pathway or network (http://cgap.nci.nih.gov/Pathways/) databases with the MPSS data through Unigene IDs, we identified 37 BioCarta and 14 KEGG pathways that are up-regulated and 23 BioCarta and 22 KEGG pathways that are down-regulated in LNCaP cells versus CL1 cells (Table 3). The number of genes whose expression patterns changed in each pathway is listed in Table 3. Each gene along with its expression level in LNCaP and CL1 cells is listed pathway by pathway in our database (ftp://ftp.systemsbiology.net/pub/blin/mpss). Changes in these pathways reveal the underlying phenotypic differences between LNCaP and CL1 cells. For example, multiple networks involved in modulating cell mobility, adhesion, and spreading are up-regulated in CL1 cells, which are more metastatic and invasive than LNCaP cells (Table 3). In the uCalpain and friends in cell spread pathway, calpains are calcium-dependent thiol proteases implicated in cytoskeletal rearrangements and cell migration. During cell migration, calpain cleaves target proteins, such as talin, ezrin, and paxillin, at the leading edge of the membrane while at the same time cleaving the cytoplasmic tails of the integrins β1(a) and β3(b) to release adhesion attachments at the trailing membrane edge. Increased activity of calpains increases migration rates and facilitates cell invasiveness (26).


Table 3. Pathways that are up-regulated or down-regulated comparing LNCaP cells to CL1 cells

 
Many pathways we identified as perturbed in the LNCaP and CL1 comparison are interconnected to form networks (in fact, there are probably no discrete pathways, only networks). For example, the insulin signaling pathway, the signal transduction through IL-1 receptor pathway, and the nuclear factor-κB (NF-κB) signaling pathway are interconnected through c-Jun, IL-1 receptor, and NF-κB. The mapping of genes onto networks/pathways will be an ongoing objective as more networks/pathways become available. Our transcriptome data will be an invaluable resource in delineating these relationships.

As gene regulatory networks controlled by transcription factors form the top layer of the hierarchy that controls the physiologic network, we sought to identify differentially expressed transcription factors. Of 554 transcription factors expressed in LNCaP and CL1 cells, 112 showed significantly different levels between the cell lines (P < 0.001; Supplementary Table S3). This clearly showed significant difference in the functioning of the corresponding gene regulatory networks during the progression of prostate cancer from the early to late stages.

As secreted proteins can readily be exploited for blood-based cancer diagnosis and prognosis, we next asked how many of our differentially expressed genes encode secreted proteins. We identified 521 signatures belonging to 460 genes potentially encoding secreted proteins (Supplementary Table S6). Among these, 287 signatures (259 genes) are overexpressed and 234 signatures (201 genes) are underexpressed in CL1 cells compared with LNCaP cells. Thus, blood diagnostics (changes in relevant protein concentrations) could potentially be used to follow prostate cancer progression.

Quantitative proteomic analysis of prostate cancer cells. We quantitatively profiled the protein expression changes between LNCaP and CL1 cells using the ICAT-MS/MS protocol described by Han et al. (15). We generated a total of 142,849 MS/MS spectra, 7,282 of which corresponded to peptides with a mass spectrum quality score P of >0.9 (allowing unambiguous identification of peptides; ref. 16). We obtained quantitative peptide ratios for 4,583 peptides corresponding to 940 proteins. The number of peptides is greater than the number of proteins because (a) mass spectrometry identified multiple peptides from the same protein and (b) the ionization step of mass spectrometry created different charge states for the same peptide. The protein ratios were calculated from multiple peptide ratios using an algorithm for the Automated Statistical Analysis on Protein Ratio (17). In the end, we identified 82 proteins that are down-regulated and 108 proteins that are up-regulated by at least 1.8-fold in LNCaP cells compared with CL1 cells. The functional classification of the proteins identified is shown in Supplementary Table S4.

Fifty-four percent (103 of 190) of differentially expressed proteins identified have enzymatic activity. Many of the proteins identified are involved in fatty acid and lipid metabolism, including fatty acid synthase, carnitine palmitoyltransferase II, and propionyl CoA carboxylase α polypeptide. Fatty acid and lipid metabolism is perturbed in prostate cancer (27, 28). Additionally, many genes involved in lipid transport were altered, including five Annexin family proteins, prosaposin, and fatty acid binding protein 5 (Supplementary Table S4). Annexin A1 was shown to be overexpressed in non-PSA-producing LNCaP cells compared with PSA-producing LNCaP cells (4). Annexin A7 is postulated to be a prostate tumor suppressor gene (29). Annexin A2 expression is reduced or lost in prostate cancer cells, and its re-expression inhibits prostate cancer cell migration (30).

Other genes we identified here have been implicated in carcinogenesis, including tumor suppressor p16 and insulin-like growth factor-II receptor (27, 31). Some genes have been implicated previously in prostate cancer, such as prostate cancer overexpressed gene 1 (POV1), which is overexpressed in prostate cancer (32), and δ1 and α1 catenin (cadherin-associated protein) and junction plakoglobin, which are down-regulated in prostate cancer cells (33). However, the potential relationships of most of the proteins identified here to prostate cancer require further elucidation. For example, transmembrane protein 4 (TMEM4), a gene predicted to encode a 182-amino acid type II transmembrane protein, is down-regulated ~2-fold in CL1 cells compared with LNCaP cells. MPSS data also indicated that TMEM4 is down-regulated ~2-fold in CL1 cells. Many type II transmembrane proteins, such as TMPRSS2, are overexpressed in prostate cancer patients (34). It will be interesting to see whether TMEM4 overexpression plays a primary role in prostate carcinogenesis. We also identified 12 proteins that have not been annotated or functionally characterized. The relationships between these novel proteins and prostate cancer also need further study.

Additionally, we sought to compare the changes in expression at the protein level in the two cell states with changes at the mRNA level. We converted the protein IDs and MPSS signatures to Unigene IDs to compare the MPSS data with the ICAT-MS/MS data. We limited this comparison to those with common Unigene IDs and with reliable ICAT ratios (SD <0.5) and ended up with a subset of 79 proteins. Of these, 66 genes (83.5%) were concordant in their changes in mRNA and protein levels of expression and 13 genes (16.5%) were discordant (i.e., having higher protein expression but lower mRNA expression or vice versa). The scatter plot of protein/mRNA expression ratios is shown in Fig. 1. There are no functional similarities among the discordant genes. As these mRNAs and proteins are expressed at relatively high levels, discordance due to measurement errors is unlikely. Clearly, post-transcriptional regulation of protein expression is important, although the elucidation of the specific mechanism(s) awaits further studies.
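Concordance in this sense is agreement in the direction of change, which can be checked from the sign of the log ratios. The following is a minimal sketch; the function name and example ratios are hypothetical, both ratios are assumed to be positive fold changes in the same orientation, and a ratio of exactly 1 (no direction) is arbitrarily counted as discordant.

```python
import math

def concordance(pairs):
    """Count concordant and discordant (protein_ratio, mrna_ratio) pairs.

    A pair is concordant when both log ratios have the same sign, i.e.
    protein and mRNA change in the same direction.
    """
    concordant = 0
    discordant = 0
    for protein_ratio, mrna_ratio in pairs:
        if protein_ratio <= 0 or mrna_ratio <= 0:
            raise ValueError("ratios must be positive fold changes")
        if math.log(protein_ratio) * math.log(mrna_ratio) > 0:
            concordant += 1
        else:
            discordant += 1  # includes the ASSUMED handling of ratio == 1
    return concordant, discordant

# Hypothetical ratios: two genes agree in direction, one does not.
print(concordance([(2.0, 3.0), (0.5, 0.4), (2.0, 0.5)]))  # → (2, 1)

# Check of the reported percentage.
print(round(66 / 79 * 100, 1))  # → 83.5
```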



Figure 1. Scatter plot of the protein ratios obtained by ICAT and the mRNA expression ratios obtained by MPSS. Expression ratios in Supplementary Table S3 were transformed to natural logarithms and then plotted.

 

    Discussion
The systems approach to disease is predicated on the idea that the disease process is reflected in disease-perturbed protein and gene regulatory networks. Molecular systems biology has two important features: (a) it employs global analyses, where global implies studying changes in transcript or protein levels as well as the relationships among all of the elements in the system, and (b) it integrates different types of biological information (single nucleotide polymorphisms, DNA, mRNA, protein, protein interactions, etc.). MPSS is a powerful and sensitive technology that allows deep analysis of the prostate transcriptome. The MPSS protocol we used for this study relies on GATC enzymatic sites to cleave the 3' region of cDNAs to generate DNA fragments as substrates for MPSS, so cDNAs lacking GATC in their 3' region are excluded from these analyses. We estimate that ~3% of cDNA clones lack an appropriately positioned GATC site: among the 15,064 Mammalian Gene Collection full-length sequences, 14,602 (96.93%) have appropriate GATC sites. The protocol is also biased toward capturing MPSS signatures within 500 bp 5' of the poly(A) site; a transcript whose nearest GATC site lies beyond 500 bp 5' of the poly(A) site will likely be missed as well. For example, NKX3.1, a prostate-specific and androgen-regulated gene (35, 36), is not found in our MPSS data set because its GATC site closest to the poly(A) tail (GenBank accession no. AF247704) is 2.8 kb away. Recently, a new protocol that eliminates this bias was developed at Lynx (footnote 4).

We estimated that LNCaP cells express ~280,000 transcripts per cell: we obtained ~900 µg of total RNA from 10^8 cells, and we assumed an average of 3% polyadenylated RNA and an average transcript length of 1 kb. Therefore, with >2 million signatures obtained for each cell state by MPSS, we can detect transcripts expressed at levels of <1 transcript per cell (i.e., transcripts that are not expressed in every cell).
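
The two quantitative constraints discussed above (the position of the GATC site relative to the poly(A) site, and the sampling depth) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names are hypothetical, the input is assumed to be a sense-strand cDNA with the poly(A) tail already trimmed, and signatures are assumed to sample transcripts uniformly. Under these assumptions, a transcript present at one copy per cell would be expected to yield about seven signatures out of 2 million.

```python
from typing import Optional

SITE = "GATC"      # recognition site used to cleave the cDNA
WINDOW_BP = 500    # signatures are captured mainly within this distance of the poly(A) site


def distance_to_poly_a(cdna: str, site: str = SITE) -> Optional[int]:
    """Distance (bp) from the 3'-most site to the 3' end of the sequence.

    Assumption: `cdna` is the sense strand with the poly(A) tail trimmed,
    so its last base stands in for the poly(A) site.
    Returns None when the sequence contains no site at all.
    """
    pos = cdna.upper().rfind(site)
    if pos == -1:
        return None
    return len(cdna) - pos  # measured from the first base of the site


def likely_captured(cdna: str, window: int = WINDOW_BP) -> bool:
    """True if the 3'-most site lies within `window` bp of the 3' end."""
    d = distance_to_poly_a(cdna)
    return d is not None and d <= window


def expected_signatures(copies_per_cell: float,
                        transcripts_per_cell: float = 280_000,
                        signatures_sequenced: float = 2_000_000) -> float:
    """Expected signature count for a transcript at a given copy number,
    assuming signatures sample the transcript pool uniformly."""
    return copies_per_cell / transcripts_per_cell * signatures_sequenced


# Fraction of full-length sequences with an appropriate GATC site (figures from the text).
gatc_fraction = 14_602 / 15_064

if __name__ == "__main__":
    print(f"GATC-positive fraction: {gatc_fraction:.2%}")
    print(f"Expected signatures at 1 copy/cell: {expected_signatures(1):.1f}")
    print(f"Expected signatures at 0.2 copies/cell: {expected_signatures(0.2):.1f}")
```

Measuring the distance from the first base of the site is an arbitrary convention chosen for the sketch; with a 2.8 kb distance, as reported for NKX3.1, `likely_captured` would return False under any reasonable convention.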

The BioCarta and KEGG databases describe 469 protein pathways or networks (http://cgap.nci.nih.gov/Pathways/). We have identified 37 BioCarta and 14 KEGG pathways that are up-regulated and 23 BioCarta and 22 KEGG pathways that are down-regulated in LNCaP cells versus CL1 cells. We have also shown that 112 transcription factors change between these two disease states, which indicates significant alterations of the corresponding gene regulatory networks and is consistent with the perturbation of several different networks. These transcription factors include the androgen receptor along with six other transcription factors, such as the ets homologous factor, a liver-specific bHLH-Zip transcription factor, an IFN regulatory factor, and CCCTC-binding factor (a zinc finger protein; see Supplementary Table S3). The fascinating question is which of these networks are directly correlated with prostate cancer progression and which are changed secondarily as a consequence of their connections to the primary disease networks. We are working on strategies to distinguish these possibilities. Nevertheless, we conclude that the progression from early-stage to late-stage prostate cancer, as represented by LNCaP and CL1 cells, is reflected in significant changes in both protein and gene regulatory networks.

In contrast to MPSS, ICAT is still an immature technology that cannot yet carry out global analyses (37). Nevertheless, the integration of different types of data provides powerful new approaches to defining protein and gene regulatory networks more precisely (38). We have shown that the protein and RNA expression levels of 66 of 79 genes (83.5%) were concordant (i.e., changed in the same direction; Supplementary Table S5). This concordance rate is higher than that reported elsewhere (39, 40). Waghray et al. found that only 8 of 25 (32%) androgen-responsive genes in LNCaP cells showed concordance between protein levels measured by two-dimensional gels and MS/MS and mRNA levels analyzed by SAGE (39). Although genes in different experimental systems may have different concordance rates between mRNA and protein expression, the use of different methods for quantitative protein profiling (ICAT-MS/MS versus two-dimensional gel-MS/MS) and mRNA expression profiling (MPSS versus SAGE) may also account for the differences. It is also critical to use only data with high confidence levels when comparing mRNA and protein levels. The expression levels obtained by MPSS are more accurate than those obtained by SAGE or DNA microarrays because of the deep sequence coverage MPSS achieves. We have also limited our data set to those proteins (649 of them) that were identified by multiple peptide hits and for which the ICAT ratios did not vary greatly among different peptides from the same protein (SD < 0.5). Such variation could derive from experimental errors or from different protein isoforms. Where mRNA and protein levels are discordant, post-transcriptional control is a likely explanation: a multiplicity of post-transcriptional mechanisms has been described, and there are probably more to be identified (41). The important point is that this major aspect of control could not have been identified without the integration of two data types, mRNAs and proteins.
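
The filtering and concordance steps described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the protein names and ratios are invented, the SD threshold is assumed to apply to the raw (not log-transformed) peptide ratios, and a gene is counted as concordant only when the natural logarithms of its protein and mRNA ratios have the same sign.

```python
import math
from statistics import stdev


def consistent_proteins(peptide_ratios, max_sd=0.5):
    """Keep proteins quantified by at least two peptides whose ratios agree.

    Returns {protein: mean ratio} for proteins whose peptide-ratio SD is
    below `max_sd` (assumption: SD is computed on raw ratios).
    """
    kept = {}
    for protein, ratios in peptide_ratios.items():
        if len(ratios) >= 2 and stdev(ratios) < max_sd:
            kept[protein] = sum(ratios) / len(ratios)
    return kept


def concordance(protein_ratio, mrna_ratio):
    """Count genes whose protein and mRNA ratios change in the same direction.

    Both inputs map gene -> positive expression ratio. Returns
    (number concordant, number compared). A ratio of exactly 1 (log = 0)
    is not counted as concordant.
    """
    shared = protein_ratio.keys() & mrna_ratio.keys()
    same = sum(
        1 for gene in shared
        if math.log(protein_ratio[gene]) * math.log(mrna_ratio[gene]) > 0
    )
    return same, len(shared)


# Hypothetical example data (not from the study).
peptides = {
    "A": [2.0, 2.2],        # consistent, up
    "B": [0.5, 0.4, 0.6],   # consistent, down
    "C": [1.0, 3.0],        # peptides disagree, filtered out
    "D": [4.0],             # single peptide hit, filtered out
}
mrna = {"A": 3.0, "B": 1.5, "C": 2.0}

proteins = consistent_proteins(peptides)
n_same, n_total = concordance(proteins, mrna)
print(f"{n_same} of {n_total} genes concordant")
```

Applied to the counts reported in the text, the same definition gives 66 / 79 = 83.5% concordance.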

The systems approach provides a powerful new approach to diagnostics. The idea is that disease-perturbed networks change their patterns of mRNA and protein expression both within the diseased cells and in the proteins they secrete into the blood. Of the 1,987 mRNAs that changed in the transition from LNCaP to CL1 cells (early-stage to late-stage cancer), 460 (23.2%) encoded proteins that were potentially secreted (Supplementary Table S6). Sixteen of these putative secreted proteins were also found to be differentially expressed between these two cell states by the ICAT approach (Supplementary Table S6). Of the 190 differentially expressed proteins identified by the ICAT approach, 22 were predicted to be secreted proteins (Supplementary Table S6). These proteins are excellent candidates for investigation as diagnostic markers for prostate cancer progression. The interesting point is that these secreted diagnostic markers will serve as surrogates for the state of the corresponding protein and gene regulatory networks and potentially will enable one to (a) stratify disease into distinct categories (e.g., relatively benign, slowly invasive, and rapidly metastatic for prostate cancer), because these different types of prostate cancer will employ different disease-perturbed networks; (b) follow progression; (c) follow response to therapy; and (d) monitor adverse drug reactions. The other interesting possibility is that the perturbed secreted proteins will serve as markers to identify the primary disease-perturbed networks and accordingly will identify networks that may harbor excellent protein candidates for drug targeting, that is, drug targets that may kill diseased cells specifically or return the networks to a more normal state.

Interestingly, the transition between these two states of prostate cancer progression involves "digital changes" (i.e., changes from 0 to ≥50 tpm). Thus, one can possibly obtain diagnostic markers that are digital in the sense that they transition from no expression to some expression. In the transition from LNCaP cells to CL1 cells, there are 175 signatures (169 mRNAs) that go from 0 to ≥50 tpm. Likewise, in going from CL1 cells to LNCaP cells, there are 131 such signatures (128 mRNAs; Supplementary Table S2). Among the transcription factors we identified, eight changed from 0 tpm in LNCaP cells to >50 tpm in CL1 cells and seven changed from >50 tpm in LNCaP cells to 0 tpm in CL1 cells (Supplementary Table S3). Eight pathways were affected by the "digital changes" (Supplementary Table S7). For example, acid ceramidase 1 and aspartate aminotransferase changed from >50 tpm in LNCaP cells to 0 tpm in CL1 cells, affecting multiple pathways, including the insulin-like growth factor-I receptor pathway and the activation of COOH-terminal Src kinase pathway (Supplementary Table S7). It will be interesting to test these potential digital diagnostic markers.
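
The "digital change" criterion can be expressed compactly in code. The sketch below is illustrative only: the signature names and tpm values are invented, and a signature absent from one state's table is assumed to have 0 tpm in that state.

```python
def digital_changes(tpm_from, tpm_to, threshold=50):
    """Signatures at 0 tpm in the starting state and >= threshold tpm in the end state.

    Both arguments map signature -> transcripts per million (tpm).
    Assumption: a signature missing from `tpm_from` counts as 0 tpm.
    """
    return sorted(
        sig for sig, tpm in tpm_to.items()
        if tpm >= threshold and tpm_from.get(sig, 0) == 0
    )


# Hypothetical example data (not from the study).
lncap = {"sig1": 0, "sig2": 120, "sig3": 10, "sig4": 0}
cl1 = {"sig1": 75, "sig2": 0, "sig3": 60, "sig4": 20}

switched_on_in_cl1 = digital_changes(lncap, cl1)    # off in LNCaP, on in CL1
switched_off_in_cl1 = digital_changes(cl1, lncap)   # on in LNCaP, off in CL1
print(switched_on_in_cl1, switched_off_in_cl1)
```

Here `sig3` is excluded because it is already detectable (10 tpm) in the starting state, and `sig4` is excluded because it stays below the 50 tpm threshold.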

Our analyses provide an excellent database and a powerful resource for developing tools for multivariable diagnosis and prognosis. They represent a significant step toward a system-wide understanding of prostate cancer progression. The systems approach to disease will offer powerful approaches to diagnostics, therapeutics, and even prevention in the future (42). It will almost certainly usher in an era of predictive and preventive medicine over the next 10 to 20 years (43).


    Acknowledgments
 
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.


    Footnotes
 
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

4 Daixing Zhou, personal communication.

Received 11/19/04. Revised 1/24/05. Accepted 2/8/05.


    References
 

  1. Greenlee RT, Murray T, Bolden S, Wingo PA. Cancer statistics, 2000. CA Cancer J Clin 2000;50:7–33.
  2. Isaacs JT. The biology of hormone refractory prostate cancer. Why does it develop? Urol Clin North Am 1999;26:263–73.
  3. Bussemakers MJ, van Bokhoven A, Verhaegh GW, et al. DD3: a new prostate-specific gene, highly overexpressed in prostate cancer. Cancer Res 1999;59:5975–9.
  4. Vaarala MH, Porvari K, Kyllonen A, Vihko P. Differentially expressed genes in two LNCaP prostate cancer cell lines reflecting changes during prostate cancer progression. Lab Invest 2000;80:1259–68.
  5. Chang GT, Blok LJ, Steenbeek M, et al. Differentially expressed genes in androgen-dependent and -independent prostate carcinomas. Cancer Res 1997;57:4075–81.
  6. Brenner S, Johnson M, Bridgham J, et al. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol 2000;18:630–4.
  7. Chen G, Gharib TG, Huang CC, et al. Discordant protein and mRNA expression in lung adenocarcinomas. Mol Cell Proteomics 2002;1:304–13.
  8. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 1999;17:994–9.
  9. Patel BJ, Pantuck AJ, Zisman A, et al. CL1-GFP: an androgen independent metastatic tumor model for prostate cancer. J Urol 2000;164:1420–5.
  10. Tso CL, McBride WH, Sun J, et al. Androgen deprivation induces selective outgrowth of aggressive hormone-refractory prostate cancer clones expressing distinct cellular and molecular properties not present in parental androgen-dependent cancer cells. Cancer J Sci Am 2000;6:220–33.
  11. Meyers BC, Tej SS, Vu TH, et al. The use of MPSS for whole-genome transcriptional analysis in Arabidopsis. Genome Res 2004;14:1641–53.
  12. Jongeneel CV, Iseli C, Stevenson BJ, et al. Comprehensive sampling of gene expression in human cell lines with massively parallel signature sequencing. Proc Natl Acad Sci U S A 2003;100:4702–5.
  13. Man MZ, Wang X, Wang Y. POWER_SAGE: comparing statistical tests for SAGE experiments. Bioinformatics 2000;16:953–9.
  14. Kal AJ, van Zonneveld AJ, Benes V, et al. Dynamics of gene expression revealed by comparison of serial analysis of gene expression transcript profiles from yeast grown on two different carbon sources. Mol Biol Cell 1999;10:1859–72.
  15. Han DK, Eng J, Zhou H, Aebersold R. Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nat Biotechnol 2001;19:946–51.
  16. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 2002;74:5383–92.
  17. Li XJ, Zhang H, Ranish JA, Aebersold R. Automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry. Anal Chem 2003;75:6648–57.
  18. Chen Y, Yu P, Luo J, Jiang Y. Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT. Mamm Genome 2003;14:859–65.
  19. Freedland SJ, Seligson DB, Liu AY, et al. Loss of CD10 (neutral endopeptidase) is a frequent and early event in human prostate cancer. Prostate 2003;55:71–80.
  20. Hill AA, Hunter CP, Tsung BT, Tucker-Kellogg G, Brown EL. Genomic analysis of gene expression in C. elegans. Science 2000;290:809–12.
  21. Han M, Gao X, Su JZ, Nie S. Quantum-dot-tagged microbeads for multiplexed optical coding of biomolecules. Nat Biotechnol 2001;19:631–5.
  22. Bao P, Frutos AG, Greef C, et al. High-sensitivity detection of DNA hybridization on microarrays using resonance light scattering. Anal Chem 2002;74:1792–7.
  23. Velculescu VE, Vogelstein B, Kinzler KW. Analysing uncharted transcriptomes with SAGE. Trends Genet 2000;16:423–5.
  24. Lal A, Lash AE, Altschul SF, et al. A public database for gene expression in human cancers. Cancer Res 1999;59:5403–7.
  25. Liu AY, True LD, LaTray L, et al. Analysis and sorting of prostate cancer cell types by flow cytometry. Prostate 1999;40:192–9.
  26. Perrin BJ, Huttenlocher A. Calpain. Int J Biochem Cell Biol 2002;34:722–5.
  27. Pandian SS, Eremin OE, McClinton S, Wahle KW, Heys SD. Fatty acids and prostate cancer: current status and future challenges. J R Coll Surg Edinb 1999;44:352–61.
  28. Fleshner N, Bagnell PS, Klotz L, Venkateswaran V. Dietary fat and prostate cancer. J Urol 2004;171:S19–24.
  29. Cardo-Vila M, Arden KC, Cavenee WK, Pasqualini R, Arap W. Is Annexin 7 a tumor suppressor gene in prostate cancer? Pharmacogenomics J 2001;1:92–4.
  30. Liu JW, Shen JJ, Tanzillo-Swarts A, et al. Annexin II expression is reduced or lost in prostate cancer cells and its re-expression inhibits prostate cancer cell migration. Oncogene 2003;22:1475–85.
  31. Chi SG, deVere White RW, Muenzer JT, Gumerlock PH. Frequent alteration of CDKN2 (p16(INK4A)/MTS1) expression in human primary prostate carcinomas. Clin Cancer Res 1997;3:1889–97.
  32. Cole KA, Chuaqui RF, Katz K, et al. cDNA sequencing and analysis of POV1 (PB39): a novel gene up-regulated in prostate cancer. Genomics 1998;51:282–7.
  33. Kallakury BV, Sheehan CE, Winn-Deen E, et al. Decreased expression of catenins (α and β), p120 CTN, and E-cadherin cell adhesion proteins and E-cadherin gene promoter methylation in prostatic adenocarcinomas. Cancer 2001;92:2786–95.
  34. Vaarala MH, Porvari K, Kyllonen A, Lukkarinen O, Vihko P. The TMPRSS2 gene encoding transmembrane serine protease is overexpressed in a majority of prostate cancer patients: detection of mutated TMPRSS2 form in a case of aggressive disease. Int J Cancer 2001;94:705–10.
  35. Prescott JL, Blok L, Tindall DJ. Isolation and androgen regulation of the human homeobox cDNA, NKX3.1. Prostate 1998;35:71–80.
  36. Bhatia-Gaur R, Donjacour AA, Sciavolino PJ, et al. Roles for Nkx3.1 in prostate development and cancer. Genes Dev 1999;13:966–77.
  37. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature 2003;422:198–207.
  38. Baliga NS, Bonneau R, Facciotti MT, et al. Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea. Genome Res 2004;14:2221–34.
  39. Waghray A, Feroze F, Schober MS, et al. Identification of androgen-regulated genes in the prostate cancer cell line LNCaP by serial analysis of gene expression and proteomic analysis. Proteomics 2001;1:1327–38.
  40. Chen G, Gharib TG, Huang CC, et al. Discordant protein and mRNA expression in lung adenocarcinomas. Mol Cell Proteomics 2002;1:304–13.
  41. Rajagopalan LE, Malter JS. Regulation of eukaryotic messenger RNA turnover. Prog Nucleic Acid Res Mol Biol 1997;56:257–86.
  42. Hood L, Perlmutter RM. The impact of systems approaches on biological problems in drug discovery. Nat Biotechnol 2004;22:1215–7.
  43. Hood L, Heath JR, Phelps ME, Lin B. Systems biology and new technologies enable predictive and preventative medicine. Science 2004;306:640–3.



