Tumor Biology |
Division of Experimental Pathology, Department of Laboratory Medicine and Pathology, and Mayo Cancer Center, Mayo Clinic and Foundation, Rochester, Minnesota, 55905
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Electronic analysis of sequenced cDNAs uses the information in the EST database (dbEST), where >3.5 million sequences have been deposited from the analysis of over 6,000 human cancer and normal cDNA libraries constructed from >50 different tissues (6).4 To use this large amount of information, computer algorithms have been developed for discovery of novel genes (7) and genes with limited tissue distribution and/or cancer-specific expression (8). One limitation of dbEST is that only highly expressed genes have been sampled adequately to provide sufficient EST counts for reliable molecular profiling. However, identification of these highly expressed genes could provide significant information to enhance our understanding of carcinogenesis, and the genes could serve as biomarkers or prognostic markers of malignancy. Prostate is an excellent tissue to study by EST analysis because of the sizable pool of EST data in the LCM-derived libraries.
Despite the existing search algorithms, there is still a need for sophisticated computer analysis tools to perform the clustering and analysis of the ESTs. In this study, we developed a novel electronic profiling algorithm, the Binary Indexing Search Algorithm, to identify differentially expressed genes in cancer and normal prostate EST databases. Functions of this novel algorithm include clustering ESTs to distinct genes, estimating tissue distribution of EST clusters, and sorting EST clusters by relative expression levels. This procedure, using the Binary Indexing Search Algorithm, requires 6 h of CPU time, and in this study, we successfully used this algorithm for the identification of genes in cancer and normal microdissected prostate EST libraries. The tumor-associated differential expression for two genes, CRISP-3 and DAN, was subsequently verified experimentally.
| MATERIALS AND METHODS |
|---|
Data Preparation.
The data preparation module reads EST report files downloaded from NCBI6 and produces files in FASTA format for ESTs.
Tissue Distribution of ESTs/Genes.
In the tissue distribution module, binary indexes were used to assign tissue distribution to each prostate EST. A binary index is defined as a binary string of n bits, where n is an integer. A sequence of n nucleotides can be represented by an n-bit binary index when purines (As and Gs) are assigned to 1 and pyrimidines (Ts and Cs) are assigned to 0. After a series of tests, we found that a 32-bit binary index is long enough to rule out a random occurrence, such that under most circumstances a unique 32-bit index represents a unique EST. In this study, we used 32-bit continuous indexes to represent ESTs, and every EST was converted into groups of continuous 32-bit binary indexes.
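The purine/pyrimidine encoding described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `binary_indexes`, the reading of "continuous" as a sliding window advanced one nucleotide at a time, and the decision to skip windows containing ambiguous bases such as N are all assumptions made for illustration.

```python
PURINES = {"A", "G"}      # encoded as 1
PYRIMIDINES = {"C", "T"}  # encoded as 0


def binary_indexes(seq, n=32, step=1):
    """Return the n-bit binary index of each window of n nucleotides.

    Assumption: windows overlap and advance by `step` nucleotides;
    windows containing any base other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    indexes = []
    for start in range(0, len(seq) - n + 1, step):
        value = 0
        valid = True
        for base in seq[start:start + n]:
            if base in PURINES:
                value = (value << 1) | 1
            elif base in PYRIMIDINES:
                value <<= 1
            else:
                valid = False  # ambiguous base, e.g. N
                break
        if valid:
            indexes.append(value)
    return indexes


if __name__ == "__main__":
    # Small windows are used here only to keep the example readable.
    print([format(i, "04b") for i in binary_indexes("AGCTA", n=4)])
```

Because the encoding keeps only one bit per nucleotide, two different sequences can share an index; the 32-bit length is what the text relies on to make such collisions rare.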
In the first step, the program declares, in random access memory, a DA table for all possible indexes, in which each entry has one byte (8 bits) of memory reservation. The first distinct 29 bits of a 32-bit index are used as the DA for that index in the table, and the decimal value (dv) of the last three bits will give the bit position to be set to 1 in that entry. For example, a 32-bit index, 11110011110011101111100100011111, has a direct entry address 11110011110011101111100100011 in the table, and the 7th bit of that entry is set to 1 [dv = $2^2 + 2^1 + 2^0 = 7$]. We ran this module on classified prostate dbEST and created a complete nonredundant DA table containing all prostate indexes.
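The direct-addressing step can be sketched as follows. This is an illustrative sketch only: the class name `DirectAddressTable` is hypothetical, bits within an entry are assumed to be numbered 0 to 7, and a dictionary stands in for the full table so that the example runs without reserving one byte for each of the 2^29 possible addresses.

```python
class DirectAddressTable:
    """Bitmap recording which 32-bit indexes have been seen.

    The first 29 bits of an index select a one-byte entry; the last
    3 bits select which bit of that entry is set. A dict is used here
    as a sparse stand-in for the 2**29-byte table (an assumption made
    to keep the sketch small).
    """

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _split(index):
        return index >> 3, index & 0b111  # (29-bit address, bit position)

    def add(self, index):
        address, bit = self._split(index)
        self._entries[address] = self._entries.get(address, 0) | (1 << bit)

    def __contains__(self, index):
        address, bit = self._split(index)
        return bool(self._entries.get(address, 0) & (1 << bit))


if __name__ == "__main__":
    table = DirectAddressTable()
    example = int("11110011110011101111100100011111", 2)
    table.add(example)
    address, bit = DirectAddressTable._split(example)
    print(format(address, "029b"), bit)  # address and bit position (7)
```

Membership tests on such a table take constant time, which is what makes it practical to check every index in the whole of human dbEST against the prostate set.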
The algorithm then sets up a tissue distribution table (OS table) using the DA table. Each entry of the OS table contains two fields: a 32-bit index and a 64-bit specificity string (there are 52 different tissue types in human dbEST). This module runs on the entire human dbEST to get every possible index and checks whether each index is a prostate index using the DA table. If an index is a prostate index, the module builds an entry for this index in the OS table as follows. The direct address of this entry is the first 24 bits of the index (we use 24-bit addresses to reduce the table size). At that address, we store the complete 32-bit index and the 64-bit specificity string, in which the bit corresponding to the tissue where this index comes from is set to 1. Because we used external chaining to store the indexes that have identical first 24 bits but different last 8 bits, the OS table is also a nonredundant table. After running the human dbEST on this module, each entry in the DA table has a cumulative tissue distribution string in the OS table showing all of the tissues corresponding to the index. Therefore, the tissue distribution of each prostate EST can be derived from the OS table by accumulating the tissue distribution of all of the indexes related to that EST. By the same procedure, the tissue distribution of an EST cluster (see below) or a gene can be calculated by accumulating the tissue distributions of all of the ESTs in that cluster.
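A minimal sketch of the OS table with external chaining is given below. The names `TissueDistributionTable`, `record`, `specificity`, and `est_distribution` are hypothetical, tissues are assumed to be numbered 0 to 63, and a plain Python set is used in place of the DA table for the prostate-index check.

```python
class TissueDistributionTable:
    """OS-table sketch: 24-bit direct address plus external chaining."""

    def __init__(self, prostate_indexes):
        # Any container supporting `in` works; a set stands in for the DA table.
        self._prostate = prostate_indexes
        self._buckets = {}  # 24-bit address -> chain of [index, specificity]

    def record(self, index, tissue_id):
        """Note that `index` was observed in tissue number `tissue_id`."""
        if index not in self._prostate:
            return  # only prostate indexes get an entry
        chain = self._buckets.setdefault(index >> 8, [])
        for entry in chain:
            if entry[0] == index:
                entry[1] |= 1 << tissue_id
                return
        chain.append([index, 1 << tissue_id])

    def specificity(self, index):
        """Return the accumulated specificity string as an integer bit mask."""
        for stored, bits in self._buckets.get(index >> 8, []):
            if stored == index:
                return bits
        return 0


def est_distribution(table, indexes):
    """Accumulate (bitwise OR) the tissue distribution over an EST's indexes."""
    combined = 0
    for index in indexes:
        combined |= table.specificity(index)
    return combined


if __name__ == "__main__":
    table = TissueDistributionTable({0x12345678, 0x123456FF})
    table.record(0x12345678, 3)
    table.record(0x123456FF, 5)
    print(format(est_distribution(table, [0x12345678, 0x123456FF]), "b"))
```

The same `est_distribution` call applied to the union of all indexes in a cluster would give the cluster-level distribution described in the text.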
EST Clustering.
To cluster the ESTs belonging to the same gene together, we created a clustering module in our Binary Indexing Search Algorithm. All of the prostate ESTs in the FASTA-formatted files generated by the data preparation module were read, and each EST was converted into groups of continuous 32-bit binary indexes. For every individual EST, if none of its indexes pointed to any previously built cluster, the EST and all of its indexes were assigned to a new cluster. Otherwise, if any of the indexes of an EST pointed to an already built cluster, the EST was assigned to that cluster, and all of its indexes were set to point to the same cluster.
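The clustering rule can be sketched in a few lines of Python. The function name `cluster_ests` and the input layout (a mapping from EST identifier to its list of indexes) are assumptions. The text does not say what happens when one EST's indexes point to two different existing clusters; this sketch simply takes the first cluster found and does not merge the older clusters.

```python
def cluster_ests(est_indexes):
    """Assign each EST to a cluster based on shared binary indexes.

    est_indexes: mapping of EST id -> iterable of binary indexes,
    processed in insertion order.
    Returns a mapping of EST id -> integer cluster id.
    """
    index_to_cluster = {}
    assignment = {}
    next_cluster = 0
    for est_id, indexes in est_indexes.items():
        indexes = list(indexes)
        # Look for any index that already points to a cluster.
        cluster = next(
            (index_to_cluster[i] for i in indexes if i in index_to_cluster),
            None,
        )
        if cluster is None:
            cluster = next_cluster  # start a new cluster
            next_cluster += 1
        for i in indexes:
            index_to_cluster[i] = cluster  # all indexes point to this cluster
        assignment[est_id] = cluster
    return assignment


if __name__ == "__main__":
    demo = {"est1": [1, 2, 3], "est2": [3, 4], "est3": [9, 10], "est4": [4, 11]}
    print(cluster_ests(demo))
```

In this toy input, `est4` shares no index with `est1` directly but joins the same cluster through `est2`, which is how overlapping ESTs from one transcript chain together.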
Sorting and Selecting EST Clusters.
Within each prostate EST cluster, we grouped ESTs according to the tissue histology (cancerous or normal) and the library preparation method (microdissected, bulk, or cell line). This resulted in five groups of ESTs: NB, NM, CB, CM, and CL. We then sorted clusters according to their total EST counts, which yielded information on the expression level of each gene in prostate and on the differential expression of the same gene between normal and cancer. Furthermore, we selected the EST clusters with statistically significant EST counts and/or the statistically significant differentially displayed clusters according to Fisher's exact test.
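The sorting and selection step might look like the following sketch, which uses only the Python standard library. The two-sided form of Fisher's exact test, the 0.05 cutoff, the function names, and the cluster counts in the demo are assumptions for illustration; the pool sizes passed in are whatever totals apply to the two EST groups being compared.

```python
from math import comb


def fisher_exact_two_sided(a, b, n1, n2):
    """Two-sided Fisher's exact test for one cluster.

    a, b: ESTs of the cluster found in pool 1 and pool 2.
    n1, n2: total ESTs in pool 1 and pool 2.
    """
    k = a + b
    total = comb(n1 + n2, k)

    def prob(x):
        # Hypergeometric probability that x of the k cluster ESTs fall in pool 1.
        return comb(n1, x) * comb(n2, k - x) / total

    p_obs = prob(a)
    low, high = max(0, k - n2), min(k, n1)
    p = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p)


def rank_clusters(counts, n1, n2, alpha=0.05):
    """Keep clusters with p < alpha, sorted by total EST count (descending)."""
    rows = []
    for name, (a, b) in counts.items():
        p = fisher_exact_two_sided(a, b, n1, n2)
        if p < alpha:
            rows.append((name, a + b, p))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical cluster counts (pool 1, pool 2); not data from the study.
    demo_counts = {"clusterA": (40, 0), "clusterB": (5, 2)}
    for name, total, p in rank_clusters(demo_counts, 22776, 9967):
        print(name, total, p)
```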
Experimental Procedures
Tissue Section and H&E Staining.
Five cases of Gleason score 6 OCT-embedded prostate frozen cancer tissues as well as their corresponding normal tissues were used. Ten-µm cryostat sections were stained using H&E and dehydrated completely in xylene, following protocols from Arcturus Engineering (Mountain View, CA).
LCM.
The LCM technique allows for isolating normal and cancerous prostate epithelial cells precisely and efficiently from among a mixture of normal, cancer, epithelial, and nonepithelial cells. Navigated-LCM was performed with a PixCell II apparatus (Arcturus) on H&E sections. LCM parameters included a laser power of 65 milliwatts, laser pulse duration of 1.2 ms, and laser spot size of 7.5-15 µm in diameter. The infrared laser was pulsed over cells of interest. We captured approximately the same number of cancerous and normal epithelial cells (2,000) in each case.
RT-PCR.
After LCM, total RNA was extracted from the captured cells using the RNeasy kit (Qiagen). The LCM-captured cells were immediately placed into sterile 0.5-ml microcentrifuge tubes containing 200 µl of RLT reagent with 1% ß-mercaptoethanol and inverted at room temperature for 1 h before extraction of total RNA. The RNA extraction was performed according to the instructions supplied with the RNeasy kit. Total RNA was eluted into 30 µl of nuclease-free H2O. Oligo-dT primers and SuperScriptII reverse transcriptase (Life Technologies, Inc.) were used in reverse transcription, which was performed following the manufacturer's protocol. After the reverse transcription reaction, 40 rounds of PCR amplification were performed using gene-specific primers. The gene-specific forward and reverse primers were designed using the Primer 3 program.
PCR amplifications were performed in a total volume of 20 µl, with 1 µl of the reverse transcription products, 1 unit of Taq polymerase, 0.5 µM of each primer, 0.25 mM of each deoxynucleotide triphosphate in 1x PCR buffer.
Real-Time RT-PCR.
To obtain a more accurate estimate of the changes in mRNA expression levels in cancerous prostate epithelial cells versus normal/benign prostate epithelial cells, real-time RT-PCR was used. We used the RT product from RT-PCR as templates in this experiment, together with the TaqMan universal PCR MasterMix (Applied Biosystems), gene-specific primers, and dual-labeled TaqMan probes designed using the Primer Express 1.5 software (PE Biosystems). The highly gene-specific regions that do not display any sequence homology with their close family members were chosen for amplification.
| RESULTS |
|---|
Prostate adenocarcinoma is of epithelial origin. Therefore, we confined our analysis to CM and NM pools, which contain 22,776 and 9,967 ESTs, respectively, to avoid erroneous calculation of expression levels attributable to contamination from nonepithelial cells. After analyzing these two EST pools using the Binary Indexing Search Algorithm, the expression profiles of close to 600 EST clusters in prostate CM and NM were identified. Fig. 2 displays a plot that represents the distribution of gene regulation in prostate for these 600 genes. EST clusters (distinct genes) were grouped according to a pseudo-ratio (r) of normalized numbers of ESTs in NM and CM pools, as follows:
[Equation defining the pseudo-ratio r; image not recovered]
3 even though they failed Fisher's exact test and the 1-tail test. The complete list of these 600 genes is available online.7
chain silent (10) and has been reported previously as a prostate-specific gene. For the 7 down-regulated genes, DAN has been reported as a tumor suppressor gene (11). RT14 is a human cDNA clone homologous to GCN5 (general control of amino acid synthesis-5), a yeast transcription activator. The function of RT14 is not clear. The DANCE gene encodes for a secretory protein that has been shown to promote adhesion of endothelial cells (12). The DANCE gene or its protein product has never been connected to prostate cancer. Tropomyosin is a member of the family of actin filament-binding proteins and has been reported to be down-regulated in prostate cancer (13). NPD017 is a novel gene without a putative ID in dbEST. The finding of semenogelin II as a down-regulated gene in prostate cancer is unexpected. Semenogelins are the predominant proteins in human semen and are the major proteins involved in the gelatinous entrapment of ejaculated spermatozoa (14). They are synthesized by the secretory epithelium of the seminal vesicles and should not be present in prostate microdissected EST libraries. It is possible that semenogelin II ESTs in prostate NM and CM libraries are the result of contamination of seminal vesicle cells in prostate tissue cDNA samples. In fact, when we carefully selected only cancer or normal prostate epithelial cells by LCM and performed RT-PCR, the semenogelin II band was absent, even after 40 cycles of the PCR reaction (Fig. 4d).
RT-PCR Confirms the Expression Regulation of CRISP-3 and DAN in Primary Prostate Cancer.
We examined the putative up-regulation of CRISP-3 and the putative down-regulation of DAN in prostate cancer using RT-PCR. PSA, Hk2, PAP, and GAPDH were used as standards. Total RNAs from 2,000 LCM-captured prostate epithelial cells (Gleason score 6 and the corresponding normal/benign) were used in RT-PCR (Fig. 4, a-c). As shown in Fig. 4d, CRISP-3 and DAN exhibit different expression patterns between normal and cancer, whereas standards PSA, Hk2, PAP, and GAPDH show approximately equal expression levels in cancer and normal. We investigated 5 Gleason score 6 prostate cancer cases and obtained similar results in all 5 cases.
Quantification of Differential Expression of CRISP-3 and DAN Using Real-Time RT-PCR.
To obtain a quantitative estimate of expression levels of the above 6 genes, real-time PCR was performed next using the TaqMan PE 7700 system (Applied Biosystems). The TaqMan assay uses the 5'→3' exonuclease activity of Taq DNA polymerase and a fluorogenic probe for automated quantification of DNA in a real-time manner. The CT value refers to the threshold cycle at which a statistically significant increase in fluorescence is first detected by the sequence detection system. The increase in fluorescence is directly proportional to the exponential increase in PCR products, and the measurement of signal is carried out in a real-time manner.
Because of the possible different amplification efficiencies of different genes, a validation experiment to calculate individual amplification efficiency for each gene was performed. The total RNA from the prostate cancer cell line LNCaP, which contains mRNA from all 6 genes, was used as the starting material. The CT values were obtained from starting total RNA of 10, 5, 2.5, 1.25, 0.625, and 0.3125 ng for each gene. The amplification efficiency of each gene was calculated following the manufacturer's procedure (Table 2). The differential expression of a gene in cancer is computed as $e^{(nCT - cCT)}$, where e is the amplification efficiency and nCT and cCT denote the CT for normal and cancer, respectively. As shown in the table, GAPDH, PSA, Hk2, and PAP are approximately equally expressed in cancer and normal prostate epithelial cells. CRISP-3 is significantly up-regulated (50- to 300-fold) in prostate cancer, whereas the tumor suppressor gene DAN is down-regulated 84-94% in prostate cancer when compared with normal prostate epithelium.
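The arithmetic behind the differential expression formula can be sketched as follows. The text does not describe the manufacturer's procedure for amplification efficiency, so the standard-curve approach used here (fit CT against log10 of input RNA and take 10^(-1/slope) as the per-cycle amplification factor) is an assumption, as are the function names and the CT values in the demo.

```python
import math


def amplification_factor(inputs_ng, ct_values):
    """Estimate the per-cycle amplification factor from a dilution series.

    Assumption: a least-squares line CT = slope * log10(input) + b is
    fitted, and the factor is 10 ** (-1 / slope); perfect doubling gives 2.
    """
    xs = [math.log10(v) for v in inputs_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return 10 ** (-1 / slope)


def fold_change(e, normal_ct, cancer_ct):
    """Differential expression in cancer relative to normal: e ** (nCT - cCT)."""
    return e ** (normal_ct - cancer_ct)


def percent_reduction(fold):
    """Express a fold change below 1 as a percentage decrease."""
    return (1 - fold) * 100


if __name__ == "__main__":
    dilutions = [10, 5, 2.5, 1.25, 0.625, 0.3125]  # ng total RNA, as in the text
    hypothetical_cts = [20, 21, 22, 23, 24, 25]     # illustrative values only
    e = amplification_factor(dilutions, hypothetical_cts)
    print(e, fold_change(e, 30, 24), percent_reduction(fold_change(e, 24, 27)))
```

A gene whose CT is lower in cancer than in normal gives a fold change above 1 (up-regulation); a higher CT in cancer gives a value below 1, which `percent_reduction` converts to a percentage decrease.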
-methyl-CoA-racemase (19), KIAA 1538 protein (20), human tyrosine kinase receptor axl (21), and PSGR (22) were not present in our data sets because they did not pass Fisher's exact test or did not have ESTs in the CM and NM pool.
Microarray and GeneChip contain a limited probe set, usually 5,000-6,000 known genes. We found that, except for the hypothetical protein FLJ21174, none of the other 8 up-regulated genes identified by our analysis were included in most microarray/GeneChip probe sets. In addition, microarray, GeneChip, and other profiling technologies usually require relatively large quantities of total RNA. As a consequence, bulk tissues (normal and cancerous) instead of LCM-captured cells are routinely used in the analysis. These profiling methods may identify genes that are not specific to normal or malignant prostatic epithelia. This may also add to the discrepancy in genes identified in our analysis when compared with other gene profiling methods. As a result, the Binary Indexing Search Algorithm offers additional insight into gene expression that other profiling studies may not be able to provide.
| DISCUSSION |
|---|
Some caution is warranted when viewing the output of the Binary Indexing Search Algorithm. For instance, because of the small size of some EST libraries from different organs or tissue types, a gene may be expressed in more organs or tissue types than the algorithm identifies. For example, CRISP-3 was found in four organs according to the computational calculation. However, CRISP-3 was found in seven different organs by multiple tissue dot blot, including salivary gland, pancreas, prostate, ovary, thymus, testis, and colon.8
Prostate cancer is the most commonly diagnosed noncutaneous malignancy and the second leading cause of cancer-related deaths in the Western male population (23). Currently, measurement of the serum PSA is the most sensitive biomarker for the detection of prostatic adenocarcinoma. However, an elevation in the serum PSA lacks specificity, and serum PSA may be elevated in common benign conditions of the prostate such as prostatitis and benign prostatic hypertrophy. Numerous studies have shown that only 25% of patients with an elevated serum PSA level between 4 and 10 ng/ml have an adenocarcinoma detected on prostate needle biopsy (24, 25). The lack of specificity for PSA results in unnecessary prostate needle biopsy procedures and patient anxiety. More specific biomarkers for prostate cancer are needed to improve our ability to detect prostate cancer.
Many studies, as well as our electronic profiling results, have shown that prostate cancer expression levels of PSA on a per cell basis do not change and may even decrease in high-grade prostate cancers.9
Our data demonstrate that the per-cell CRISP-3 mRNA level is significantly up-regulated in prostate cancer compared with normal tissue. In addition, there is evidence to indicate that CRISP-3 is a secretory protein. Therefore, CRISP-3 is a potential diagnostic marker for prostate cancer, and our subsequent studies will focus on the utility of CRISP-3 as a diagnostic marker for prostate cancer. Similar to PSA, CRISP-3 is also androgen responsive (26), although the functions of CRISP-3 and its close family members remain largely unknown.
| FOOTNOTES |
|---|
1 These authors contributed equally to this work.
2 To whom requests for reprints should be addressed, at Mayo Clinic and Foundation, 200 First Street, SW, Rochester, MN 55905. Phone: (507) 266-4617; Fax: (507) 266-5193; E-mail: vasmatzis.george{at}mayo.edu.
3 The abbreviations used are: SAGE, serial analysis of gene expression; EST, expressed sequence tag; LCM, laser capture microdissection; GLS, Gene Library Summarizer; DA, direct addressing; RT-PCR, reverse transcription-PCR; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; NM, normal microdissected; CM, cancer microdissected; CB, cancer bulk; NB, normal bulk; CRISP-3, cysteine-rich secretory protein 3; TARP, T-cell receptor gamma transcript; PSA, prostate-specific antigen.
4 Internet address: http://www.ncbi.nlm.nih.gov/dbEST/index.html.
5 Internet address: http://cgap.nci.nih.gov/Tissues/LibrarySummarizer.
6 Internet address: ftp://ncbi.nlm.nih.gov/repository/dbEST.
7 Internet address: http://www.mayo.edu/research/expath/prostate.html.
8 Y. W. Asmann, unpublished work.
Received 10/2/01. Accepted 4/5/02.
| REFERENCES |
|---|