Molecular Biology, Pathobiology, and Genetics |
Departments of 1 Neurology, 2 Bioinformatics, and 3 Pathology, Erasmus MC, Rotterdam, the Netherlands
Requests for reprints: Pim J. French, Department of Neurology, Erasmus MC, P.O. Box 2040, 3000CA Rotterdam, the Netherlands. Phone: 311-040-88333; Fax: 11-31-10-4088365; E-mail: p.french{at}erasmusmc.nl.
| Abstract |
|---|
67% of these candidate novel exons were confirmed by RT-PCR. Our results indicate that exon level expression profiling can be used to molecularly classify brain tumor subgroups, can identify differentially regulated splice variants, and can identify novel exons. The splice variants identified by exon level expression profiling may help to detect the genetic changes that cause or maintain gliomas and may serve as novel treatment targets. [Cancer Res 2007;67(12):5635–8]
| Introduction |
|---|
There is strong evidence that aberrant splice isoforms are involved in the initiation and/or progression of glial brain tumors (6). For example, glioblastomas with epidermal growth factor receptor (EGFR) amplification frequently (32 of 48) express EGFRvIII, a tumor-specific, ligand-independent, constitutively active isoform of EGFR that lacks exons 2 to 7 (7). Expression of this splice variant can induce glioma formation in mice (8) and is associated with response to EGFR kinase inhibitors in humans (9). Other (activating) aberrant EGFR splice variants are also frequently observed in gliomas (10). In addition, many nervous system cancer-related splice variants have been identified using a gene-centric approach (11–16) or a bioinformatical approach that screens public domain databases (17).
Because aberrant splice isoforms are involved in the initiation and/or progression of glial brain tumors, we initiated a screen to identify splice variants expressed in gliomas. Our screen was done by profiling the expression of virtually all known and predicted exons in the human genome (1.4 million). Splice variants were then inferred from the expression level of each exon relative to its transcript. Our results indicate that exon level expression profiling can classify brain tumor subgroups based on their histologic appearance, can identify differentially regulated splice variants, and can identify novel exons.
| Materials and Methods |
|---|
Nucleic acid isolation, cDNA synthesis, and array hybridization. Total RNA and genomic DNA were isolated from 20 to 40 cryostat sections of 40-µm thickness (50–100 mg) using Trizol (Invitrogen) according to the manufacturer's instructions (see also ref. 18). Total RNA was then further purified on RNeasy mini columns (Qiagen). RNA quality was assessed on a Bioanalyser (Agilent). High-quality RNA (i.e., RNA integrity number >7.0; ref. 19) was used for our experiments. rRNA reduction, first round double-strand cDNA synthesis, cRNA synthesis, second round single-strand (ss) cDNA synthesis, ss-cDNA fragmentation, and labeling were done according to the Affymetrix GeneChip Whole-Transcript Sense Target Labeling Assay manual. Affymetrix Human Exon 1.0 ST microarrays were hybridized overnight with 5 µg of biotin-labeled ss-cDNA.
Data analysis. A signal intensity estimate and a P value for each probe set were extracted from the arrays in Affymetrix ExACT 1.0 software using the PLIER and DABG algorithms, respectively. PLIER expression data were normalized using the quantile method in R statistical software v2.2.1. DABG P values allow calculation of the number of false-positive and false-negative probe sets at various PLIER expression level cutoff values. The results are summarized in Supplementary Fig. S1 and show that a PLIER expression level of 30 is close to the cutoff that results in the smallest number of falsely called probe sets at DABG P values of <0.05. A higher cutoff level, close to PLIER expression 70, seems to result in the smallest number of falsely called probe sets at the more stringent DABG P value of <0.01. All values were then imported into Omniviz v3.9 (Omniviz) software for further analysis. For each probe set, the geometric mean of the hybridization intensities of all samples from the patients was calculated, with expression values of <30 set to 30 (close to the optimal cutoff with the smallest number of falsely called probe sets at a DABG P value of <0.05).
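The normalization and flooring steps described above can be illustrated with a minimal NumPy sketch. It is not the R or Omniviz code used in the study; the function names `quantile_normalize` and `floored_geometric_mean` are hypothetical, and the input is assumed to be a matrix of PLIER values with probe sets in rows and samples in columns.

```python
import numpy as np

def quantile_normalize(x):
    """Give every sample (column) the same intensity distribution.

    x: probe sets (rows) x samples (columns). Ties are broken arbitrarily.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Reference distribution: mean of the k-th smallest value over all samples.
    reference = np.sort(x, axis=0).mean(axis=1)
    return reference[ranks]

def floored_geometric_mean(x, floor=30.0):
    """Set values below `floor` to `floor`, then return the floored matrix
    and the geometric mean of each probe set across samples."""
    floored = np.maximum(np.asarray(x, dtype=float), floor)
    geometric_mean = np.exp(np.log(floored).mean(axis=1))
    return floored, geometric_mean
```

The default floor of 30 mirrors the PLIER cutoff quoted in the text; any other value can be passed when a different DABG stringency is preferred.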
The expression level of each probe set in every sample was determined relative to the geometric mean and logarithmically transformed (base 2) to ascribe equal weight to gene expression levels. Deviation from the geometric mean reflects differential probe set expression. Pearson's correlation plots were generated using all probe sets that differed 4-fold from the geometric mean in at least one sample (97,175 probe sets in total, Fig. 1) or with DABG P < 0.01 in at least five samples (yielding virtually identical results, data not shown). Ordering of samples is done according to the algorithm present in Omniviz software as described (20). This method reveals patterns of homologous samples based on Pearson's correlation. The ordering algorithm sorts all samples into correlated blocks through an iterative process and starts with the most highly correlated pair of samples. Each sample is joined to a block, resulting in a correlation trend within a block. The most correlated samples are at the center of each block. The blocks are then positioned along the diagonal of the plot in a similarly ordered manner.
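As a rough illustration of the probe set selection and the correlation matrix behind such a plot, the following sketch computes log2 ratios to the geometric mean, keeps probe sets that differ at least 4-fold in one or more samples, and correlates the samples. It is an assumption-laden stand-in: the function names are hypothetical, and the Omniviz ordering algorithm itself is not reproduced.

```python
import numpy as np

def log2_ratio_to_geomean(x, floor=30.0):
    """log2 of each value relative to the probe set's geometric mean.

    x: probe sets (rows) x samples (columns); values below `floor` are raised to it.
    """
    floored = np.maximum(np.asarray(x, dtype=float), floor)
    geometric_mean = np.exp(np.log(floored).mean(axis=1, keepdims=True))
    return np.log2(floored / geometric_mean)

def variable_probe_sets(log_ratios, fold=4.0):
    """Boolean mask: probe sets that differ >= `fold` from the mean in >= 1 sample."""
    return np.any(np.abs(log_ratios) >= np.log2(fold), axis=1)

def sample_correlation(log_ratios, keep):
    """Pearson correlation between samples, using only the selected probe sets."""
    return np.corrcoef(log_ratios[keep].T)
```

Reordering the rows and columns of the resulting matrix so that correlated samples sit next to each other would be a separate step, which the text delegates to Omniviz.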
Because splice variant detection requires an accurate estimation of metaprobe sets, we used two independent approaches to calculate metaprobe set levels. The first metaprobe set levels were calculated using ExACT 1.0 software based on probe sets determined by Affymetrix. The second metaprobe set calculations required two iterations: We first determined metaprobe set levels by averaging all probe sets with PLIER expression levels >30, >50, or >80. We next hypothesized that differentially spliced exons will result in a metaprobe set level that is lower than when calculated using constitutive exons only. For example, an exon that is spliced out in subgroup A can reduce the metaprobe set level in subgroup A, so that constitutive exons are identified as differentially spliced out in subgroup B. Therefore, transcript levels should be calculated using only constitutively incorporated exons (i.e., exons that are not differentially spliced between defined subgroups). We defined those constitutive exons (probe sets) as those that are highly correlated (correlation coefficient >0.7, >0.8, or >0.9) with the first round transcript calculations. A total of five metaprobe set calculations were done using the following cutoff values: (a) PLIER 50, correlation 0.8; (b) PLIER 30, correlation 0.8; (c) PLIER 80, correlation 0.8; (d) PLIER 50, correlation 0.7; and (e) PLIER 50, correlation 0.9. This two-step metaprobe set calculation not only excludes differentially spliced exons but also excludes "nonlinear" probe sets (probe sets that are outside the linear detection range of arrays) and "a-specific" probe sets (probe sets that bear no relation to their transcript).
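The two-step recalculation can be sketched for a single transcript as follows. This is only an illustration under stated assumptions: the function name `two_step_metaprobe` is hypothetical, and a probe set is treated as passing the PLIER cutoff when its mean level across samples exceeds it, which the text does not specify.

```python
import numpy as np

def two_step_metaprobe(expr, plier_cutoff=50.0, corr_cutoff=0.8):
    """Recalculate a transcript (metaprobe set) level from its probe sets.

    expr: probe sets of ONE transcript (rows) x samples (columns).
    Returns the first-round level, the mask of probe sets treated as
    constitutive, and the second-round level.
    """
    expr = np.asarray(expr, dtype=float)
    # Step 1: average all probe sets above the expression cutoff
    # (assumption: judged on the mean across samples).
    expressed = expr.mean(axis=1) > plier_cutoff
    first_round = expr[expressed].mean(axis=0)
    # Step 2: keep probe sets that follow the first-round level closely.
    r = np.array([np.corrcoef(row, first_round)[0, 1] for row in expr])
    constitutive = expressed & (r > corr_cutoff)
    second_round = expr[constitutive].mean(axis=0)
    return first_round, constitutive, second_round
```

Calling the function with the five cutoff pairs listed above, for example `two_step_metaprobe(expr, 30.0, 0.8)`, would give the alternative metaprobe set levels; transcripts with no probe set above the cutoff would need separate handling.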
Statistical analysis was done using standard t tests. Identical filtering and statistical analysis were done on 10 randomized groups to test for type I errors and estimate the false-discovery rate.
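One way to picture the randomized-group control is the sketch below: it counts probe sets that pass a t statistic cutoff with the real group labels, repeats the count with shuffled labels, and takes the ratio as a rough false-discovery estimate. The cutoff on the t statistic, the pooled-variance form of the test, and the function names are assumptions made for illustration; the study's actual filtering is not reproduced here.

```python
import numpy as np

def t_statistics(values, is_group_a):
    """Student's t statistic (pooled variance) per row, group A versus group B.

    values: probe sets (rows) x samples (columns); is_group_a: boolean array per sample.
    """
    a, b = values[:, is_group_a], values[:, ~is_group_a]
    na, nb = a.shape[1], b.shape[1]
    pooled = ((na - 1) * a.var(axis=1, ddof=1)
              + (nb - 1) * b.var(axis=1, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled * (1 / na + 1 / nb))

def estimate_fdr(values, is_group_a, t_cutoff, n_random=10, seed=0):
    """Compare hits with real labels against hits with randomly shuffled labels."""
    rng = np.random.default_rng(seed)
    observed = int(np.sum(np.abs(t_statistics(values, is_group_a)) > t_cutoff))
    random_hits = [
        int(np.sum(np.abs(t_statistics(values, rng.permutation(is_group_a))) > t_cutoff))
        for _ in range(n_random)
    ]
    fdr = float(np.mean(random_hits)) / observed if observed else float("nan")
    return observed, random_hits, fdr
```

The default of 10 randomizations matches the number of randomized groups mentioned in the text; more shuffles would give a smoother estimate.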
Reverse transcription-PCR. Candidate differentially regulated splice variants identified by PAC analysis were analyzed by reverse transcription-PCR (RT-PCR) to confirm differential regulation. All RT-PCR experiments were done on cDNA that was reverse transcribed independently of the cDNA used for array hybridization. rRNA-depleted (ribo-minus) total RNA (0.5 µg; the remainder of the RNA that was used for array hybridization) was reverse transcribed for 1 h at 42°C in the presence of 200 units of Superscript II, 50 ng T7-(N)6 primers, 0.5 mmol/L deoxynucleotide triphosphates, 10 mmol/L DTT, and RNase inhibitor. Primers were designed using Primer3 (see footnote 5) and are listed in Supplementary Table S2. Amplified PCR products from novel exon analysis were sequence verified using the Big Dye Terminator Cycle Sequencing kit (Applied Biosystems). Reactions were run on an ABI 3100 genetic analyzer.
| Results |
|---|
Human Exon 1.0 arrays carry 1.4 million probe sets (a probe set is a set of up to four oligonucleotide probes that examines the expression of a single exon): 284,000 core, 523,000 extended, and 580,000 full probe sets. Multiple probe sets may be directed against the same exon, thus allowing identification of alternative splice-acceptor or splice-donor sites. Exon arrays also allow calculation of whole-transcript levels based on the expression level of probe sets that belong to the same transcript. Calculated transcript levels are called metaprobe set levels. In our experiments, 23.7 ± 4.5% of all 1.4 million probe sets were significantly expressed (DABG P < 0.01). Core exons are detected at higher signal intensities than extended and full exons (Supplementary Fig. S2). Individual sample performance for all array quality control variables is stated in Supplementary Table S3. This platform has thus far not been characterized, and we therefore first validated the performance of these arrays using unsupervised clustering analysis. Unsupervised clustering was done using probe sets with PLIER expression levels of >30 that differed 4-fold from the geometric mean in at least one sample (Fig. 1). A first subgroup (I) consists of all control samples and GBM 77, a sample that contained a low amount (<10%) of tumor. A second subgroup (II) consists of most (20 of 22) of the oligodendrogliomas with LOH on 1p and 19q. The final subgroup (III) predominantly (25 of 27) consists of glioblastomas but also includes two oligodendrogliomas with 1p and 19q LOH (OD20 and OD170). Interestingly, OD20 also did not cluster with the majority of oligodendrogliomas with 1p/19q LOH using expression profiling on HU133 plus 2 microarrays (18). Identical subgroups were identified by principal component analysis, using all core probe sets or core metaprobe sets (Supplementary Fig. S2).
Unsupervised clustering therefore indicates that exon expression profiling can identify brain tumor subgroups based on their histologic appearance. Our data therefore confirm the observation that histologically defined glioma subgroups are molecularly distinct (for review, see ref. 21) and indicate that, on a global scale, this novel platform performs similarly to other expression profiling platforms.
Identification of differentially regulated splice variants. We next examined whether Human Exon arrays can detect glioma subgroup-specific splice variants. The identification of splice variants was done using PAC. PAC values represent a predicted level of expression for each probe set. Therefore, differences between PAC and expression values are indicative of alternative splicing. Negative values predict that the exon is, compared with the other 53 samples, being spliced out. However, PAC requires complete linearity of all probe sets within a single transcript: if a transcript is up-regulated 2-fold in one subgroup, all of the probe sets that belong to this transcript should be up-regulated exactly 2-fold. Any probe set that does not exhibit this linearity in expression detection (nonlinear probe sets) or bears no correlation whatsoever with its native transcript (a-specific probe sets) will be identified as a false-positive differentially spliced candidate. Examples of such nonlinear and a-specific probe sets are shown in Supplementary Fig. S3. Any strategy to identify differentially expressed splice variants therefore requires filtering out nonlinear and a-specific probe sets.
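The linearity requirement can be made concrete with a small sketch. The exact PAC formula is not given in this section, so the code below only assumes one simple form of prediction: on a log2 scale, a probe set is expected to rise and fall by the same amount as its metaprobe set, and the observed value minus that prediction is the deviation. The function name `deviation_from_prediction` is hypothetical.

```python
import numpy as np

def deviation_from_prediction(probe, meta):
    """Observed minus predicted log2 probe set level, per sample.

    probe, meta: expression of one probe set and of its metaprobe set across samples.
    Assumed prediction: the probe set's mean log2 level, shifted per sample by the
    metaprobe set's deviation from its own mean (perfect linearity).
    """
    log_probe = np.log2(np.asarray(probe, dtype=float))
    log_meta = np.log2(np.asarray(meta, dtype=float))
    predicted = log_probe.mean() + (log_meta - log_meta.mean())
    return log_probe - predicted
```

Under this assumption a probe set that tracks its transcript exactly gives zeros everywhere, whereas a sample in which the exon is under-represented gives a negative value, in line with the interpretation of negative values described above.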
We adopted two independent strategies to identify candidate splice variants that are differentially regulated between oligodendrogliomas and glioblastomas. In the first strategy, we calculated PAC values for every probe set in all samples using metaprobe sets predetermined by Affymetrix. For our second strategy, we calculated PAC values using recalculated metaprobe set expression levels (see Materials and Methods), with metaprobe set levels (and subsequent PAC values) derived at varying PLIER expression level and/or correlation coefficient cutoff values. We then aimed to exclude nonlinear and a-specific probe sets using the filtering steps outlined in Fig. 2 and Table 1. These filtering steps resulted in a final set of 49 (first strategy) and 254 to 459 (second strategy) candidate differentially regulated splice variants. Table 1 summarizes the results at each step in our strategy to identify candidate splice variants. Supplementary Table S4 contains a list of all candidates.
Altering the variables used for metaprobe set calculation often resulted in significant overlap between the candidates identified: many candidates identified at cutoff values PLIER 50 and correlation coefficient 0.8 are also found when the PLIER expression cutoff is reduced to 30 (88%) or increased to 80 (83%), or when the correlation cutoff is reduced to 0.7 (93%). In contrast, increasing the correlation cutoff to 0.9 results in a set of candidates that contains only 50% of the probe sets identified by PLIER 50, correlation 0.8, with 46 additional probe sets identified.
We did RT-PCR using exon-spanning primers to confirm the differential expression of candidate splice variants. RT-PCR was done on 15 candidates from the first screen and 21 candidates from the second screen (PLIER 50, correlation 0.8). Candidates for RT-PCR were randomly selected from all candidates, omitting those with alternative 5'- or 3'-end exons. We confirmed 7 of 15 (47%) from the first screen and 7 of 21 (33%) from the second analysis (Fig. 2). Three of the confirmed candidates were identified in both analyses; the total number of differentially expressed splice variants equaled 11. All differentially expressed splice variants belonged to the core probe set list. Public domain databases (ENSEMBL, UCSC, HOLLYWOOD) also indicated that most (9 of 11) RT-PCR-confirmed candidates are subject to alternative splicing. It is possible that the percentage of regulated splice variants is higher than the RT-PCR-confirmed 33% to 47%: rare splice variants or splice variants that show only minor differential regulation may not have been detected by RT-PCR. Nevertheless, our results show that exon level expression profiling can identify splice variants that are differentially regulated between histologically defined subgroups of gliomas.
Identification of novel exons. We finally examined whether Human Exon arrays can be used to identify novel exons. We screened for novel exons using the full probe set list (580,000 probe sets) because all full exons lack evidence for expression in public domain databases. Full probe sets are composed of exons that can be predicted (e.g., based on the presence of consensus splice acceptor and donor sites) and of sequences that are conserved between human, mouse, and rat. Candidate novel exons met the following criteria (see Fig. 3): (a) show significant expression (PLIER expression levels ≥50); (b) are part of a core metaprobe set, as many full probe sets are part of poorly characterized and single-exon transcripts; and (c) should have a high (>0.8) correlation coefficient with their metaprobe set (i.e., the probe set is highly expressed in those samples in which the metaprobe set is highly expressed). These criteria resulted in a final set of 715 full probe sets as candidate novel exons. More candidates are identified using less stringent criteria (exon/transcript correlation ≥0.7 identifies 1,482 full exons). In silico analysis of the first 158 full probe sets confirmed that 127 of 158 (80%) are indeed novel exons; they are not present in the RefSeq database and no spliced EST has thus far been identified. Of the remaining probe sets, 18 of 158 (11%) were incorrectly annotated and are in fact part of a RefSeq gene, and 13 of 158 (8%) were identified as part of (rare) spliced ESTs.
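The three selection criteria can be written down as a simple filter. The sketch below is illustrative only: the record type `FullProbeSet`, its field names, and the example identifiers are hypothetical, and treating the expression cutoff as inclusive is an assumption.

```python
from dataclasses import dataclass

@dataclass
class FullProbeSet:
    probe_set_id: str                  # hypothetical identifier
    plier_level: float                 # PLIER expression level of the probe set
    in_core_metaprobe_set: bool        # criterion (b): part of a core metaprobe set
    correlation_with_metaprobe: float  # Pearson r with the metaprobe set level

def is_candidate_novel_exon(ps, min_plier=50.0, min_corr=0.8):
    """Apply criteria (a) to (c) from the text to one full probe set."""
    return (
        ps.plier_level >= min_plier                    # (a) significant expression (inclusive: assumption)
        and ps.in_core_metaprobe_set                   # (b) part of a well-annotated transcript
        and ps.correlation_with_metaprobe > min_corr   # (c) tracks its transcript
    )

def select_candidates(probe_sets, min_plier=50.0, min_corr=0.8):
    """Return the probe sets that meet all three criteria."""
    return [ps for ps in probe_sets if is_candidate_novel_exon(ps, min_plier, min_corr)]
```

Lowering `min_corr` to 0.7 corresponds to the less stringent setting mentioned above, which enlarges the candidate list.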
RT-PCR also confirmed the expression of 3 of 3 (100%) full exons that, in public domain databases, were part of rare spliced ESTs. All three exons could be identified in all examined samples. For KHDRBS2 and DTNA, RT-PCR was done using exon-spanning primers; for PDE1C, RT-PCR was done with the forward primer in the candidate novel exon because the novel exon may represent a novel 5' exon. Identification of transcripts that have incorporated the novel exon using exon-spanning primers suggests that a significant percentage of transcripts have incorporated the full exon in adult brain (Fig. 3B).
| Discussion |
|---|
The molecular subgroups identified using exon level expression profiling are highly similar to the subgroups that are identified in other studies using 3' biased expression profiling (18, 22–27). Our data therefore confirm the observation that histologically defined glioma subgroups are molecularly distinct (for review, see ref. 21). Furthermore, the similarity in glial tumor classification indicates that, at least on a global scale, this novel platform performs similarly to other expression-profiling platforms.
The additional complexity of exon level expression profiling over transcript-level expression profiling brings the ability to identify splice variants that are differentially expressed between tumor subgroups. Our data indicate that the identification of differentially expressed splice variants requires rigorous filtering steps to exclude nonlinear and a-specific probe sets. In the two independent approaches we adopted, we identified 49 and 254 to 459 candidate splice variants that are differentially expressed between OD and GBM. The list of candidates differs significantly between the two approaches. Furthermore, candidates identified by our second approach (recalculated metaprobe set level) depend on the inclusion criteria used to recalculate metaprobe set levels. It remains to be determined which variables are optimal for splice variant detection. However, all candidate lists generated by our second approach contain a similar percentage of known splicing events (~12%; range, 10.4–13.8%; see Supplementary Table S4), as determined by screening public domain databases on a subset of candidates.
RT-PCR confirmed the differential regulation of a subset of these candidate splice variants. The select number of differentially expressed splice variants identified by us may reflect the similarity in splice variant expression between OD and GBM. Indeed, a limited number (591) of differentially expressed splice variants between mouse brain and immune tissue were identified by Ule and coworkers using exon-junction arrays (28). In contrast, experimental evidence exists for the regulated expression of a large number of splice variants: many splice variants show some degree of tissue specificity (29–31). It is therefore also possible that the strong filtering used in this study has led to the identification of only a subset of differentially regulated splice variants.
The differential expression of splice variants between two tumor subtypes may be caused by a differential expression of proteins that regulate alternative splicing. Indeed, a large number of proteins have been identified to play a role in the regulation of alternative splicing (for review, see refs. 32–34). However, the expression of glioma subgroup-specific splice variants may also be a result of genetic changes. For example, glioblastomas with EGFR amplifications frequently carry an intragenic deletion of exons 2 through 7, resulting in expression of the tumor-specific, constitutively active EGFRvIII isoform (35). Such aberrant splice isoforms have been shown to play a role in the initiation and/or progression of glial brain tumors (6). Identifying glioma-specific splice variants may therefore help identify the causative genetic changes of glial brain tumors.
Apart from exon expression arrays, other techniques have been used to analyze splice variant expression. These include exon-junction arrays (36), RNA-mediated annealing, selection and ligation (37), and digital polony (polymerase colony) exon profiling (38). Recently, arrays containing a combination of exon expression and exon junction probes have also been used to identify alternative splicing events (39, 40). Although all approaches can detect alternative splicing events, many are limited either by screening on a predetermined set of exon junctions or by screening on a per-gene basis. Our data show that exon expression profiling is a suitable alternative for genome-wide screening of regulated splicing events between two distinct subgroups.
Our study has also identified 715 full exons that are expressed as part of a well-annotated transcript. In silico analysis (screening public domain databases) of a subset of candidates indicated that 80% are indeed novel exons; they are not present in the RefSeq database and no spliced EST has thus far been identified. We confirmed the expression of ~67% by RT-PCR, suggesting a total of ~446 (0.78*0.8*715) novel exons are expressed as part of a well-annotated transcript. Candidates that were not confirmed by RT-PCR (33%) may be falsely identified, for example when the exon array detects unspliced, pre-mRNA species (see e.g., ref. 41). The majority (5 of 6) of RT-PCR-confirmed novel exons are expressed in normal adult human brain, indicating they are not aberrant, cancer-specific splice isoforms. Furthermore, most (5 of 6) of the RT-PCR-confirmed novel exons result in changes at the protein level: the novel exons are often found within the protein coding region.
Many of the full probe sets on the Human Exon arrays are based on evolutionary sequence conservation between human, mouse, and rat. Other studies have also found novel exons based on such sequence conservation. For example, ~150 candidate novel human exons were identified in a screen based on the expression of ESTs in mouse/rat (42). Furthermore, a bioinformatical approach using sequence conservation has identified up to 2,300 novel, rodent-specific exons (43). In a separate study, bioinformatical analysis based on exon expression profiles from adult mouse tissue has suggested the presence of a large number (40–70,000) of novel exons (44). Although our study identified fewer novel exons, both studies argue for the presence of novel exons in human/mouse genomes and that such novel exons can be identified using exon expression profiling.
In summary, our results indicate that exon level expression profiling can be used to molecularly classify brain tumor subgroups, can identify differentially regulated splice variants, and can identify novel exons. The splice variants identified by exon level expression profiling may lead to the identification of causative genetic changes in glial brain tumors. Furthermore, glioma-subgroup specific splice variants may serve as novel treatment targets.
| Acknowledgments |
|---|
| Footnotes |
|---|
4 CBTRUS 2004–2005 statistical report (www.cbtrus.org).
5 http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi
Received 8/2/06. Revised 3/9/07. Accepted 3/26/07.
| References |
|---|