MicroRNAs (miRNA) are small noncoding RNA molecules involved in a diversity of cellular functions. Although it has been reported that global suppression of the miRNA biogenesis pathway leads to enhanced tumorigenesis, the effect of common genetic variants of miRNA-related genes on cancer predisposition is unclear. To better understand this effect, we genotyped 41 single-nucleotide polymorphisms (SNP) from 24 miRNA-related genes in a case-control study conducted in 746 Caucasian patients with bladder cancer and 746 matched controls. The homozygous variant genotype of a nonsynonymous SNP in the GEMIN3 gene (rs197414) was associated with a significantly increased bladder cancer risk [odds ratio (OR), 2.40; 95% confidence interval (95% CI), 1.04–5.56]. Several additional miRNA-related SNPs were also identified that showed a borderline significant association with bladder cancer risk. Haplotype analysis indicated that a common haplotype of the GEMIN4 gene was associated with a significantly increased bladder cancer risk with an OR of 1.25 (95% CI, 1.01–1.54). To assess the aggregate effects of the promising SNPs, we performed a combined unfavorable genotype analysis that included all SNPs showing at least a borderline statistical significance. We found that, compared with the low-risk reference group with fewer than two unfavorable genotypes, the medium-risk group with two unfavorable genotypes exhibited a 1.29-fold (0.92–1.81) increased risk whereas the high-risk group with more than two unfavorable genotypes exhibited a 1.92-fold (1.36–2.71) increased risk (Ptrend < 0.0001). Overall, this is the first epidemiologic study showing that miRNA-related genetic variants may affect bladder cancer risk individually and jointly. [Cancer Res 2008;68(7):2530–7]
Introduction
MicroRNAs (miRNA) are a group of endogenous, small, noncoding RNA molecules of ∼22 nucleotides (1). To date, more than 500 human miRNAs have been recorded in the miRBase registry, with the total number predicted to be at least 800 (2, 3). It has been conjectured that miRNAs regulate the expression of approximately one third of human genes (4, 5). The interaction of miRNAs and target genes is intricately regulated, in that one miRNA may modulate multiple target genes whereas one target gene may be regulated by various miRNAs.
MiRNAs negatively affect the expression level of their target genes through two distinctive mechanisms, depending on the degree of their complementarity to target sequences ( 6). In the first mechanism, a perfect or near-perfect match between miRNAs and their binding sequences within the 3′ untranslated regions (UTR) of their target mRNAs induces the RNA-mediated interference pathway. The RNA-induced silencing complex then recognizes the miRNA-mRNA interaction and cleaves the mRNA through an endonuclease activity. In the second mechanism, miRNAs control gene expression at the translational level through imperfect target matching ( 6, 7).
MiRNAs have been implicated in a wide diversity of basic cellular functions, such as organ development ( 8), insulin secretion ( 9), muscle differentiation ( 10), immune response ( 11), and cardiac regulation ( 12). Moreover, miRNAs have been extensively associated with the etiology and clinical outcome of human cancers ( 6, 13, 14). MiRNAs influence tumorigenesis through their regulation of specific proto-oncogenes and tumor suppressor genes. For instance, the let-7 miRNA inhibits lung tumorigenesis by repressing the expression of the RAS oncogene ( 15). The transcriptional factor E2F1 is also negatively regulated by two miRNAs, miR-17-5p and miR-20a, in the polycistronic mir-17-92 cluster, the expression of which is controlled by the c-MYC onco-protein ( 16). In addition, by inhibiting the expression of the LATS2 tumor suppressor gene, mir-372 and mir-373 promote the transformation of human testicular germ cell tumors ( 17). Moreover, the large-scale profiling of miRNA expressions using microarray or real-time PCR techniques has revealed significant associations between miRNA expression signatures and the etiology, early diagnosis, molecular classification, and prognosis of various cancers ( 4, 18– 20).
MiRNAs are generated in a precisely coordinated two-step pathway. Most miRNAs reside in intergenic or intronic regions and are transcribed as part of a long transcript by RNA polymerase II (6). These primary miRNA transcripts (pri-miRNA) are processed in the nucleus by the microprocessor machinery, which contains the Drosha RNase and the double-strand RNA binding protein DGCR8 (21). A hairpin precursor miRNA molecule of 70 to 100 nucleotides (pre-miRNA) is then produced, which translocates to the cytoplasm with the assistance of RAN GTPase and Exportin 5 (XPO5), where it is further processed by a protein complex that includes DICER, TRBP, AGO1, and AGO2, leading to the production of mature miRNAs (22, 23). The global or specific deregulation of key genes in the miRNA biogenesis pathway has been associated with malignant transformation (6, 22, 24).
Although genetic polymorphisms have been widely implicated in cancer development and treatment response (25, 26), such evidence is lacking for the miRNA-related genes. SNPs in miRNA-containing genomic regions have been reported to be rare and unlikely to be functionally important (27, 28). However, Duan et al. (29) identified a SNP within the seed region of miR-125a that significantly affected the production of miR-125a precursors. A SNP in the precursor stem-loop of miR-K5, a miRNA encoded by the human Kaposi's sarcoma-associated herpesvirus, has also been found to influence Drosha processing (30). These lines of evidence suggest that sequence variations can affect the expression or function of their host miRNAs. However, as yet, no study has shown an association between polymorphisms in the miRNA biogenesis pathway genes and cancer incidence.
In this case-control study, we tested the hypothesis that common sequence variants in genes of miRNA and of the miRNA biogenesis pathway affect bladder cancer susceptibility. We used a polygenic approach to evaluate the haplotypes and combined effects of 41 potentially functional miRNA-related SNPs on bladder cancer risk. To our knowledge, this is the first study of the involvement of polymorphisms in miRNA processing pathway genes in cancer predisposition.
Materials and Methods
Study population and epidemiologic data. The study population has been described elsewhere (26). Briefly, the population consisted of patients with newly diagnosed and histopathologically confirmed bladder cancer accrued at The University of Texas M. D. Anderson Cancer Center and Baylor College of Medicine. The control subjects were selected from a large control pool recruited through a collaboration with the Kelsey-Seybold Clinic, the largest private multi-specialty physician group in the Houston metropolitan area, consisting of 23 clinics and more than 300 physicians. Potential control subjects were identified through a short questionnaire administered during registration when they visited the Kelsey-Seybold Clinic; the questionnaire determined their willingness to participate in the study and provided preliminary demographic data for matching purposes. They were then contacted by telephone at a later date to confirm their willingness to participate and to schedule an interview at a Kelsey-Seybold clinic. On the day of the interview, the potential control subject came to the clinic specifically for the purpose of study participation. Controls had no cancer history (except nonmelanoma skin cancer) and were frequency matched to cases on age, gender, and ethnicity. This control selection strategy has been well described and proved to be feasible and effective for molecular epidemiologic studies in which population-based control selection poses a practical challenge (31). All subjects were interviewed using a structured questionnaire. Each participant had a 40-mL blood sample drawn into a coded, heparinized tube, which was sent to the laboratory for immediate molecular analysis. Laboratory personnel were blinded to case-control status. All participants signed written informed consent forms, and human subject approval was obtained from both M. D. Anderson and Kelsey-Seybold institutional review boards.
Selection of genes and polymorphisms. Through an extensive mining of the databases of the International HapMap Project ( 32), dbSNP ( 33), and miRBase registry ( 3), we identified 41 potential functional polymorphisms: 24 SNPs in eleven genes in the miRNA biogenesis pathway, 7 SNPs in seven pre-miRNAs, and 10 SNPs in eight pri-miRNAs ( Table 1 ). All SNPs have a reported minor allele frequency (MAF) of >0.01 in Caucasians. In the miRNA biogenesis pathway, except for two AGO1 SNPs (rs636832 and rs595961) located in introns, all other polymorphisms reside in functional regions, including exons, UTRs, and promoters (within 2 kb of the genes). In the case of multiple potentially functional SNPs within the same haplotype block (defined by the linkage coefficient r2 > 0.8), only one SNP was included. Except for GEMIN4 rs7813, none of these SNPs have been reported in previous studies. All SNPs identified from the pre-miRNA regions were included if the MAF was >0.01 in Caucasians. For SNPs in pri-miRNAs but not in pre-miRNAs, because we identified more than 200 such SNPs with an MAF of >0.01 in Caucasians, we included 10 SNPs from eight pri-miRNAs whose mature counterparts have been extensively implicated in cancer etiology or clinical outcome.
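The r2 > 0.8 criterion used above to treat SNPs as belonging to the same haplotype block can be illustrated with a minimal Python sketch. The function name and the haplotype and allele frequencies below are hypothetical and chosen only for illustration; they are not taken from HapMap or from this study.

```python
def pairwise_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles A and B at two biallelic SNPs.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical frequencies, for illustration only.
r2 = pairwise_r2(p_ab=0.28, p_a=0.30, p_b=0.30)
same_block = r2 > 0.8  # threshold stated in the text
```

Under this criterion, two SNPs with r2 above the threshold would be regarded as largely redundant, and only one of them would be carried forward for genotyping.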
Genotyping. All polymorphisms were genotyped using the SNPlex assay according to the manufacturer's instructions (Applied Biosystems). SNPlex is a high-throughput genotyping approach that makes use of a set of preoptimized, universal assay reagents together with customized probe sets to generate genotyping data through oligonucleotide ligation, exonuclease purification, PCR amplification, probe hybridization, and capillary electrophoresis ( 34). Briefly, we customized and submitted a list of miRNA-related SNPs to Applied Biosystems. Based on this list, a pool of allele-specific oligonucleotide (ASO) probes and locus-specific oligonucleotide (LSO) probes was designed. Genomic DNA was fragmented at 99°C for 10 min and hybridized to a mixture of the probe pool and universal linkers that contain universal PCR primer-binding sequences, as well as sequences complementary to ASO and LSO probes. On perfect matching to the sequence at the target SNP site in the genomic DNA, ASO and LSO probes were ligated and the universal linkers were linked to the distal termini of the ASO and LSO probes. The mixture of unligated probes, linkers, and genomic DNAs was purified after exonuclease enzymatic digestion and amplified by PCR using a pair of universal PCR primers, one of which was biotinylated. The biotinylated amplicons were denatured and bound on streptavidin-coated microtiter plates. After the removal of nonbiotinylated strands, the single-stranded PCR amplicons were hybridized with a set of fluorescently labeled, mobility-modified ZipChute probes (Applied Biosystems), eluted into capillary electrophoresis buffer, and analyzed on an Applied Biosystems 3730 DNA Analyzer. Genotypes were called by GeneMapper software (Applied Biosystems) using a template file provided with each custom SNPlex assay. Internal quality controls and negative controls were used to ensure genotyping accuracy, and 5% of all samples were randomly selected and genotyped in duplicate with 100% concordance.
Statistical analysis. Due to the small number of minority participants, we limited all our analyses to Caucasians. Statistical analyses were done using Intercooled STATA software (STATA Corp.). χ2 analysis was used to assess the differences between cases and controls with regard to categorical variables such as gender and smoking status. Student's t test was used to test for continuous variables, including age and pack-years. The Hardy-Weinberg equilibrium was tested using a goodness-of-fit χ2 analysis. The bladder cancer risks were estimated as odds ratios (OR) and 95% confidence intervals (95% CI) using unconditional multivariate logistic regression adjusted for age, gender, smoking status, and pack-years, where appropriate. The definitions of smoking status were the same as those that have previously been described ( 35). Haplotypes were inferred using the expectation-maximization algorithm implemented in the HelixTree software (Golden Helix, Inc.). Haplotypes with a probability of <95% were excluded from the final analysis. The adjusted OR and 95% CI for each haplotype were assessed using multivariate logistic regression under a 1 degree of freedom model that, for each haplotype, combines all other haplotypes as the reference group. The unfavorable genotype analysis included those SNPs showing at least a borderline statistical significance in the main analysis. The unfavorable genotypes were collapsed together and categorized according to the tertiles (low, medium, and high risk) of the number of unfavorable genotypes in controls. Using the low-risk group as the reference group, we calculated the ORs and 95% CIs for the medium-risk and high-risk groups using unconditional multivariate logistic regression adjusted for age, gender, and smoking status. All P values were two sided. P ≤ 0.05 was considered the threshold of statistical significance.
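As an illustration of the Hardy-Weinberg goodness-of-fit χ2 test mentioned above, the following Python sketch compares observed genotype counts with those expected from the estimated allele frequencies. It is not the STATA procedure used in the study; the function name and the genotype counts are hypothetical, and the sketch assumes a biallelic SNP with both alleles present.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Goodness-of-fit chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, P value). Assumes both alleles are observed.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts in controls, for illustration only.
chi2, p_value = hwe_chi_square(500, 200, 46)
```

A SNP whose P value falls below the chosen threshold would be flagged as deviating from equilibrium.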
Results
Characteristics of the study population. The final study population consisted of 1,492 Caucasians, composed of 746 bladder cancer patients and 746 cancer-free controls (Table 2). No significant differences were identified between cases and controls with regard to age [cases versus controls (mean ± SD), 63.5 ± 10.9 versus 63.3 ± 10.6 years; P = 0.70] and gender (P = 1.00). As expected, there was a significantly higher percentage of ever smokers among cases (73.5%) than among controls (55.1%; P < 0.001). Among ever smokers, cases also reported a significantly greater cigarette consumption than did controls, as assessed by the mean number of pack-years (cases versus controls, 42.7 ± 29.9 versus 30.3 ± 28.1; P < 0.001).
Main effects and stratified analyses by individual polymorphisms. The associations of the 41 SNPs with bladder cancer risk are listed in Supplementary Table S1. The genotyping completion rate ranged from 90% to 99% for all SNPs except for DROSHA rs10719 (65%) and was similar between cases and controls (average rate, 96.5% for cases and 97.2% for controls). Three SNPs (DROSHA rs10719, RAN rs14035, and let7f-2 rs17276588) showed a significant deviation from the Hardy-Weinberg equilibrium and were excluded from further analyses. In all, seven SNPs exhibited at least a borderline significant association with bladder cancer risk under either a dominant model (variant-containing genotypes versus homozygous wild-type genotype) or a recessive model (homozygous variant genotype versus wild-type-containing genotypes; Table 3 and Supplementary Table S1). Among them, the AA genotype of GEMIN3 rs197414 was associated with a 2.5-fold (95% CI, 1.08–5.78; P = 0.03) increased risk when compared with the combined CC/CA genotypes. In stratified analyses, this risk remained significant in young subjects (OR, 3.19; 95% CI, 1.00–10.19; P = 0.05) and light smokers (OR, 2.97; 95% CI, 1.01–24.31; P = 0.05; Table 4 ). Moreover, an altered risk association was also identified for TRBP rs784567 in young subjects (OR, 0.69; 95% CI, 0.48–0.98; P = 0.04) and ever smokers (OR, 0.74; 95% CI, 0.54–1.00; P = 0.05), for mir423 rs6505162 in males (OR, 1.34; 95% CI, 1.00–1.79; P = 0.05), for mir492 rs2289030 in females (OR, 2.67; 95% CI, 1.26–5.62; P = 0.01), for mir26a-1 rs7372209 in females (OR, 0.36; 95% CI, 0.13–0.94; P = 0.04), and for mir124-1 rs531564 in old subjects (OR, 4.85; 95% CI, 1.02–23.01; P = 0.05).
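The dominant and recessive models defined above differ only in how the three genotypes are collapsed into two groups. The Python sketch below shows that coding together with an unadjusted OR and a Woolf (log-based) 95% CI. It is a simplification: the study estimated ORs by multivariate logistic regression adjusted for age, gender, smoking status, and pack-years, which this sketch does not do. The function names and genotype counts are hypothetical.

```python
import math


def collapse(counts: dict[str, int], model: str) -> tuple[int, int]:
    """Split genotype counts into (exposed, reference) under a genetic model.

    counts maps "WW", "WM", "MM" (W, wild-type allele; M, variant allele)
    to numbers of subjects.
    """
    if model == "dominant":   # WM + MM versus WW
        return counts["WM"] + counts["MM"], counts["WW"]
    if model == "recessive":  # MM versus WW + WM
        return counts["MM"], counts["WW"] + counts["WM"]
    raise ValueError(f"unknown model: {model}")


def crude_odds_ratio(
    cases: dict[str, int], controls: dict[str, int], model: str
) -> tuple[float, float, float]:
    """Unadjusted OR with a Woolf 95% CI; all four cells must be nonzero."""
    a, b = collapse(cases, model)     # exposed / reference cases
    c, d = collapse(controls, model)  # exposed / reference controls
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, lower, upper


# Hypothetical genotype counts, for illustration only.
cases = {"WW": 400, "WM": 280, "MM": 20}
controls = {"WW": 420, "WM": 270, "MM": 10}
recessive = crude_odds_ratio(cases, controls, "recessive")
dominant = crude_odds_ratio(cases, controls, "dominant")
```

Because a recessive comparison isolates the usually small homozygous variant group, its CI tends to be wide, as seen for several of the stratified estimates reported above.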
Haplotype analyses. Table 5 summarizes the relative risks associated with the common haplotypes of genes in this study. The only haplotype associated with bladder cancer risk was the H3 (WMMMWWW; W, wild-type allele; M, variant allele) haplotype of the GEMIN4 gene consisting of a promoter SNP and six nonsynonymous SNPs in the following order: rs910924, rs2740348, rs7813, rs910925, rs3744741, rs1062923, and rs4968104. Compared with the reference group combining all other GEMIN4 haplotypes, this haplotype was associated with a 1.25-fold (95% CI, 1.01–1.54; P = 0.04) increased bladder cancer risk.
Combined effects of the unfavorable genotypes. We further evaluated the combined effects of the high-risk genotypes on bladder cancer risk by collapsing the unfavorable genotypes of the seven risk-conferring SNPs shown in Table 3. We found a progressively increased gene-dosage effect when the subjects were grouped on the basis of an increasing number of unfavorable genotypes ( Table 6 ). That is, compared with the low-risk group consisting of subjects with less than two unfavorable genotypes, the medium-risk group with two unfavorable genotypes was at a 1.29-fold (95% CI, 0.92–1.81; P = 0.14) increased risk whereas the high-risk group with more than two unfavorable genotypes was at a 1.92-fold (95% CI, 1.36–2.71; P < 0.0001) increased risk (Ptrend < 0.0001; Table 6).
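The grouping described above can be sketched in Python as follows. The cut points (fewer than two, two, and more than two unfavorable genotypes) follow the text; the function names, the placeholder SNP identifiers other than rs197414, and the example subject are hypothetical.

```python
def count_unfavorable(
    genotypes: dict[str, str], unfavorable: dict[str, set[str]]
) -> int:
    """Count the SNPs at which a subject carries an unfavorable genotype."""
    return sum(
        1
        for snp, genotype in genotypes.items()
        if genotype in unfavorable.get(snp, set())
    )


def risk_group(n_unfavorable: int) -> str:
    """Map a count of unfavorable genotypes to a risk group."""
    if n_unfavorable < 2:
        return "low"
    if n_unfavorable == 2:
        return "medium"
    return "high"


# rs197414 AA is the at-risk genotype reported in the text; "snp_b" and
# "snp_c" are hypothetical placeholders (W, wild-type; M, variant allele).
unfavorable = {
    "rs197414": {"AA"},
    "snp_b": {"WM", "MM"},  # hypothetical dominant-model risk genotypes
    "snp_c": {"MM"},        # hypothetical recessive-model risk genotype
}
subject = {"rs197414": "AA", "snp_b": "WM", "snp_c": "WW"}
group = risk_group(count_unfavorable(subject, unfavorable))
```

The resulting group label would then enter the logistic regression as the exposure variable, with the low-risk group as the reference.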
Discussion
In this study, in which we assessed the effects of 41 SNPs in genes of the miRNA biogenesis pathway, pre-miRNAs, and pri-miRNAs on bladder cancer predisposition, we found that a nonsynonymous SNP in GEMIN3 and a common haplotype of GEMIN4 were associated with a significantly increased bladder cancer risk. We also identified several additional miRNA-related SNPs showing a borderline significant association with bladder cancer risk. In addition, we showed that the combined unfavorable genotypes of selected SNPs might be used jointly to predict bladder cancer risk. To our knowledge, this is the first study to evaluate the associations of miRNA-related polymorphisms and cancer susceptibility.
A few studies have been done that examined sequence variations in miRNA regions. For example, Iwai and Naraba ( 36) sequenced 173 human pre-miRNA genomic regions in 96 subjects and identified 10 polymorphisms. They suspected that an A/C SNP in the mature miR-30c-2 may alter target gene selection and thus have biological consequences ( 36). Through an extensive database interrogation, Saunders et al. ( 27) identified 65 SNPs in 474 pre-miRNAs. However, many of these SNPs may not actually be important to population genetics due to the lack of frequency data ( 27), which is consistent with the notion that genetic variants in pre-miRNA regions are scarce and unlikely to be functionally relevant, possibly due to the constraint imposed by natural selection on the evolutionarily conserved pre-miRNA sequences ( 27, 28). In contrast, Duan et al. ( 29) identified 323 SNPs in 227 human miRNAs, among which 12 were localized within precursor regions. One SNP in the seed region of mature miR-125a was found to be essential to the accurate recognition of target mRNA sequences. In vivo functional characterization further revealed that this SNP significantly blocked the maturation of miR-125a ( 29). Additional evidence of the physiologic relevance of miRNA polymorphisms came from the study of Gottwein et al. ( 30), who found that a miRNA precursor SNP influences Drosha processing. Taken together, these findings suggested that genetic variations affect the production or function of the host miRNAs. However, to date, there have not been any studies investigating the relevance of the polymorphisms in the miRNA biogenesis pathway genes to cancer risk.
In our study, the GEMIN3 and GEMIN4 genes were found to be associated with bladder cancer risk. Both proteins are core components of a large macromolecular complex that interacts with the survival of motor neuron protein and plays an essential role in pre-mRNA splicing and ribonucleoprotein assembly (37, 38). Mourelatos et al. (39) found that the GEMIN3 and GEMIN4 proteins are also present in a 15S ribonucleoprotein complex containing eIF2C, a member of the AGO protein family pivotal to miRNA processing. The additional identification of numerous miRNAs in this complex (39), concordant with several other independent observations (40–42), strongly suggests the involvement of GEMIN proteins in the processing of miRNA precursors through their interaction with key components of the RNA-induced silencing complex (42, 43). No functional effect of GEMIN3 rs197414 has been reported. However, Wan et al. (44) found that the variant allele of GEMIN4 rs7813 was associated with a significant in vitro growth-inhibitory effect in hepatocellular carcinoma cell lines as compared with the wild-type allele, suggesting that the amino acid change caused by this SNP might have physiologic significance. Whether this SNP produces the altered risk through a similar mechanism in bladder cancer needs to be investigated in an in vivo setting.
Besides the significant SNP/haplotype identified for the GEMIN genes, borderline significant associations with bladder cancer risk were also identified for SNPs in several other genes, including TRBP, mir423, mir492, mir26a-1, and mir124-1 (Table 3). In particular, the variant allele of rs784567, which is located in the 5′ UTR of the TRBP gene, was associated with a 20% risk reduction (P = 0.07). TRBP is an integral component of a DICER-containing complex important to the cytoplasmic processing of pre-miRNAs into mature miRNAs (23). TRBP recruits the DICER complex to the Argonaute 2 protein, which is the catalytic unit of the RNA-induced silencing complex. Depletion of TRBP was found to result in significantly impaired miRNA biogenesis (23). mir423 and mir492 have been found to possess a high-frequency SNP in the pre-miRNA region. We identified fewer than 10 miRNA genes with polymorphisms in the pre-miRNA region, consistent with previous findings indicating the elimination of genetic variants by natural selection pressure in this area (27–29). mir26a-1 is localized to chromosome 3p21, a region frequently deleted in several epithelial cancers (28, 45). Loss of expression of mir124-1 as a result of CpG island hypermethylation has also been observed in multiple cancer cell lines and is associated with the dysregulation of the oncogene CDK6 and the tumor suppressor gene RB (46). In addition, the expression levels of both mir26a-1 and mir124-1 are down-regulated in lung cancer (19). To further explore the implications of our identified miRNAs in bladder tumorigenesis, we used the TargetScan program (47) to identify a list of candidate transcripts targeted by each of these miRNAs. The number of transcripts ranges from 11 (mir423) to 773 (mir124-1; Supplementary Table S2). By using a normal bladder tissue microarray database (accession no. GSM44682) in the National Center for Biotechnology Information Gene Expression Omnibus (48), we found that, for each miRNA, ∼50% of the targeted transcripts are expressed in normal bladder tissues (range, 41–55%). Moreover, 43% to 55% of these expressed transcripts exhibited differential expression patterns between normal and tumor bladder tissues (ref. 49; Supplementary Table S2). Some of these target genes have been implicated in bladder tumorigenesis. For instance, the PTEN gene is identified by TargetScan as a putative target of mir26a-1. Mutations of PTEN have been reported in a wide variety of tumors including bladder cancer (50). If the expressions of these miRNAs in bladder tissues are recognized and the associations between these SNPs and bladder cancer risk are validated, the next key question would be which gene(s) are the targets of these miRNAs in the development of bladder cancer.
There is a possibility that the associations observed in this study were attained by chance, given the loss of robustness after multiple comparison adjustments (data not shown). Therefore, it is likely that the risk-conferring effect of any single polymorphism is only minimal. To more powerfully elucidate the influences of miRNA polymorphisms on bladder cancer risk, we used a pathway-based polygenic strategy in which we collapsed all the unfavorable genotypes with at least a borderline significant association to assess their combined effects on tumorigenesis. Through such an approach, we identified a dose-dependent trend toward an increasing bladder cancer risk with an increasing number of unfavorable genotypes. This finding reinforces the notion that bladder cancer is a polygenic process and thus a combined analysis of multiple factors may have a greater ability to characterize high-risk populations.
The strengths of our study include our use of a large and homogeneous population with strict matching criteria to eliminate the potential confounding effects of age and gender. We have also constructed a comprehensive catalogue of potentially functional SNPs in most currently known miRNA biogenesis genes. This list can be readily used by independent researchers for replication studies of different cancer sites. Nevertheless, it is likely that some associations we presented here are chance findings. Further epidemiologic and functional studies are needed to validate these results.
Overall, we present the first epidemiologic evidence supporting the involvement of genetic polymorphisms in miRNA genes and miRNA biogenesis pathway genes in cancer development. Our results suggest that individual as well as combined genotypes of miRNA-related variants may be used to predict the risk of bladder cancer.
Grant support: National Cancer Institute grants CA 74880 and CA 91846.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
- Received October 25, 2007.
- Revision received January 4, 2008.
- Accepted January 10, 2008.
- ©2008 American Association for Cancer Research.