A two-stage genome-wide association study (GWAS) of the Cancer Genetic Markers of Susceptibility (CGEMS) initiative identified single nucleotide polymorphisms (SNP) in 150 regions across the genome that may be associated with prostate cancer (PCa) risk. We filtered these results to identify 43 independent SNPs where the frequency of the risk allele was consistently higher in cases than in controls in each of the five CGEMS study populations. Genotype information for 22 of these 43 SNPs was obtained either directly by genotyping or indirectly by imputation in our PCa GWAS of 500 cases and 500 controls selected from a population-based case-control study in Sweden [Cancer of the Prostate in Sweden (CAPS)]. Two of these 22 SNPs were significantly associated with PCa risk (P < 0.05). We then genotyped these two SNPs in the remaining cases (n = 2,393) and controls (n = 1,222) from CAPS and found that rs887391 at 19q13 was highly associated with PCa risk (P = 9.4 × 10⁻⁴). A similar trend of association was found for this SNP in a case-control study from Johns Hopkins Hospital (JHH), although the result was not statistically significant. Altogether, the frequency of the risk allele of rs887391 was consistently higher in cases than controls in each of the seven study populations examined, with an overall P = 3.2 × 10⁻⁷ from a combined allelic test. A fine-mapping study in a 110-kb region at 19q13 among CAPS and JHH study populations revealed that rs887391 was the most strongly associated SNP in the region. Additional confirmation studies of this region are warranted. [Cancer Res 2009;69(7):2720–3]
- prostate cancer
Genome-wide association studies (GWAS) have proven effective for identifying genetic variants associated with disease risk without any presumption about their location or function. More than a dozen prostate cancer (PCa) risk-associated variants have been identified from GWAS and consistently replicated in multiple independent study populations (1–10). These newly discovered PCa risk-associated variants may provide novel insight into disease etiology. It is anticipated that results from GWAS will lead to better prediction of PCa risk for early detection and better understanding of the molecular mechanisms of this disease.
Using a two-stage GWAS design with a total of 5,113 PCa patients and 5,121 control subjects from five study populations, the Cancer Genetic Markers of Susceptibility (CGEMS) study identified 150 distinct regions that were potentially associated with PCa risk (P < 10⁻³; ref. 8). Among these 150 regions, five reached genome-wide significance (P < 10⁻⁸), including two at 8q24 and one each at 17q12, 10q11, and 11q13. The associations at these five regions have been reported in other GWAS (1–5, 9). Two additional regions did not reach genome-wide significance but were highly significant, including 10q26 (P = 10⁻⁷) and 7p15 (P = 10⁻⁶). For these seven regions, risk alleles of single nucleotide polymorphisms (SNP) were consistently more common in cases than controls among all five study populations. In the current study, we examined SNPs in the remaining 143 regions and found that 43 SNPs had this same consistency. We then sequentially examined these 43 SNPs in two additional study populations from Sweden and the United States.
Materials and Methods
Study subjects. The Cancer of the Prostate in Sweden (CAPS) study, which includes 2,899 cases and 1,722 controls, has been described in detail (11). Case subjects were classified as having aggressive disease if they met any of the following criteria: T3/4, N+, M+, Gleason score sum ≥8, or prostate-specific antigen (PSA) >50 ng/mL; otherwise, they were classified as having nonaggressive disease (Supplementary Table S1A). We selected 500 aggressive PCa cases and 500 controls matching the age distribution of cases for a GWAS (6). The sample size for the GWAS was determined based on available funds and statistical power; we had 80% power at a genome-wide significance level (P < 2.5 × 10⁻⁸) to detect a risk allele with odds ratio (OR) ≥1.9 and minor allele frequency (MAF) ≥0.2. No evidence for potential population stratification in the GWAS samples was observed using the D statistic of the Kolmogorov-Smirnov test (6). The study received institutional approval at the Karolinska Institutet and Umeå University.
The Johns Hopkins Hospital (JHH) study population, which includes 1,527 cases and 482 controls of European descent (by self-report), was described in detail elsewhere (12–14). Tumors with a Gleason score of 7 or higher or stage pT3 or higher or N+ or M1 (i.e., either high-grade or non–organ-confined disease) were defined as more aggressive (Supplementary Table S1B). The study received institutional approval.
We also used the published data from the National Cancer Institute CGEMS study (4, 8). Summary genotype information from the five study populations was included in this study. The five study populations are the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial; the American Cancer Society Cancer Prevention Study II (CPS-II); the Health Professionals Follow-up Study (HPFS); the CeRePP French Prostate Case-Control Study (FPCC); and the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC).
Genotyping. Methods for the GWAS in 500 cases and 500 controls were described in detail elsewhere (6). The average genotyping call rate (i.e., the number of SNPs called by the BRLMM algorithm divided by the total number of SNPs) was 99.1%. Genotype concordance for the duplicated samples was >99%. We found that 260,852 SNPs (53.23%) met the quality control criteria of MAF ≥0.01, Hardy-Weinberg equilibrium (HWE) P > 10⁻⁴ in controls, and genotyping call rate >95% in cases and controls. These SNPs were selected for further analysis and imputation.
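The quality control filter described above can be sketched as follows. This is a minimal illustration, assuming that per-SNP summary statistics have already been computed; the `SnpQC` record, its field names, and the example values are hypothetical and are not taken from the study pipeline. The HWE threshold is read here as a P value threshold, which is an interpretation of the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQC:
    """Hypothetical per-SNP summary record (not from the study pipeline)."""
    snp_id: str
    maf: float                 # minor allele frequency
    hwe_p_controls: float      # HWE test P value among controls
    call_rate_cases: float     # fraction of cases with a called genotype
    call_rate_controls: float  # fraction of controls with a called genotype


def passes_qc(snp: SnpQC) -> bool:
    """Apply the three thresholds stated in the text."""
    return (
        snp.maf >= 0.01
        and snp.hwe_p_controls > 1e-4
        and snp.call_rate_cases > 0.95
        and snp.call_rate_controls > 0.95
    )


# Made-up example values, one SNP failing each criterion in turn
example_snps = [
    SnpQC("snpA", 0.24, 0.31, 0.991, 0.989),     # passes all criteria
    SnpQC("snpB", 0.005, 0.52, 0.995, 0.996),    # fails MAF
    SnpQC("snpC", 0.18, 0.00002, 0.990, 0.992),  # fails HWE
    SnpQC("snpD", 0.33, 0.47, 0.940, 0.980),     # fails call rate in cases
]
kept = [s.snp_id for s in example_snps if passes_qc(s)]
```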
For confirmation and fine-mapping studies, SNPs were genotyped using iPLEX (Sequenom, Inc.). The primer information is available on request. The rate of concordant results between 100 duplicate samples was >99%.
Statistical methods. Tests for HWE were performed for each SNP separately among case patients and control subjects using Fisher's exact test. Haplotype blocks were estimated using the Haploview program (15), with the default Gabriel method (16) used to define each haplotype block.
We imputed all of the known SNPs in the 110-kb region of interest at 19q13 based on the genotyped SNPs and haplotype information in the HapMap phase II data (CEU) using the IMPUTE program (17). A posterior probability of 0.9 was used as a threshold to call genotypes. Imputed SNPs (n = 32) that had a call rate >90% in both CAPS and JHH were included in the subsequent analysis.
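The genotype-calling rule for imputed SNPs can be illustrated with a short sketch. It assumes that each subject has three posterior genotype probabilities per SNP and that a genotype is called when the largest probability is at least 0.9; whether the threshold was inclusive is not stated in the text. The function names and example values are hypothetical.

```python
def call_genotype(probs, threshold=0.9):
    """Return the genotype index (0, 1, or 2) with the highest posterior
    probability, or None if that probability is below the threshold."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def call_rate(calls):
    """Fraction of subjects with a called (non-missing) genotype."""
    return sum(c is not None for c in calls) / len(calls)


# Made-up posterior probabilities for four subjects at one SNP
posteriors = [
    (0.97, 0.02, 0.01),
    (0.05, 0.93, 0.02),
    (0.50, 0.45, 0.05),  # ambiguous, left uncalled
    (0.01, 0.04, 0.95),
]
calls = [call_genotype(p) for p in posteriors]
rate = call_rate(calls)
# The SNP would be retained only if the call rate exceeds 0.90 in both studies
keep_snp = rate > 0.90
```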
Allele frequency differences between case patients and control subjects were tested for each SNP using a χ² test with 1 degree of freedom. Allelic OR and 95% confidence interval (95% CI) were estimated based on a multiplicative model.
Associations of SNP rs887391 with aggressiveness of PCa (advanced or localized), Gleason score (≤6, 7, or ≥8), and family history (yes or no) were tested only among case subjects with the use of a χ² test of a 3×K table with 2×(K − 1) degrees of freedom, in which K is the number of possible categories within each variable. Serum PSA level was log transformed to better satisfy the distributional assumptions of the analysis. A test for trend was used to assess the association between log PSA level and the number of risk alleles carried (0, 1, or 2) using a linear regression model. Association of SNP rs887391 with the mean age at diagnosis was tested only among case subjects with the use of analysis of variance (ANOVA).
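A minimal sketch of such an allelic test is given below. It treats each subject as contributing two alleles to a 2×2 table of allele counts, computes the Pearson χ² statistic without continuity correction, and derives the allelic OR. The Wald interval on the log OR is an assumption; the text states only that a multiplicative model was used. The function name and the example counts are hypothetical.

```python
import math


def allelic_test(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic chi-square test (1 df) and allelic OR with a Wald 95% CI."""
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, no continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); the Wald interval is an assumption
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return chi2, p_value, odds_ratio, (lower, upper)


# Made-up allele counts (500 cases and 500 controls, two alleles each)
chi2, p_value, odds_ratio, ci = allelic_test(760, 240, 730, 270)
```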
Results and Discussion
We filtered the SNPs in 150 regions identified from the two-stage GWAS of the CGEMS study using the criterion that the direction of association be consistent among all five CGEMS study populations (i.e., the frequency of the risk allele was higher in cases than in controls in each of these study populations; Supplementary Fig. S1; Supplementary Table S2). This resulted in the identification of 43 SNPs for further study. As a first-stage confirmation, we cross-checked these 43 SNPs in an independent GWAS performed in the CAPS population. High-quality genotyping data were available for 22 of these 43 SNPs (Supplementary Table S2, top), including two SNPs that were directly genotyped in the Affymetrix 500K SNP arrays and 20 SNPs that could be successfully imputed, with a missing call rate <10%. Two imputed SNPs from two distinct regions were significantly associated with PCa risk using a χ² test: rs887391 at 19q13 (nominal P = 0.03) and rs6922172 at 6p12 (nominal P = 0.04). The direction of association for both SNPs was consistent with that of the five CGEMS study populations.
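The direction-consistency filter amounts to a simple check per SNP, sketched here under the assumption that the risk-allele frequency is available for cases and controls in each population. The function name is hypothetical. The first example uses the rounded rs887391 frequencies reported in this article for CAPS and JHH; the second example is made up.

```python
def consistent_direction(freqs):
    """freqs: list of (case_freq, control_freq) pairs for the risk allele,
    one pair per study population. True if cases exceed controls in all."""
    return all(case > ctrl for case, ctrl in freqs)


# Rounded rs887391 risk-allele frequencies reported for CAPS and JHH
rs887391_reported = [(0.76, 0.73), (0.79, 0.78)]
# Made-up SNP whose direction flips in the second population
hypothetical_snp = [(0.41, 0.38), (0.36, 0.39), (0.40, 0.37)]

keep_first = consistent_direction(rs887391_reported)
keep_second = consistent_direction(hypothetical_snp)
```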
As a second-stage confirmation, we genotyped these two SNPs in the remaining CAPS study subjects, including 2,393 PCa patients and 1,222 control subjects. SNP rs6922172 at 6p12 was not significant (P = 0.23). However, a highly significant association was found for the SNP rs887391 at 19q13. The frequency of risk allele “T” was significantly higher in cases (0.76) than in controls (0.73; P = 9.4 × 10⁻⁴). As a third-stage confirmation, we genotyped this SNP in 1,527 PCa patients and 482 control subjects of European descent from JHH. The risk allele T was more common in cases (0.79) than controls (0.78), although the difference was not significant (P = 0.43). Combining all the available data from the CAPS, JHH, and five populations of the CGEMS study using a Mantel-Haenszel method, the overall P value of the allelic test was 3.2 × 10⁻⁷ for the SNP (Table 1). This P value approached but did not reach the genome-wide significance threshold of 9.5 × 10⁻⁸, which corresponds to a 5% type I error rate across all tested SNPs in the genome. The OR for allele T was estimated to be 1.15, with a 95% CI of 1.09 to 1.21. Notably, although the risk allele frequency was consistently higher in cases than controls in all examined populations, the difference was not significant in several individual study populations, likely due to limited statistical power to detect risk SNPs with moderate effect in a small study. Our study illustrates the advantage of combining information from several small studies to detect such risk SNPs.
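To illustrate how a Mantel-Haenszel method combines allelic 2×2 tables across study populations, the following sketch computes the pooled OR and the Cochran-Mantel-Haenszel χ² statistic (1 degree of freedom). It is a generic textbook formulation, not the authors' code; the article does not say whether a continuity correction was applied, and none is used here. The function name and the allele counts are hypothetical.

```python
import math


def mantel_haenszel(strata):
    """strata: list of (case_risk, case_other, ctrl_risk, ctrl_other)
    allele counts, one tuple per study population."""
    num = den = 0.0        # numerator and denominator of the pooled OR
    obs = exp = var = 0.0  # components of the CMH chi-square statistic
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
        obs += a
        exp += (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    pooled_or = num / den
    chi2 = (obs - exp) ** 2 / var  # no continuity correction (assumption)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return pooled_or, chi2, p_value


# Made-up allele counts for two study populations
strata = [(760, 240, 730, 270), (790, 210, 780, 220)]
pooled_or, chi2, p_value = mantel_haenszel(strata)
```

With a single stratum the pooled OR reduces to the ordinary cross-product ratio, which is a convenient sanity check on the implementation.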
We next performed a fine-mapping study in CAPS and JHH to assess associations of other SNPs at 19q13 with PCa risk. A 110-kb region (46,630,000–46,740,000 bp) was identified based on the CAPS GWAS, where SNPs with P < 0.05 were aggregated. We selected 14 tagging SNPs to cover the fine-mapping region based on HapMap phase II data. These SNPs were genotyped among all CAPS and JHH study subjects. We also imputed 32 SNPs based on the HapMap phase II data (CEU; ref. 17). Allele frequency differences between cases and controls were tested using a χ² test for these 46 SNPs in CAPS and JHH (Supplementary Table S3). A combined test was performed for each SNP using a Mantel-Haenszel method (Fig. 1A). SNP rs887391 was the strongest PCa risk-associated SNP in this region. SNPs associated with PCa risk at P < 0.01 spanned ∼62 kb, from 46,677,427 to 46,739,764, and were located in four haplotype blocks (Fig. 1B; ref. 16). A spliced transcript (DA869846) found in multiple cDNA libraries prepared from various tissues, including the prostate, is within the region (18).
SNP rs887391 is ∼10 Mb centromeric to the PSA gene (KLK3), where a SNP near the 3′ end (rs2735839) was reportedly associated with PCa risk (9). However, because the SNP rs2735839 was significantly associated with higher PSA levels in subjects without PCa (9), there was concern that the PCa association was confounded by PSA screening (19). Therefore, we tested the association of rs887391 with plasma PSA levels among 1,722 control subjects in CAPS. The mean PSA levels were 1.48, 1.55, and 1.57 ng/mL for men who had 0, 1, or 2 copies of the T allele, respectively. The difference was not statistically significant assuming an additive model (P = 0.6). The PCa association for rs887391 at 19q13 observed in this study is therefore unlikely to be confounded by PSA screening.
We also tested the association of rs887391 with disease aggressiveness, Gleason score, family history, PSA at diagnosis, and age at diagnosis. No significant association was found (Supplementary Table S4). This finding is similar to most PCa risk variants identified from GWAS, where no association with clinical characteristics was found, including the SNPs at 8q24, 17q12, 17q24, 10q11, and 11q13. This observation, however, is not surprising because these SNPs were identified by comparing all PCa cases with controls. Study designs such as case-case studies may be needed to identify associations with aggressive PCa.
In summary, this three-stage confirmation study in CAPS and JHH identified a novel locus at 19q13 that is potentially associated with PCa risk. Because the statistical evidence did not reach genome-wide significance level, it could represent a chance finding and should be considered as suggestive. It is also important to note that our study is underpowered to evaluate many of the 43 regions implicated in the CGEMS study because of the small sample size of our CAPS GWAS, the limited number of SNPs that we were able to examine, and the reliance on imputed SNPs for most of the SNPs examined. Additional studies are needed to further confirm the candidate regions discovered by the CGEMS study.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: National Cancer Institute grants CA129684, CA105055, CA106523, and CA95052 (J. Xu) and CA112517 and CA58236 (W.B. Isaacs), and Swedish Cancer Society and Swedish Academy of Sciences (H. Grönberg).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the multiple physicians and researchers for their contribution in designing and recruiting study subjects, including Dr. Hans-Olov Adami (for CAPS) and Drs. Bruce J. Trock and Alan W. Partin (for JHH), and the CGEMS for making the data available publicly. The support of P. Kevin Jaffe to W.B. Isaacs is gratefully acknowledged.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
- Received August 27, 2008.
- Revision received January 12, 2009.
- Accepted February 5, 2009.
- ©2009 American Association for Cancer Research.