Inflammation has been implicated as an etiological factor in several human cancers. Growing evidence suggests that chronic inflammation may also play a role in the etiology of prostate cancer. Considering that genetic susceptibility is a major risk factor for this disease, we hypothesize that sequence variants in genes that regulate inflammation may modify individual susceptibility to prostate cancer. The lipopolysaccharide receptor Toll-like receptor 4 (TLR4) is a central player in the signaling pathways of the innate immune response to infection by Gram-negative bacteria and is an important candidate inflammatory gene. We performed a systematic genetic analysis of TLR4 sequence variants by evaluating eight single-nucleotide polymorphisms that span the entire gene among 1383 newly diagnosed prostate cancer patients and 780 age- and residence-matched controls in Sweden. We found an association between a sequence variant (11381G/C) in the 3′-untranslated region of the TLR4 gene and prostate cancer risk. The frequency of the variant genotypes (CG or CC) was significantly higher in the patients (24.1%) than in the controls (19.7%; P = 0.02). The frequency of risk genotypes among patients diagnosed before the age of 65 years was even higher (26.3%). Compared with men who had the wild-type genotype of this single-nucleotide polymorphism (GG), those with GC or CC genotypes had a 26% increased risk for prostate cancer (odds ratio, 1.26; 95% confidence interval, 1.01–1.57) and a 39% increased risk for early-onset prostate cancer (before age 65 years; odds ratio, 1.39; 95% confidence interval, 1.02–1.91). The risk attributable to this variant for prostate cancer in Sweden was estimated to be 4.9%. Although the biological mechanism of the observed association remains to be elucidated, our finding supports a role for a bacteria-associated response pathway, possibly acting via inflammation, in the development of prostate cancer.
Chronic or recurrent inflammation is known to play a causative role in the development of many human cancers, including cancers of the liver, esophagus, stomach, large intestine, and urinary bladder (1). Inflammatory changes have long been recognized in prostate tissues, leading to speculation that chronic inflammation might contribute to prostate cancer development (2). The role of prostate inflammation in prostate cancer was strongly implicated in the recently proposed theory that proliferative inflammatory atrophy serves as a precursor to prostatic intraepithelial neoplasia and to prostate cancer (3, 4). The identification of two candidate prostate cancer susceptibility genes (RNASEL and MSR1) that encode proteins with critical functions in host responses to infections provides additional support for a role of inflammation in prostate cancer development (5, 6).
Chronic infection and inflammatory processes that may lead to tumorigenesis are mediated in part through recognition of various stimuli by Toll-like receptors (TLRs; Ref. 7). Among these TLRs, TLR4 recognizes Gram-negative bacterial products, including lipopolysaccharide (8), the antitumor compound taxol in the mouse but not in humans (9), and human heat shock protein 60 (10). Differential activation of TLR4 by this array of naturally occurring or synthetic ligands subsequently induces distinct downstream processes, including the expression of inflammatory genes as well as regulation of cell growth/apoptosis. Conceivably, improper regulation or compromised function of TLR4 may contribute to various inflammatory diseases, including cancer.
On the basis of the potential importance of inflammation and inflammatory genes in prostate cancer development, we hypothesized that sequence variants of TLR4 are associated with prostate cancer susceptibility. To test this hypothesis, we performed a systematic genetic analysis in a large population-based prostate cancer case–control study in Sweden.
MATERIALS AND METHODS
The cases studied came from the large-scale, population-based case–control study CAncer Prostate in Sweden (CAPS). The case participants were recruited from four of the six regional cancer registries that cover the entire population of Sweden. Each of these four registries serves one health care region (Northern, Central, Stockholm, and South Eastern); the four registries altogether encompass ∼6 million inhabitants (67% of Sweden’s population). Reporting of newly diagnosed cancer cases to the registries is required by law for both the attending physician and pathologist; therefore, the registries include almost 100% of all cancers diagnosed in Sweden.
In the CAPS study, the source-person-time comprises men living in the area of Örebro and the northern part of Sweden (Västernorrland, Jämtland, Västerbotten, and Norrbotten) from January 1, 2001, as well as men living in the areas of Västmanland, Södermanland, Gävleborg, Dalarna, Värmland, and Uppland from July 1, 2001, until September 2002 (except for Jämtland and the county of Lycksele in Västerbotten, where the source-person-time ended March 1, 2002). The source-person-time was divided into two age-specific study bases. The first study base included men 35–65 years of age living in all of the regions mentioned above. The second study base included men 66–79 years of age living only in the areas of Örebro, Västmanland, Södermanland, and the northern part of Sweden.
The inclusion criterion for cases in CAPS was pathologically or cytologically verified adenocarcinoma of the prostate (International Classification of Disease for Oncology = C61). After receiving notification of a new case, the administrator at the regional cancer registry mailed a letter to the treating physician informing him or her about the study. The physician was asked to indicate whether the patient was able to participate in the study. If so, the physician mailed a letter to the patient to introduce the study and asked him to send a reply letter to the administrator at the cancer registry. After approval from both the physician and the patient, the study secretaries sent a questionnaire and a kit with tubes for blood sampling to the eligible case. The self-administered questionnaire included questions concerning such items as diet (validated food questionnaire), family history, smoking, and physical activity.
In total, 1961 prostate cancer cases were invited to participate; of these, 1444 (73.6%) agreed to participate by donating a blood sample and/or answering the questionnaire. DNA was available for 1383 (95.8%) of the cases that participated. Clinical data that were not included in the Cancer Registry were obtained from the National Prostate Cancer Registry. 6 The cases were linked to the National Prostate Cancer Registry, and clinical information such as tumor-node-metastasis (TNM) stage, Gleason sum, prostate-specific antigen (PSA) level at the time of diagnosis, means of diagnosis, and primary treatment was obtained for 95.3% of the cases. The cases were thereafter classified as either localized (T1–2 and N0/NX and M0/MX; grade I–II; Gleason sum, 2–7; and PSA <100 ng/ml) or advanced (prone to progressive disease; T3/4 or N+ or M+, or grade III, or Gleason sum of 8–10, or PSA >100 ng/ml).
Control subjects were randomly selected from the updated Swedish Population Registry based on frequency matching to the expected age distribution (within 5 years) and geographic origin of the cases. After the controls were identified, a letter of introduction to the study was mailed to each control. Three to 4 weeks later, the same questionnaire and blood sampling kit that were used for the cases were mailed to the controls. Of the 1697 randomly selected controls invited, 866 (52.0%) agreed to participate by donating a blood sample and/or completing a questionnaire. DNA was available for 780 (90.9%) of the controls that participated. Eight potential control subjects were excluded after linkage to the National Cancer Registry revealed that they had a diagnosis of prostate cancer before inclusion.
To improve the response rate, cases and controls were recontacted three times: after 1–2 weeks with a follow-up letter, after 6–8 weeks with a new questionnaire and blood draw kit, and after ∼12 weeks with a phone call. The clinical characteristics of the study subjects are presented in Table 1. The mean ages (age at diagnosis for cases and age at inclusion for controls) of the cases and controls were 66.60 and 67.90 years, respectively.
This study was approved by the Ethical Committees at the Karolinska Institutet and at Umeå University. Written informed consent was obtained from each subject.
All participants in this study were instructed to donate blood (4 × 10 ml) at the nearest health clinic or hospital. Samples were thereafter kept at room temperature and sent by overnight mail to the Medical Biobank at Umeå University. After arrival at the Biobank, leukocytes, erythrocytes, plasma, and serum were separated into different tubes. Samples were stored at −70°C until time for analysis. DNA samples were extracted from whole blood by standard methods and were shipped from Umeå, Sweden to the genotyping laboratory in the Center for Human Genomics at Wake Forest University. Each DNA plate contained 2 Centre d’Etude du Polymorphism Humain controls, a water blank, and blinded internal replicates. Researchers at Wake Forest University were blinded to case status. Genotyping was performed with the MassARRAY system (SEQUENOM). SpectraDesign software was used to generate the primers. Primer sequences and PCR conditions for these sequence variants are available at the authors’ website. 7
Selection of TLR4 single-nucleotide polymorphisms (SNPs).
The genomic structure of the TLR4 gene has not been completely elucidated. The gene is ∼11.5 kb and is composed of four exons. There are four known TLR4 transcript isoforms (A, B, C, and D) that result from alternative splicing of the four exons and different translation start sites. The poly(A) tail has been annotated only for isoform D. Our goal in this study was to evaluate common haplotypes of TLR4 sequence variants with use of a limited number of SNPs. To achieve this goal, we first selected a subset of reported SNPs from the National Center for Biotechnology Information (NCBI) dbSNP and IIPGA databases, using the criteria of minor allele frequencies ≥5% and a density of 1 SNP/kb across the TLR4 targeted genomic region, including 2 kb of the promoter, all exons, introns, and the predicted 3′-untranslated region (UTR). We also selected a subset of functional and coding SNPs regardless of the frequency and density. A total of 18 SNPs were selected (Fig. 1), including 4 reported nonsynonymous changes (D299G, V310G, E474K, and Q510H, defined based on isoform A). The previously described nonsynonymous change T399I was not selected because it has been reported to be in strong linkage disequilibrium (LD) with the nonsynonymous change D299G (11). We then genotyped these SNPs in a subset of 96 control subjects. Nine of these SNPs, including E474K and Q510H, were monomorphic in these 96 subjects. The nonsynonymous change V310G was observed only once. The remaining eight SNPs were observed multiple times and thus were selected for genotyping among all case and control subjects whose DNA samples were available at the time of this study.
TLR4 RNA Analyses.
The 3′-UTR boundary of TLR4 mRNA was determined by 3′ rapid amplification of cDNA ends (RACE) on a prostate poly(A)+ RNA template. We used the SMART RACE cDNA amplification kit (Clontech) according to the manufacturer’s recommendations. The gene-specific primer used in the RACE reaction was 5′-ggatccctcccctgtacccttctcactgccaggag-3′.
Hardy–Weinberg equilibrium tests for each of the eight sequence variants and pair-wise LD tests for all pairs of the eight sequence variants were performed with the Fisher probability test statistic, as described by Weir (12) . For each test, 10,000 permutations were performed, and the test statistic of each replicate was calculated. Empirical Ps for each test were estimated as the proportion of replicates found to be as probable as or less probable than the observed data, as implemented in the software package Genetic Data Analysis. The estimates of pair-wise LD (D′) were calculated with the SAS/Genetics computer software package.
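The permutation logic described above can be sketched for a single biallelic SNP as follows. This is a minimal illustration, not the Genetic Data Analysis implementation: the function names, the random seed, and the genotype counts are hypothetical, and the sketch assumes that each replicate is formed by randomly re-pairing the observed alleles into genotypes and scoring it by its exact probability under Hardy–Weinberg equilibrium.

```python
import math
import random


def log_prob_genotypes(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Log probability of the genotype counts under Hardy-Weinberg
    equilibrium, conditional on the observed allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab

    def log_fact(k: int) -> float:
        return math.lgamma(k + 1)

    return (log_fact(n) + log_fact(n_a) + log_fact(n_b)
            + n_ab * math.log(2)
            - log_fact(n_aa) - log_fact(n_ab) - log_fact(n_bb)
            - log_fact(2 * n))


def hwe_permutation_p(n_aa: int, n_ab: int, n_bb: int,
                      n_perm: int = 10_000, seed: int = 0) -> float:
    """Empirical P: proportion of permuted samples that are as probable as
    or less probable than the observed genotype counts."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab
    alleles = ["A"] * n_a + ["B"] * n_b
    observed = log_prob_genotypes(n_aa, n_ab, n_bb)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)
        # Pair consecutive alleles to form one genotype per individual.
        het = sum(1 for i in range(0, len(alleles), 2)
                  if alleles[i] != alleles[i + 1])
        perm_aa = (n_a - het) // 2
        perm_bb = (n_b - het) // 2
        # Small tolerance so that ties count as "as probable as".
        if log_prob_genotypes(perm_aa, het, perm_bb) <= observed + 1e-9:
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    # Hypothetical genotype counts, not data from this study.
    print(hwe_permutation_p(25, 50, 25, n_perm=2000))
    print(hwe_permutation_p(50, 0, 50, n_perm=2000))
```

A small empirical P indicates that the observed genotype counts are unusually improbable given the allele counts, that is, a deviation from Hardy–Weinberg equilibrium.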
Allele frequency differences between the two groups were tested for each SNP by the χ2 test with 1 degree of freedom (df). Genotype frequency differences were also tested by the χ2 test with 2 degrees of freedom. Both tests were performed with the SAS/Genetics computer program. Odds ratios (ORs) of prostate cancer for the variant-allele carriers (homozygous and heterozygous) versus homozygous wild-type allele carriers were estimated by unconditional logistic regression and adjusted for age and geographic regions. Attributable risk was estimated using the formula: 100% × p(OR − 1)/[p(OR − 1) + 1], where p is the prevalence of risk genotypes in a population (13).
Haplotype frequency was estimated with the new statistical method proposed by Stephens et al. (14), as implemented in the computer program PHASE. 8 When several runs were performed with different values for the seed of the random number generator, the goodness-of-fit values were similar among the different runs. Association between the haplotypes and prostate cancer risk was tested with a score test developed by Schaid et al. (15), as implemented in the computer program HAPLO.SCORE. 9
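As a worked example of the attributable risk formula, the short sketch below evaluates it with the values reported in this article for SNP 11381G/C (risk-genotype prevalence of 19.7% and adjusted OR of 1.26). Taking the frequency among controls as the population prevalence p is an assumption made for illustration, and the function name is hypothetical.

```python
def attributable_risk_percent(p: float, odds_ratio: float) -> float:
    """Attributable risk (%) = 100 * p(OR - 1) / [p(OR - 1) + 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    excess = p * (odds_ratio - 1.0)
    return 100.0 * excess / (excess + 1.0)


if __name__ == "__main__":
    # p = 0.197: risk-genotype frequency among controls (assumed to
    # approximate the population prevalence); OR = 1.26 as reported.
    ar = attributable_risk_percent(p=0.197, odds_ratio=1.26)
    print(f"{ar:.1f}%")  # 4.9%
```

Under these inputs the formula reproduces the 4.9% estimate given in the article.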
Eight SNPs, including one nonsynonymous change (D299G), were genotyped in all study subjects, including 1383 cases and 780 controls. Because the SNP 7449G/T (rs2149356) significantly deviated from Hardy–Weinberg equilibrium in both cases (P < 0.0001) and controls (P = 0.0006), it was removed from further statistical analysis. The remaining seven SNPs were in Hardy–Weinberg equilibrium among cases and controls, respectively (all P > 0.05). These SNPs were in strong LD: most of the pair-wise D′ estimates were 1.0 with the lowest one being 0.63.
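For readers unfamiliar with the D′ statistic, the following sketch shows how it is conventionally computed for a pair of biallelic SNPs from haplotype and allele frequencies. It is an illustration only: the estimates in this study were obtained with SAS/Genetics, and the function name and the frequencies below are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at the first locus
          and allele B at the second locus
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic
    return abs(d) / d_max


if __name__ == "__main__":
    # Hypothetical frequencies: the rarer allele A is always found on
    # haplotypes that also carry allele B, which gives complete LD.
    print(d_prime(p_ab=0.10, p_a=0.10, p_b=0.30))
    # Hypothetical frequencies under independence (no LD).
    print(d_prime(p_ab=0.25, p_a=0.50, p_b=0.50))
```

A D′ of 1.0 means that at most three of the four possible two-SNP haplotypes are present, which is consistent with the strong LD reported for these SNPs.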
Testing for allele frequency differences between cases and controls revealed one SNP (11381G/C) with a marginally significant difference between cases and controls (Table 2). The frequency of the C allele for the SNP 11381G/C was 0.13 and 0.11 in cases and controls, respectively (P = 0.05). The genotype frequencies of CC or CG for this SNP were 24.1% and 19.7% in cases and controls, respectively. The difference in genotype frequencies between cases and controls was larger (χ2 = 7.77; df, 2; P = 0.02). In logistic regression analyses (Table 3), men who had the genotypes CC or CG had a 26% increased risk (OR = 1.26; 95% confidence interval, 1.01–1.57) for prostate cancer compared with men with the genotype GG for this SNP (adjusted for age and geographic regions). Furthermore, men who had the genotypes CC or CG were at a 39% increased risk (OR = 1.39; 95% confidence interval, 1.02–1.91) for an early diagnosis of prostate cancer. The high-risk genotypes of SNP 11381G/C were present in 19.7% of control individuals, 24.1% of all cases, and 26.3% of cases diagnosed at age <65 years. The overall proportion of prostate cancer risk in this population that was attributable to the risk genotypes was 4.9%.
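The relation between the reported carrier frequencies and the odds ratio can be illustrated with a crude (unadjusted) calculation. The sketch below uses only the two percentages given above; it is not the logistic regression used in this study, so it is expected to differ slightly from the adjusted OR of 1.26. The function name is hypothetical.

```python
def crude_odds_ratio(p_cases: float, p_controls: float) -> float:
    """Unadjusted odds ratio from the proportion of carriers in each group."""
    if not (0.0 < p_cases < 1.0 and 0.0 < p_controls < 1.0):
        raise ValueError("proportions must lie strictly between 0 and 1")
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # CC or CG carriers: 24.1% of cases and 19.7% of controls (as reported).
    print(f"{crude_odds_ratio(0.241, 0.197):.2f}")  # 1.29
```

The small gap between this crude value and the reported estimate reflects the adjustment for age and geographic region in the regression model.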
There was no statistically significant association between the other six SNPs and prostate cancer risk. The frequency of the variant allele G of the nonsynonymous change D299G was similar in cases (0.051) and in controls (0.057; P = 0.31). Among the prostate cancer patients only, no association was found between any of the SNPs and clinical characteristics such as age of diagnosis (<65 versus ≥65 years), Gleason score (<8 and ≥8), or PSA levels (data not shown). The difference in the genotype frequency of 11381G/C between cases with a Gleason score <8 (24.2%) and ≥8 (23.8%) was not statistically significant. The age-adjusted means for total PSA levels were 16.0, 99.1, and 83.7 among the cases with the CC, CG, and GG genotypes of this SNP, respectively (P = 0.18).
Haplotype analysis did not provide additional support for an association between TLR4 haplotypes and prostate cancer risk (data not shown). There were seven inferred haplotypes in this population based on these seven SNPs. The overall frequencies of these haplotypes were not significantly different (χ2 = 6.52; df, 6; P = 0.37). The haplotype containing the C allele of SNP 11381G/C (haplotype A-T-A-A-A-G-C of SNPs −2026A/G, −1607T/C, 3622A/C, 3748A/C, D299G, 9615G/A, and 11381G/C) was present at a higher frequency in cases (12.43%) than in controls (10.35%; P = 0.04).
Because there has been some controversy regarding the boundary of the 3′-UTR for the TLR4 gene, we performed a 3′ RACE experiment to further define the 3′ end of TLR4 mRNA expressed in prostate. Our 3′ RACE results indicated that the 3′-UTR of TLR4 is 1691 bp longer than the 1127-bp NCBI-annotated 3′-UTR, with a total length of 2818 bp. These results indicate that the last SNP in our association study, 11381G/C, is actually located in the middle of the TLR4 3′-UTR.
There is growing evidence that chronic inflammation may play a role in the development of cancer within several organs, including the prostate. We are in the process of systematically evaluating possible associations between sequence variants of numerous inflammatory genes and prostate cancer risk. TLR4 is a central player in the signaling pathways that control the innate immune response and was selected to be among the first group of genes to be evaluated. We hypothesized that variations in TLR4 expression and function may be associated with individual susceptibility to cancer. To test this hypothesis, we performed a systematic genetic analysis of TLR4 sequence variants in ∼2100 men with or without prostate cancer in Sweden. We found a significant association between a sequence variant in the 3′-UTR of the TLR4 gene and prostate cancer risk. This variant conferred a 26% increased risk of prostate cancer in the study population overall and a 39% increased risk of being diagnosed with prostate cancer before age 65. The risk of prostate cancer in Sweden attributable to this variant was estimated to be 4.9%. Our study represents the first comprehensive evaluation of association between sequence variants of the TLR4 gene and cancer susceptibility.
Although the observed ∼1.3–1.4-fold increase in risk is modest, this is probably consistent with the magnitude of risk that we expect to observe for such a heterogeneous disease. Because genes in multiple pathways (such as androgen metabolism, growth factor, phase I and II detoxification, DNA repair, and inflammation) may alter the risk for prostate cancer, each individual gene is likely to contribute only a modest risk. This phenomenon is observed in other complex diseases, as reported in the most recent meta-analysis of genetic association studies in complex diseases (16) . After evaluating 301 published studies that attempted to replicate reported disease associations for 25 different genes, the authors of that pooled meta-analysis confirmed the disease associations for 8 of those genes. Interestingly, seven of these eight genes were associated with modest estimated genetic effects (OR between 1.07 and 1.76) in the pooled analyses. They concluded that there are probably many common variants in the human genome with modest but real effects on common disease risk and that studies using large samples are needed to convincingly identify such variants.
When considering the likelihood that this finding represents a true association between the SNP 11381G/C and the disease, it is important to examine the possibility of spurious effects due to chance (multiple comparisons), as well as confounding due to population stratification. Although there were seven significance tests for the primary hypothesis, these tests were not independent because these SNPs are in strong LD. It is unclear how we might make an appropriate adjustment; however, the degree of inflated type I error should be minimal. Similarly, although the observed differences in allele and genotype frequencies could be due to differences in the genetic background of the two groups, rather than disease status (i.e., population stratification), this is unlikely in this study. Genetic heterogeneity is less of a concern in Sweden than in the United States. In addition, this was a carefully designed population-based study. Almost all of the patients that met the inclusion criterion enrolled as participants in this study. Control subjects were frequency matched to cases based on residence area and age. Furthermore, the higher frequency of the risk genotypes (CC or CG) among cases diagnosed before the age of 65 years (26.3%) compared with either cases diagnosed at age 65 years or older (22.9%) or controls (19.7%) provided additional support for this association. Finally, the large number of study subjects decreases the possibility for statistical fluctuation and significantly increases our confidence in interpreting the results.
Although SNP 11381G/C is outside the NCBI-annotated 3′-UTR of TLR4, previous studies have not clearly defined this 3′-UTR. In fact, several expressed sequence tag clones were identified by our BLAST search against the human expressed sequence tag database, in which we used ∼2 kb of genomic sequence beyond the NCBI-annotated 3′-UTR. This computational analysis indicated that the 3′-UTR boundary could be further downstream of the NCBI-annotated 3′ end of TLR4. Further support was provided by our experimental 3′ RACE data, which showed that the 3′-UTR of TLR4 is 1691 bp longer than the NCBI annotation. Although SNP 11381G/C is located 78 bp beyond the 1127-bp NCBI-annotated 3′-UTR, our computational and laboratory studies show that this SNP is actually located in the center of the 2818-bp TLR4 3′-UTR.
The observed association of SNP 11381G/C may be due to biological impact associated with this variant, or it may indirectly reflect another unobserved causal variant of TLR4 that is in LD with this SNP. For example, SNP 11381G/C may itself influence the stability of the mRNA species. Alternatively, other SNPs in this region that are in strong LD with SNP 11381G/C may alter an AU-rich element motif and affect mRNA stability. There are at least 20 other known SNPs in this 2818-bp 3′-UTR. A large-scale evaluation of these SNPs and functional assessments are needed to address this question. On the other hand, it is unlikely that the observed association reflects the effects of other genes in this region, as the closest known gene [deleted in bladder cancer chromosomal region (DBCCR1)] is 2.5 Mb 3′ from TLR4.
Several reported nonsynonymous changes were either not observed (E474K and Q510H) or observed only once (V310G) among our sample of 96 controls. We therefore could not assess their association with prostate cancer in our study population. Another nonsynonymous change, D299G, was relatively frequent in our population (∼11% of control subjects had the variant genotypes) but was not associated with prostate cancer risk. The power calculation based on the prevalence of this SNP and the size of our study population suggested 80% power to detect an association at 5% significance level (two-sided test) if the SNP conferred at least a 1.45-fold increased risk. Multiple studies on the association between D299G and various phenotypes and diseases have been published, including the ability to recognize lipopolysaccharides, susceptibility to Gram-negative infections, premature birth, atherosclerosis, and asthma. The overall results were variable, with some reports providing positive findings (17, 18, 19, 20) whereas others did not observe an association (21, 22, 23) .
In summary, our study provided evidence of an association between a TLR4 sequence variant and prostate cancer risk. More studies are needed to confirm or refute this finding in independent populations and to understand the mechanism by which TLR4 sequence variants affect the expression and function of TLR4 in the signaling pathways that control innate immune response. Hopefully, our finding will also encourage additional research interest on the possible role of bacterial infection and inflammation in the development of prostate cancer.
We thank all study participants in the CAPS1 study. We thank Ulrika Lund for coordinating the study at Karolinska Institute as well as all of the urologists who recruited their patients to this study and provided clinical data to the National Prostate Cancer Registry. We also thank Karin Andersson, Susan Okhravi-Lindh, Gabriella Thorén-Berglund, and Margareta Åswärd at the Regional Cancer Registries in Umeå, Uppsala, Stockholm-Gotland, and Linköping. In addition, we thank Sören Holmgren and the personnel at the Medical Biobank in Umeå for skillfully handling the blood samples.
Grant support: Grants from the Swedish Cancer Foundation and Spear Grant from the Umeå University Hospital (Umeå, Sweden). Also partially funded by the Center for Human Genomics at Wake Forest University School of Medicine.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Requests for reprints: Jianfeng Xu, Wake Forest University School of Medicine, Medical Center Blvd, Winston-Salem, NC 27157. Phone: (336) 713-7500; Fax: (336) 713-7566; E-mail:
9 http://www.mayo.edu/statgen for the S-PLUS programming language; http://www.wfubmc.edu/genomics for the R programming language.
- Received October 18, 2003.
- Revision received February 6, 2004.
- Accepted February 9, 2004.
- ©2004 American Association for Cancer Research.