Epidemiology and Prevention
Departments of 1 Health Sciences Research and 2 Laboratory Medicine and Pathology, Mayo Clinic College of Medicine, Rochester, Minnesota; and 3 H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
Requests for reprints: Celine M. Vachon, Department of Health Sciences Research, Charlton 6-239, Mayo Clinic, 200 First Street SW, Rochester, MN 55905. Phone: 507-284-9977; Fax: 507-266-2478; E-mail: vachon.celine{at}mayo.edu.
| Introduction |
|---|
Epidemiologic risk factors associated with mammographic density (MD), including strong inverse associations with body mass index (BMI) and age, account for only 20% to 30% of variation in the trait (7, 10, 11). Genetic factors and the interaction between genes and environment likely account for the remaining variation (12). Evidence for a genetic influence comes from familial aggregation (13, 14), family-based segregation (13), and twin (12, 15) studies. A large study in two populations of monozygotic and dizygotic twins estimated heritability at 65% to 74% in age-adjusted analyses and 60% to 67% in multivariable-adjusted analyses (12). Gene association studies have also been used to identify genetic factors involved in increased MD (16–23), but few findings have been replicated to date (20, 23), illustrating that the selection of candidate genes based on our limited understanding of the biology of MD has not been particularly instructive.
Genetic linkage analysis represents another approach to identify genes for MD, one that does not require complete understanding of the biology of the trait. Previous work suggesting evidence for a major gene effect (13) and the evidence for a high degree of heritability of MD (12) justify a linkage analysis approach to identify genes influencing the trait. Here, we present the first genome-wide linkage analysis of MD in a large collection of families and provide strong evidence for a gene or genes influencing MD on chromosome 5p.
| Materials and Methods |
|---|
Simulation studies were done to identify the families most informative for linkage analyses. A subset of 90 of the 426 families was selected, and 1,146 family members were invited to provide a blood or buccal sample as a source of DNA; 901 (79%) consented. After the exclusion of 12 individuals due to Mendelian (familial) inconsistencies across markers, the final sample included 89 families, with 889 Caucasian individuals (133 men, 756 women). As part of the parent study, women provided the location of their most recent mammogram and permission to obtain and digitize their mammograms. Mammograms were requested from clinics across the United States, and all were recent mammograms done over the 1990 to 2001 period, when national standards were in place for mammography. Among the 737 age-eligible women, we retrieved the mammograms of 658 (89%). Both craniocaudal and mediolateral oblique views were available for 618 women (82% of the 756 genotyped women). Five percent of women had a breast cancer diagnosis during the follow-up period (2000–2002); for these women, mammograms before the diagnosis were used.
The protocol was approved by the Mayo Clinic Institutional Review Board.
Percent MD phenotype estimation. Original mammograms were obtained on 658 women and digitized on a Lumiscan 75 scanner with 12-bit grayscale depth. The pixel size was 0.130 × 0.130 mm² for both the 18 × 24 cm² and 24 × 30 cm² films. MD was estimated for each view using a computer-assisted thresholding program (ref. 3; Fig. 1) with proven reliability (25). For this study, the MD estimates from the mediolateral oblique and craniocaudal views were averaged and used as the phenotype.
Genotyping methods. The genome-wide screen initially consisted of 400 microsatellite DNA markers across the chromosomes from the ABI Prism Linkage Mapping Set version 2.5 (PE Applied Biosystems; ref. 27). Six markers (on chromosomes 1, 4, 13, 14, and 16) were replaced (three of these with two new markers) due to the presence of null alleles (one marker), amplification problems (four markers) or strong allele bias (one marker), bringing the total number of markers to 403. The average information content across all chromosomes was 85% (25th–75th percentiles; 83.2–88.4%), and the average intermarker distance was 8.99 cM (25th–75th percentiles; 6.1–11.1 cM) based on the deCODE linkage map (28); the five largest gaps ranged from 21.0 to 28.0 cM and were located on chromosomes 1, 5q, 6p, 6q, and 8q.
The genome scan was done within the Mayo Clinic Genome Shared Resource by standard methods (27). Genotyping for the fine mapping of the identified candidate region consisted of 21 markers spaced 1.6 cM (±0.27) apart and was done by deCODE Genetics (29). A total of 91 patient samples, 2 duplicates, and 3 controls from the Centre d'Etude du Polymorphisme Humain (CEPH) were run per 96-well plate for both the genome scan and fine mapping to evaluate the quality of the genotyping.
Genotype quality. The genotype data from the initial scan were evaluated for genotyping accuracy using locally written procedures to assess Mendelian consistency within each of the families and the PREST (30) program to assess relationships. After accounting for relationship misclassifications, all markers were again assessed for Mendelian consistency. From the PREST results, we found convincing evidence, in eight families, of full siblings who were actually half siblings and corrected the relationships. In addition, we identified one individual, labeled as the mother, who was not blood related to her offspring, and excluded her from all analyses.
Pairwise comparisons were made for all pairs of subjects to examine the percentage of identical genotypes across all markers to ensure that all monozygotic twins were correctly identified, and no plating errors had occurred. No discrepancies were found. The potential for departures from Hardy-Weinberg equilibrium (HWE) was assessed through a resampling approach where a single individual was randomly selected from each family. All markers for both the genome scan and fine mapping were in HWE defined by P > 0.05. For the genome scan and fine mapping, all duplicate samples were concordant across all markers, and CEPH controls matched across all plates. The genome screen resulted in 357,172 (98.9%) genotypes that were Mendelian consistent and useable in analysis. For the fine mapping, 48,566 genotypes (99.3% of those called) were available for analysis.
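The pairwise comparison described above can be illustrated with a minimal sketch. It is not the locally written procedure used in the study: the data layout (a mapping from sample ID to per-marker allele pairs), the function names, and the 0.95 flagging threshold are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_concordance(genotypes):
    """Return {(id_a, id_b): fraction of identical genotypes} for every pair.

    `genotypes` maps a sample ID to a list of genotype calls, one per marker.
    Each call is a tuple of two allele labels, or None when the call is missing.
    Markers missing in either sample are skipped. (Hypothetical data layout.)
    """
    result = {}
    for a, b in combinations(sorted(genotypes), 2):
        shared = [(x, y) for x, y in zip(genotypes[a], genotypes[b])
                  if x is not None and y is not None]
        if not shared:
            continue  # no marker was typed in both samples
        # Allele order within a genotype is irrelevant, so compare sorted pairs.
        same = sum(1 for x, y in shared if sorted(x) == sorted(y))
        result[(a, b)] = same / len(shared)
    return result


def flag_near_identical(genotypes, threshold=0.95):
    """List pairs whose concordance meets an assumed, illustrative threshold."""
    return [pair for pair, frac in pairwise_concordance(genotypes).items()
            if frac >= threshold]
```

Pairs flagged this way would be expected to correspond to known monozygotic twins or duplicate samples; any other flagged pair would point to a possible plating or labeling error.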
Statistical analyses. Quantitative trait linkage analyses using a variance components approach (31) were done using the EMVC software package, which uses an expectation-maximization algorithm to estimate genetic variance components (32). The multipoint identical-by-descent (IBD) sharing probabilities used in EMVC were estimated in SIMWALK2 (33) for the autosomes. IBD estimates for the X chromosome are not available in SIMWALK2. We therefore used the utilities incorporated in the MERLIN (34) suite of linkage analysis tools (MINX) to perform variance component linkage analyses along the X chromosome after breaking the large families into smaller subsets to enable computation. These variance component approaches estimate the variability in the trait explained by genetic sharing specifically attributable to a single locus and by genetic sharing broadly attributable to familial relationships. Variance components were estimated by maximum likelihood while adjusting for nongenetic correlates of MD. Log odds for linkage (LOD) scores were obtained by comparing the likelihoods from models with and without accounting for locus-specific variability (31). LOD scores >3.3 were considered to provide significant evidence in favor of linkage; LOD scores >2.2 were considered suggestive for linkage (35). Support intervals were identified as the continuous genetic region surrounding the maximum LOD score that had LOD scores no smaller than the maximum LOD score minus one.
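The LOD thresholds and the support-interval rule can be made concrete with a short sketch. It assumes that the two model fits are summarized by their maximized natural-log likelihoods and that LOD scores are available on a grid of map positions; the function names and the example values are illustrative and are not output from EMVC or MERLIN.

```python
import math


def lod_score(loglik_linked, loglik_null):
    """LOD = log10 likelihood ratio, computed from natural-log likelihoods."""
    return (loglik_linked - loglik_null) / math.log(10)


def classify(lod, significant=3.3, suggestive=2.2):
    """Apply the thresholds stated in the text (>3.3 significant, >2.2 suggestive)."""
    if lod > significant:
        return "significant"
    if lod > suggestive:
        return "suggestive"
    return "none"


def one_lod_support_interval(positions_cm, lods, drop=1.0):
    """Contiguous region around the peak where LOD >= (maximum LOD - drop)."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions_cm[left], positions_cm[right]
```

For example, with hypothetical LOD scores of 0.5, 2.0, 3.4, 4.2, 3.3, and 1.0 at positions 0, 5, 10, 15, 20, and 25 cM, the peak is 4.2 at 15 cM and the 1-LOD support interval runs from 10 to 20 cM.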
Variance component models were evaluated for the primary phenotype, mean MD (mean of craniocaudal and mediolateral oblique views), adjusted only for age and also adjusted for covariates previously associated with MD in this cohort (10). Covariates were obtained from a self-administered questionnaire completed within a median of 4 months (interquartile range, 0.96–15.7 months) of the mammogram date and included age (1/age), body mass index (1/BMI), menopausal status, hormone therapy (HT), lifetime alcohol consumption, oral contraceptive (OC) use, education, number of live births, age at first live birth, and pack-years of smoking. Only women with complete information on the primary phenotype and all of the covariates (n = 583) were used in the linkage analyses. However, the genetic information from all those who had been genotyped, including males (n = 133) as well as women with missing covariate data (n = 38) or phenotype data (n = 138), was used in the estimation of IBD sharing described above. Secondary analyses were done after removing the 5% of women who developed breast cancer over the course of the study.
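To illustrate what adjusting for the reciprocal transforms 1/age and 1/BMI means, the sketch below regresses MD on those two terms by ordinary least squares and returns the residuals. This is a simplification: in the study, covariates were adjusted for within the variance component likelihood, not by a separate residualization step, and only two of the listed covariates are included here. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design_row(age, bmi):
    """Intercept plus the reciprocal transforms named in the text."""
    return [1.0, 1.0 / age, 1.0 / bmi]


def adjusted_residuals(md, ages, bmis):
    """Least-squares fit of MD on 1/age and 1/BMI; returns (coefficients, residuals).

    Illustrative stand-in only: the study estimated covariate effects jointly
    with the variance components by maximum likelihood.
    """
    rows = [design_row(a, b) for a, b in zip(ages, bmis)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, md)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(bi * xi for bi, xi in zip(beta, r)) for r in rows]
    return beta, [y - f for y, f in zip(md, fitted)]
```

The residuals play the role of the covariate-adjusted phenotype; a reciprocal transform lets the fitted effect of age or BMI flatten out at high values instead of declining linearly.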
Because BMI is inversely correlated with MD (r = –0.50 in our sample), we wanted to ensure we were identifying genes for MD, not for BMI. Thus, we did linkage analyses for mean MD with and without adjusting for BMI. In addition, for those chromosomes found to have regions of suggestive linkage, we did linkage analyses of BMI with and without adjustment for MD (the genome-wide scan of BMI is the subject of a separate report). Additionally, we estimated the genetic correlation between these two traits via maximum likelihood methods and did bivariate linkage analyses (36) on both MD and BMI, again at our suggestive loci, to assess whether one genetic locus influenced both MD and BMI. A high genetic correlation would suggest that much of the correlation between two traits (in this case BMI and MD) could be explained by the same genes. In the case where a putative gene is actually linked to both traits, we would expect to see a significant increase in the LOD for the bivariate compared with the univariate linkage analyses. All of these models were run in EMVC while adjusting for the covariates listed above.
After identifying a genomic region where there was evidence for linkage following fine mapping, we did two additional analyses to further inform our locus. First, we estimated the magnitude of the effect corresponding to the putative locus by extracting the variance component corresponding to the locus-specific genetic sharing at the location with the maximum LOD score. The ratio of this variance component to the overall trait variance reflects the proportion of the variability explained by the locus, although this is likely an overestimate, because genetic effect sizes based on an initial linkage scan are known to be upwardly biased (37). Second, we did a bootstrap simulation to estimate the confidence in our region. We resampled families with replacement to form 10,000 bootstrap data sets and computed LOD scores for each bootstrap sample to determine how consistently they were >3.3.
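Both follow-up analyses can be sketched in a few lines. The family-level bootstrap below takes the linkage computation as a caller-supplied function, because refitting a variance component model is beyond the scope of an example; `summed_family_lod` is a toy stand-in that assumes each family contributes additively to the LOD at each position, which is not how the study recomputed LOD scores. The three-component variance decomposition (locus, polygenic, residual) is likewise an assumption of this sketch.

```python
import random


def locus_variance_fraction(locus_var, polygenic_var, residual_var):
    """Share of total trait variance attributed to the linked locus."""
    return locus_var / (locus_var + polygenic_var + residual_var)


def bootstrap_lod_exceedance(families, max_lod_fn, n_boot=10_000,
                             threshold=3.3, seed=0):
    """Fraction of family-level bootstrap samples whose maximum LOD exceeds threshold.

    `families` is a list of per-family data objects; `max_lod_fn` is a
    hypothetical callable that returns the maximum LOD for a list of families.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        # Resample whole families with replacement, keeping the sample size fixed.
        sample = [rng.choice(families) for _ in families]
        if max_lod_fn(sample) > threshold:
            hits += 1
    return hits / n_boot


def summed_family_lod(sample):
    """Toy stand-in: each family is a list of per-position LOD contributions."""
    totals = [sum(vals) for vals in zip(*sample)]
    return max(totals)
```

Resampling families, not individuals, keeps the within-family correlation structure intact in every bootstrap data set.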
| Results |
|---|
Both mediolateral oblique and craniocaudal mammogram views were available on 618 of the 756 genotyped women, and 583 (94%) of these women also had complete covariate information and were used in analyses. Mean MD was 26.5% and ranged from 0% to 73.2%, with an SD of 15.8. The distribution of mean MD was slightly positively skewed and had a slight negative kurtosis (skewness, 0.57; kurtosis, –0.20). After accounting for nongenetic covariates, the adjusted MD distribution conformed well to the normality assumption; the Shapiro-Wilk test did not show evidence for departures from normality (P = 0.23); the skewness was 0.15, and the kurtosis was 0.01. Overall, the women in the genome screen were primarily postmenopausal (69%), parous (85%), high school educated (84%), moderate drinkers (82% weekly or monthly), and 41% had used postmenopausal hormones (Table 1). MD was positively associated with education, lifetime alcohol consumption, current OC and HT use, years of OC use, nulliparity, and premenopausal status and inversely associated with age and BMI. MD was not associated with pack-years of smoking (r = –0.02, P = 0.56) in this set of families. The greatest variability in MD was explained by BMI and age (Table 1).
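For readers unfamiliar with the shape statistics reported above, the sketch below computes moment-based skewness and excess kurtosis, both of which are 0 for a normal distribution. The statistical package used in the study may apply small-sample bias corrections, so values from this sketch would not necessarily match the reported figures exactly.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; positive values indicate a longer right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0
```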
| Discussion |
|---|
Unraveling the genetic components of complex diseases is enhanced through focus on the genetics of heritable risk factors for the disease. For example, in the cardiovascular disease arena, studies focus on the risk factors of cholesterol, blood pressure, lipoproteins, and clotting factors (38, 39); the same has been seen in type 2 diabetes, with linkage analyses of BMI and insulin response (40). By analogy, we focused our efforts on a quantitative trait that is strongly associated with breast cancer risk in more than 40 studies to date (1) and has a proven genetic component (12–14). The locus for MD on chromosome 5p has not been previously identified in linkage analyses for breast cancer [e.g., chromosomes 8p (41), 13q (42), 2q (43), and 4p (44)], underscoring the merit of this approach to identify novel genes that may influence both MD and breast cancer risk.
Although this is the first family linkage study of MD, several genetic association studies have been conducted to identify genes influencing MD (16–23). Given the hormonal basis of breast cancer, most studies have explored genes involved in hormone metabolism (17–20, 23) or the insulin-like growth factor (IGF) pathway (16, 21, 22). No clear associations have been established, and these candidate genes have not explained a large portion of variation in MD. None are located within the identified chromosome 5p region (Supplementary Table), but interestingly, IGF-I does lie just outside the 2-LOD region surrounding the maximum LOD on chromosome 12q.
MD is associated with stromal fibrosis and epithelial proliferation with or without atypia (7). Positive correlations of percent MD have also been found with cellularity of breast tissue (45, 46); stained areas of collagen (45–47); the stromal proteins lumican and decorin (47); and tissue inhibitor of matrix metalloproteinase-3 (TIMP-3; ref. 45). These findings implicate genes encoding proteins involved in proliferative activity, maintenance, and regulation of the breast epithelium, stroma, extracellular matrix, and fat as candidate modifiers of MD (46). Several of the 45 genes in the 1-LOD interval surrounding the chromosome 5p peak LOD are involved in these processes and could influence MD (Supplementary Table). The prolactin receptor, which lies between the 1- and 2-LOD regions surrounding the maximum LOD on 5p, is also a strong candidate because MD is positively associated with prolactin in postmenopausal women (48). However, our linkage signal might be due not to genes but to other important regulators in the noncoding DNA, such as microRNAs. In addition, the importance of intronic regions is only beginning to be understood (49).
Our study had several strengths, including a large sample of families with high participation rates and detailed risk factor information. Adjustment for other risk factors, in particular BMI, proved to be critical to efforts to identify the linkage signal. BMI and MD likely operate on breast cancer through different biological pathways, but their strong negative correlation leads to underestimation of the effects of either pathway if they are not adjusted for each other (50). Using the semiautomated measure of percent MD, which has been shown to provide the strongest association with breast cancer risk across studies (1), is also a strength. Finally, the genes in the identified candidate region on chromosome 5 have not been previously examined in relation to breast cancer risk. As such, a link with MD would provide new insight into mechanisms through which some breast cancers might occur.
This study was limited to Caucasian women in the Midwest, and it is not clear if our findings can be generalized to other populations. Also, although our data are provocative, rigorous examination of genes in the chromosome 5p region in relation to levels of MD will be required to identify the specific gene or genes contributing to the linkage signal. In addition, follow-up of the suggestive regions on chromosome 12 could provide new insight into genes for MD.
In conclusion, there may be at least one gene influencing MD on chromosome 5p. The identification of genes for MD could translate to both the identification of novel genes for breast cancer and biological targets for the reduction of density.
| Acknowledgments |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
| Footnotes |
|---|
Received 3/22/07. Revised 5/18/07. Accepted 6/25/07.
| References |
|---|
and progesterone receptor polymorphisms on the effects of hormone therapy on mammographic density. Cancer Epidemiol Biomarkers Prev 2006;15:462–7.
gene and mammographic density. Cancer Epidemiol Biomarkers Prev 2005;14:2655–60.