Germ line inactivating mutations in BRCA1 confer susceptibility for breast and ovarian cancer. However, the relevance of the many missense changes in the gene for which the effect on protein function is unknown remains unclear. Determination of which variants are causally associated with cancer is important for assessment of individual risk. We used a functional assay that measures the transactivation activity of BRCA1 in combination with analysis of protein modeling based on the structure of BRCA1 BRCT domains. In addition, the information generated was interpreted in light of genetic data. We determined the predicted cancer association of 22 BRCA1 variants and verified that the common polymorphism S1613G has no effect on BRCA1 function, even when combined with other rare variants. We estimated the specificity and sensitivity of the assay, and by meta-analysis of 47 variants, we show that variants with <45% of wild-type activity can be classified as deleterious whereas variants with >50% can be classified as neutral. In conclusion, we did functional and structure-based analyses on a large series of BRCA1 missense variants and defined a tentative threshold activity for the classification of missense variants. By interpreting the validated functional data in light of additional clinical and structural evidence, we conclude that it is possible to classify all missense variants in the BRCA1 COOH-terminal region. These results bring functional assays for BRCA1 closer to clinical applicability. [Cancer Res 2007;67(4):1494–501]
- functional assay
- unclassified variant
- breast cancer
The breast and ovarian cancer predisposition gene BRCA1 displays large allelic diversity with several thousand different alleles documented thus far (Breast Cancer Information Core Database). This is consistent with the expected allelic structure of genes that determine rare monogenic diseases ( 1). A large portion of the documented alleles have had their disease association determined by inference from genetic and biochemical evidence that indicates that even very small truncations of 11 COOH-terminal amino acids result in a protein with compromised function ( 2– 4). Furthermore, deletion analysis suggests that truncation of only eight amino acids may abrogate function ( 5). However, a significant number of BRCA1 alleles, mostly containing missense changes, have proved more difficult to assess for disease association. These are termed unclassified variants (UCV) or variants of uncertain significance. The lack of conclusive genetic information is primarily due to the low frequency of each individual UCV.
The problem generated by these UCVs is widely recognized and the need to provide risk assessment to individuals in high-risk families has brought several advances aimed at classifying these variants as follows: (a) Bayesian methods to analyze pedigrees ( 6); (b) use of information from interspecies sequence variation ( 7– 9); (c) integrated methods combining information from different sources in a comprehensive framework ( 10, 11); (d) functional assays to assess the effect of amino acid changes on protein function ( 12– 16); (e) methods based on co-occurrence with deleterious mutations ( 17); and (f) structure-based analysis to generate computational prediction models ( 18, 19). Although still far from clinical application, these methods have provided important information.
Here we apply a transcription-based assay to assess the effect of 22 variants (R1443G, V1534M, D1546N, L1564P, P1614S, E1644G, S1655F, L1664P, T1700A, G1706E, V1713A, V1736A, G1738R, G1738E, R1753T, I1766S, L1764P, Q1785H, G1788D, E1794D, V1804D, and P1806A) on the activity of the BRCA1 COOH terminus (amino acids 1,396–1,863) and interpret the findings in light of all available clinical, genetic, and structural information. These UCVs were found in individuals from families with breast or ovarian cancer in which no other deleterious mutation in BRCA1 or BRCA2 was found, and thus it is not clear whether they are causally related to disease. We also assessed the function of constructs representing the combination of the common polymorphism S1613G with other rare variants. The transcription-based method, used as a monitor for the integrity of the BRCT domain and its flanking regions, has shown an excellent agreement with existing genetic data ( 12, 13).
Several findings with significant implication for genetic counseling and risk assessment emerged from this study, which included a meta-analysis of 47 BRCA1 UCVs. The study determined tentative activity thresholds for functional classification of UCVs, identified regions less tolerant of mutations, and raised the testable hypothesis that there are no moderate-risk BRCA1 missense variants.
Materials and Methods
Constructs. Control constructs containing the wild-type (wt) BRCA1, S1613G, M1775R, and Y1853X were previously described ( 12). Mutations R1443G, V1534M, D1546N, L1564P, P1614S, E1644G, S1655F, L1664P, T1700A, G1706E, V1713A, V1736A, G1738R, G1738E, R1753T, I1766S, L1764P, Q1785H, G1788D, E1794D, V1804D, and P1806A were introduced by splicing by overlapping extension PCR ( 20) using plasmid p385-BRCA1 as template. Primer sequences are available on request. For each mutation, both products (5′ and 3′ regions) were combined and used as a template for a final round of PCR using 24ENDT and UX13 primers ( 12). To obtain the double mutant S1613G/V1534M, the same procedure and primers described above were used with the pCDNA3βHA-BRCA1 (gift from Ralph Scully, Beth Israel Deaconess Medical Center, Department of Medicine, Harvard Medical School, Boston, MA), which contains the S1613G polymorphism, as template. For the S1613G/H1402Y, S1613G/L1407P, S1613G/M1628V, and S1613G/T1685I double mutants, previously described primers ( 12) were used with the pCDNA3βHA-BRCA1 plasmid as template. The final PCR products were then digested with BamHI and EcoRI and ligated to pLex9 or pGBT9 vectors. All mutations were confirmed by sequencing. To obtain GAL4-DBD fusions in a mammalian expression vector, pGBT9 constructs were digested with HindIII and BamHI, then a 1.8-kb band was isolated and ligated into equally digested pCDNA3.
Transcription assay in yeast and in mammalian cells. The transcriptional assays were done essentially as described ( 12, 21). Briefly, Saccharomyces cerevisiae strain EGY48 was cotransformed with the effector plasmid pLex9, which contains a fusion of the LexA DNA binding domain and BRCA1 amino acids 1,396 to 1,863 carrying the different variants, and the reporter plasmid pRB1840, which contains a lacZ gene under the control of one LexA operator ( 22). At least three individual clones for each variant were tested in liquid β-galactosidase assays using o-nitrophenyl-β-d-galactopyranoside and the assays were done in triplicate. Activity was determined as a comparison to wt BRCA1 and S1613G (positive controls) or to M1775R and Y1853X (negative controls). For assays in mammalian cells, we used pG5Luc as a reporter and transfections were normalized with an internal control, phRG-TK (Promega, Madison, WI), which contains a Renilla luciferase gene under a constitutive TK basal promoter. Transfections were done in human 293T cells in triplicate using Fugene 6 (Roche, Indianapolis, IN); cells were harvested 28 h posttransfection, and luciferase activity was measured using a dual luciferase assay system (Promega). Western blots were incubated with α-GAL4 DBD mouse monoclonal antibody (Clontech, Mountain View, CA) or α-LexA DBD rabbit polyclonal antibody (Upstate, Charlottesville, VA). We used the SAS application package to calculate confidence intervals for validation.
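The normalization described for the mammalian assay can be illustrated with a short calculation. This is a minimal sketch, assuming that each firefly reading is divided by its paired Renilla reading and that the mean normalized signal is then expressed as a percentage of wt; the function names and all readings are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_activity(firefly, renilla):
    """Normalize each firefly reading by its paired Renilla (internal control) reading."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def percent_of_wt(variant_ratios, wt_ratios):
    """Express the mean normalized variant signal as a percentage of the wt mean."""
    return 100.0 * mean(variant_ratios) / mean(wt_ratios)


# Hypothetical triplicate readings in arbitrary luminescence units.
wt_ratios = relative_activity([1000.0, 1100.0, 900.0], [10.0, 11.0, 9.0])
variant_ratios = relative_activity([300.0, 270.0, 330.0], [10.0, 9.0, 11.0])
pct = percent_of_wt(variant_ratios, wt_ratios)
print(f"variant activity: {pct:.1f}% of wt")
```

Dividing by the internal control before averaging keeps well-to-well differences in transfection efficiency from being mistaken for differences in transactivation.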
Structural analysis. For residues located at the BRCT domain, we did an analysis based on the 1t29 structure ( 23) and a sequence alignment of 13 BRCA1 orthologues created with SAM-T2K homology search software ( 24) and subsequently hand-edited. The species used in the alignment are Homo sapiens (AAA73985), Pan troglodytes (AAG43492), Gorilla gorilla (AAT44835), Pongo pygmaeus (AAT44834), Macaca mulatta (AAT44833), Canis familiaris (AAC48663), Bos taurus (AAL76094), Monodelphis domestica (AAX92675), Mus musculus (AAD00168), Rattus norvegicus (AAC36493), Gallus gallus (AAK83825), Xenopus laevis (AAL13037), and Tetraodon nigroviridis (AAR89523). We used the RenderByAttribute routine in the molecular visualization program Chimera to color each residue position according to its percent conservation in the alignment and to visualize patches of conserved residues on the BRCT domain surfaces ( 25). For each of the 16 BRCT UCVs, we computationally replaced the wt side chain with that of the variant and optimized the conformation of the variant backbone and side chain atoms with the mutate model routine in MODELLER ( 26). Hydrogen bonds in wt and variant protein models were calculated with the Chimera FindHBond routine using default parameters. A more detailed description of the method and a computational analysis of 36 BRCT UCVs that integrates features of protein tertiary structure, evolutionary conservation, and amino acid residue properties can be found in a companion paper to our current study ( 27).
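The per-position percent conservation used for coloring can be sketched as follows. This is an illustration only, assuming that conservation is the share of aligned sequences carrying the most common residue in a column and that gap characters are counted like any other symbol; the exact definition used with Chimera is not given here, and the toy alignment is hypothetical.

```python
from collections import Counter


def percent_conservation(alignment):
    """Return, for each alignment column, the percentage of sequences
    that share the most common character in that column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(100.0 * most_common_count / len(column))
    return scores


# Hypothetical four-sequence, three-column alignment.
toy_alignment = ["MKT", "MKS", "MRT", "MKT"]
scores = percent_conservation(toy_alignment)
print(scores)
```

A score list like this could be written out as a per-residue attribute file and then mapped onto the structure for visualization.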
Functional analysis of missense variants. The location of the variants studied, as well as the negative and positive controls, is indicated by arrowheads in Fig. 1 . Two known deleterious/high-risk variants, M1775R and Y1853X, were used as negative (i.e., loss-of-function) controls and S1613G (a neutral polymorphism) and wt BRCA1 (amino acids 1,396–1,863) were used as positive controls ( 12). Because African Americans and Hispanics may disproportionately receive uninformative results ( 28, 29), we paid particular attention to UCVs found in minority populations. Four variants (P1614S, T1700A, Q1785H, and E1794D) have been documented in African Americans, two (V1713A and G1788D) in Hispanics (ref. 30; Breast Cancer Information Core Database), and two (L1564P and V1804D) in both ethnic groups ( 30, 31).
Six variants (R1443G, V1534M, D1546N, L1564P, P1614S, and E1644G) that lie upstream of the BRCT domains were investigated for their effect on transcription ( Fig. 2 ). This region displays relatively low conservation across other BRCA1 orthologues with no recognizable structural motif and was therefore expected to be more tolerant to changes ( 32). Variants R1443G, P1614S, and E1644G showed transcription activation levels equal to or higher than wt BRCA1 whereas V1534M, D1546N, and L1564P had lower activity (between 60% and 80% of the wt activity) in yeast ( Fig. 2A). In mammalian cells, variants V1534M, D1546N, P1614S, and E1644G showed transcription activation activity comparable to wt (within 1.5 SD) whereas variants R1443G and L1564P had reduced activity (∼55% of wt; Fig. 2B).
Sixteen UCVs in the BRCT domains (S1655F, L1664P, T1700A, G1706E, V1713A, V1736A, G1738R, G1738E, R1753T, L1764P, I1766S, G1788D, Q1785H, E1794D, V1804D, and P1806A) were also tested. Variants S1655F, T1700A, G1706E, V1713A, V1736A, G1738R, G1738E, R1753T, L1764P, and I1766S displayed markedly decreased activity with <40% of the wt activity in yeast and <20% in mammalian cells ( Fig. 2A and B). The Q1785H and E1794D variants displayed activity comparable to wt in both yeast and mammalian cells. Variants L1664P and V1804D showed activity comparable to the wt in yeast but ∼60% to 80% in mammalian cells, whereas variant P1806A showed activity comparable to the wt in mammalian cells but reduced (64%) in yeast cells. Finally, variant G1788D showed between 30% and 40% activity in both yeast and mammalian cells.
Expression levels were comparable for all variants in yeast cells ( Fig. 2C). In mammalian cells, several variants showed decreased expression levels, suggesting instability of the protein product ( Fig. 2D). Nevertheless, protein levels in mammalian cells seem to be a poor predictor of overall activity as V1534M or Q1785H, for example, showed very low levels but with activity comparable to wt ( Fig. 2B and D).
As a first approach, we arbitrarily considered >80% and <40% of wt activity as the thresholds to classify the variants as neutral or deleterious, respectively ( 12). Using these thresholds, our data indicate that (a) V1534M, P1614S, E1644G, Q1785H, and E1794D do not represent high-risk variants and are likely to be neutral; (b) S1655F, T1700A, G1706E, V1713A, V1736A, G1738R, G1738E, R1753T, L1764P, and I1766S represent deleterious/high-risk variants; (c) variants R1443G, D1546N, L1564P, L1664P, V1804D, and P1806A do not represent high-risk variants, although we cannot rule out the possibility that they may represent moderate-risk variants; and (d) variant G1788D does not represent a neutral variant, although we cannot rule out the possibility that it may represent a moderate-risk variant instead of a deleterious one. Importantly, despite variation in levels in yeast and mammalian assays, no variant presented clearly conflicting results (>80% in one test and <40% in another).
Analysis of double mutants. All previously published assays ( 3, 12, 13, 33) were done in the context of one molecular haplotype that corresponds to the wt sequence (designated as haplotype 1 in ref. 17). However, there are several frequent BRCA1 polymorphisms, such as S1613G, and of 10 common BRCA1 haplotypes, S1613G is present in five ( 17). Common haplotypes containing S1613G variants do not contribute to disease predisposition, and the presence of S1613G has been determined to have no effect on protein function ( 12, 33, 34). However, the role of S1613G has not been analyzed in the context of rare haplotypes. To examine whether the co-occurrence in cis of the S1613G polymorphism with other variants could affect activity, we combined it with neutral variant H1402Y, located at the coiled-coil motif, because it has been found in combination with S1613G (haplotype 2 in ref. 17). We also combined it with a predicted deleterious variant, L1407P, in the same motif ( 12). In addition, we arbitrarily chose M1628V, located in the vicinity of the polymorphism, and two other variants, V1534M and T1685I, located at approximately the same distance from the polymorphism. The presence of S1613G did not affect the activity of deleterious (S1613G/L1407P and S1613G/T1685I) or neutral (S1613G/H1402Y, S1613G/V1534M and S1613G/M1628V) variants ( Fig. 3 ).
Clinical data. Pedigrees corresponding to six variants analyzed here illustrate the difficulties in inferring causality even in large kindreds (Supplementary Fig. S1).
V1534M. The proband in this Italian family (M933) was diagnosed with ovarian and bilateral breast cancer and tested positive for the variant. Loss of heterozygosity (LOH) and sequencing analysis in one of the proband's breast tumors revealed loss of the wt allele. We also analyzed DNA from a histologically cancer-free specimen of breast tissue from the sister diagnosed with breast cancer at age 40 years and it revealed wt alleles only. Therefore, the LOH analyses from these two individuals are not conclusive.
D1546N. In family 230, four individuals were tested and two women with early-onset breast cancer do not carry the variant. However, it is possible that the variant derives from the mother's side of the family and the information from the father's side of the family is not relevant to conclude the absence of segregation.
P1614S. The African American family 2593 presents with six cases of breast cancer and one ductal carcinoma in situ and also carries the H2116R BRCA2 UCV. The proband and her sister with ductal carcinoma in situ tested positive for the variant, suggesting that this variant could account for the disease.
E1644G. In addition to the proband, we tested the mother, who was found not to carry the variant. Excluding the possibility of nonpaternity, this suggests that the variant is neutral because no cancer was reported in the deceased father and his family.
V1736A. Three specimens from two independent tumors from the proband of AUS, who had bilateral breast cancer, tested positive for the variant. Two specimens from the left breast tumor showed LOH of the mutant allele and one specimen from the right breast tumor showed no LOH.
V1804D. For family 1008, only two individuals affected with breast cancer were tested and both were shown to carry the variant. The proband's paternal side in family 922 is Native American. All three tested individuals affected with breast cancer carry the variant.
Structural analysis. To structurally rationalize the loss or retention of BRCA1 transcriptional activation in our panel of missense variants, we have divided the variants located at the BRCT domains into four categories: (a) putative disruption of the BRCT fold hydrophobic core; (b) putative disruption of the BRCA1 BRCT interaction with phosphorylated protein partners; (c) putative disruption of binding sites on the protein surface; and (d) no evidence for functional effect ( Table 1 ).
Validation. To evaluate the specificity and sensitivity of the transcription assay, we identified missense variants in the COOH terminus of BRCA1 that have been classified as either deleterious or neutral based on (a) co-occurrence with other deleterious variants, because compound heterozygosity for deleterious mutations leads, with high statistical probability, to embryonic lethality in humans ( 9, 17), and (b) integrated methods using multifactorial likelihood models ( 9– 11, 17; footnote 17). The thresholds for the overall odds for or against causality in the likelihood models are arbitrary. Thus, we chose to use both highly stringent (neutral: overall combined odds >100:1 against causality; deleterious: overall combined odds >1,000:1 for causality; ref. 10) and less stringent (neutral: >10:1 against causality; deleterious: >10:1 for causality) thresholds. Fourteen variants were classified as neutral and 10 variants were classified as deleterious using the less stringent thresholds ( Table 2 ). The more stringent thresholds reduce the number of classified neutral and deleterious variants to 13 and 6, respectively.
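The two sets of odds thresholds can be expressed as a small decision rule. This is a minimal sketch, assuming the combined evidence is summarized as a single odds value in favor of causality (so odds of 100:1 against causality correspond to 0.01); the function name and the representation are illustrative assumptions, not part of the likelihood models themselves.

```python
def classify_by_odds(odds_for_causality, stringent=True):
    """Classify a variant from its overall combined odds in favor of causality.

    Stringent thresholds: deleterious above 1,000:1 for causality,
    neutral above 100:1 against. Less stringent: 10:1 in either direction.
    """
    if odds_for_causality <= 0:
        raise ValueError("odds must be positive")
    if stringent:
        deleterious_min, neutral_max = 1000.0, 1.0 / 100.0
    else:
        deleterious_min, neutral_max = 10.0, 1.0 / 10.0
    if odds_for_causality > deleterious_min:
        return "deleterious"
    if odds_for_causality < neutral_max:
        return "neutral"
    return "unclassified"


# Hypothetical odds values, used only to show how the two thresholds differ.
print(classify_by_odds(50.0, stringent=False))  # deleterious
print(classify_by_odds(50.0, stringent=True))   # unclassified
```

A variant with intermediate odds is classified under the less stringent rule but left unclassified under the stringent one, which is why the stringent set of validated variants is smaller.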
The transcription assay correctly classified all 24 variants ( Table 2; Fig. 4 ). Because our observed values for sensitivity and specificity are both 100%, the confidence intervals are bounded above by 100% and only their lower limits are informative. We can exclude with 95% confidence that the sensitivity is lower than 69% (based on 10 of 10 samples) or 54% (based on 6 of 6 samples; using the more stringent threshold) and that the specificity is lower than 77% (based on 14 of 14 samples) or 75% (based on 13 of 13 samples; using the more stringent threshold).
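The lower limits quoted above are what an exact binomial interval gives when every sample is classified correctly. The following is a minimal sketch, assuming a two-sided 95% exact (Clopper-Pearson) interval; the text names only the SAS package, not the interval method, so this choice and the function name are assumptions.

```python
def exact_lower_bound(n, alpha=0.05):
    """Lower limit of the two-sided exact binomial interval when all n of n
    samples are classified correctly: the p that solves p**n = alpha / 2."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return (alpha / 2.0) ** (1.0 / n)


# Sample sizes from the validation sets: 10 and 6 deleterious, 14 and 13 neutral.
for n in (10, 6, 14, 13):
    print(n, round(100 * exact_lower_bound(n)))
```

Under this assumption the limits round to 69%, 54%, 77%, and 75%, matching the values reported in the text. The bound tightens slowly with n, which illustrates why a larger panel of validated variants is needed before the assay's error rates can be pinned down.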
Missense variants as a group. One important question about risk determination using functional assays is how to interpret quantitative results. It is possible that risk is inversely correlated with protein activity in a continuous fashion ( Fig. 4A). In that case, variants will present as a continuous series of high-risk, intermediate-risk, and low-risk variants. However, it is also possible that variants will be either high or low risk with virtually no moderate-risk variant found in the population ( Fig. 4B). To begin to address this problem, we have plotted the activity (and its range of variation) of all the 47 variants tested thus far in the transcription activation assay in a quantitative manner ( Fig. 4C). Next, we identified a validated deleterious variant that displayed the highest activity (R1699W; Table 2; Fig. 4C) and a validated neutral variant that displayed the lowest activity (L1564P; Table 2; Fig. 4C). Interestingly, these variants provide a very narrow intermediate range of activity (<50% and >45%). This result suggests that the assay classifies variants either as high or low risk with no variant considered as moderate risk.
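The way the intermediate range is derived from the validated variants can be sketched in a few lines. This is an illustration under stated assumptions: the variant labels and activity values below are hypothetical placeholders, and the helper names are ours; only the logic (highest-activity deleterious variant versus lowest-activity neutral variant) follows the text.

```python
def activity_gap(validated):
    """Return (highest activity among deleterious, lowest activity among neutral).

    `validated` maps a variant label to (classification, percent of wt activity).
    """
    deleterious = [a for c, a in validated.values() if c == "deleterious"]
    neutral = [a for c, a in validated.values() if c == "neutral"]
    if not deleterious or not neutral:
        raise ValueError("need at least one variant of each class")
    lower, upper = max(deleterious), min(neutral)
    if lower >= upper:
        raise ValueError("classes overlap; no clean gap exists")
    return lower, upper


def classify_by_activity(percent_wt, lower, upper):
    """Apply the gap as tentative thresholds; values inside it stay unresolved."""
    if percent_wt <= lower:
        return "deleterious"
    if percent_wt >= upper:
        return "neutral"
    return "intermediate"


# Hypothetical validated set (labels and values are placeholders).
validated = {
    "del_a": ("deleterious", 5.0),
    "del_b": ("deleterious", 45.0),
    "neu_a": ("neutral", 50.0),
    "neu_b": ("neutral", 110.0),
}
lower, upper = activity_gap(validated)
print(lower, upper, classify_by_activity(30.0, lower, upper))
```

If the two classes overlapped, no single cutoff could separate them and the function raises an error rather than returning a misleading range.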
Reliable classification of BRCA1 alleles containing missense changes remains a top priority for risk assessment for breast and ovarian cancer. Unfortunately, it seems clear that in the case of most rare variants, no single data source is informative enough to unambiguously classify them as neutral or deleterious ( 10, 19). To contribute to the classification of these problematic alleles, we have developed a functional test based on transcription activation that has provided a reproducible and standardized way to assess the effect of these variants on protein function ( 12, 13, 21). The transcription assay is a monitor for the integrity of the BRCT domain, and there is emerging evidence that it can also reliably predict functional effect in a region preceding the BRCT domains including the coiled-coil motif ( 3, 5, 12, 13, 33). Recent experimental evidence has indicated that BRCT domains are specialized motifs that bind phosphorylated peptides ( 23, 35– 41). Importantly, the BRCT region involved in phosphopeptide recognition colocalizes with a region that is critical for transcription activation in the heterologous system we use ( 18) and may underlie a structural basis for the correlation between the transcription assay and the integrity of the BRCT domain.
One problem with a quantitative approach for classification is at which activity level one should draw a cutoff value. To address this problem, we analyzed all variants tested to date and identified the ones that had been classified using integrated methods. The identification of the neutral variant showing the lowest activity (L1564P) and of the deleterious variant with the highest activity (R1699W) led to the surprising finding that, given the small interval between these variants, there were only two classes (high risk and low risk) with no intermediate (moderate risk) class ( Fig. 4). This supports the assumption used in the proposition of the integrated method that variants were either high or low risk ( 10). Whereas this may reveal a limitation of the assay, it may also reflect a biological property of BRCA1 alleles. Further studies are needed to investigate this notion with important implications for risk assessment.
Whereas it has been estimated that ∼13% of individuals undergoing testing receive uninformative reports due to the finding of a UCV, the picture is likely to be more somber for members of minority populations. In a recent study, it was shown that African Americans had a higher rate of UCVs (46%) than Caucasians (12%; ref. 29). Therefore, we analyzed eight variants (L1564P, P1614S, T1700A, V1713A, Q1785H, G1788D, E1794D, and V1804D) that have been found in African Americans and/or Hispanics ( 30, 31, 42).
Using the threshold of activity defined in Fig. 4 (>50% for neutrals and <45% for deleterious), variants T1700A, V1713A, and G1788D displayed activity compatible with a deleterious classification, whereas L1564P, P1614S, Q1785H, E1794D, and V1804D displayed activity compatible with a neutral classification ( Fig. 4). For the L1564P, P1614S, and V1804D variants, the functional results are consistent with co-occurrence analyses indicating that they are neutral variants ( Table 2; ref. 17). Structural analysis supports a neutral classification for Q1785H and V1804D and a deleterious classification for T1700A, V1713A, and G1788D ( Table 1). Interestingly, the T1700 residue is part of the phosphoserine binding pocket in which it makes hydrogen bonding interactions with the serine hydroxyl group, and it may play an important role in binding specificity ( 35, 36, 41). These results also highlight the difficulty in relying on limited pedigree data that may suggest cosegregation with the cancer phenotype (Supplementary Fig. S1).
We also did functional tests on 14 additional variants and functionally classified six (R1443G, V1534M, D1546N, E1644G, L1664P, and P1806A) as neutral and eight (S1655F, G1706E, V1736A, G1738R, G1738E, R1753T, L1764P, and I1766S) as deleterious ( Fig. 4). The V1534M classification is supported by co-occurrence data ( Table 2) whereas the D1546N classification is supported by clinical data, specifically from family 230 (Supplementary Fig. S1) in which at least two affected women have been shown not to carry the variant. Structural analysis is consistent with the classification of all BRCT variants (S1655F, L1664P, G1706E, V1736A, G1738R, G1738E, R1753T, L1764P, I1766S, and P1806A; Table 1). However, caution should be exercised in the classification until further evidence for these thresholds is obtained.
S1613G is a common polymorphism that displays a wide geographic occurrence but does not make a significant contribution to breast or ovarian cancer risk ( 34). Its high allele frequency, ranging from 0.2 to 0.55 in various populations, makes it probable that haplotypes exist with S1613G in combination with other neutral or deleterious rare variants. The S1613G variant is present in 5 of the 10 most common BRCA1 haplotypes ( 17). This raised the question of whether there could be an interaction between these missense changes that confers different properties on these haplotypes. We investigated this by comparing naturally occurring and hypothetical haplotypes containing a combination of neutral and deleterious variants. We found that haplotypes containing Gly1613 were no different from haplotypes containing Ser1613 in terms of activity in the functional assays. Although we cannot rule out that the transcriptional assay may not be sensitive enough to detect small changes and that these haplotypes may behave differently in vivo, our results suggest that S1613G does not modify the risk conferred by other missense variants in cis. This, however, may not hold true for other common polymorphisms.
In conclusion, we have tested a series of common and rare UCVs of BRCA1 with an emphasis on variants found in minority ethnic groups. Combining functional, structural, and clinical information, we classified these alleles as either deleterious or neutral. We have also shown that the S1613G polymorphism does not alter the effect of other rare variants. Importantly, our analysis of 47 variants allowed us to define a tentative threshold of activity for classification and to propose the hypothesis that these rare missense variants confer either high or low risk but not moderate risk. If proved, this hypothesis will have profound implications for genetic counseling.
As a discipline, genetic counseling has seen a tremendous transformation. In the recent past, genetic counseling focused almost exclusively on pediatric syndromes that, by and large, are completely penetrant and do not present confounding phenocopies. Presently, issues of genetic counseling for cancer have come to the forefront. Data from Ontario illustrate this trend, with breast cancer consultations surpassing the number of consultations for all other reasons, including pediatric conditions and fertility issues ( 43). Many important problems surface about the determination of individual risk of cancer when incomplete penetrance and frequent phenocopies confuse the picture, as is the case with BRCA1. Thus, extensive family genotyping, complementary methods for detection, and classification of alleles become paramount.
Grant support: NIH grant CA92309 (A.N.A. Monteiro); NIH Breast Cancer Specialized Program of Research Excellence Award CA116201-P2 (F.J. Couch); American Cancer Society Research Scholar grant (F.J. Couch); Associazione Italiana per la Ricerca sul Cancro/Fondazione Italiana per la Ricerca sul Cancro, Special Project Hereditary Tumors grant (P. Radice); Italian Ministry of Instruction, University and Research grant RBNE014975 (P. Radice); NIH grant F32 GM072403-02 (R. Karchin); NIH grants R01 GM54762 and U01 GM61390, the Sandler Family Supporting Foundation, IBM, Hewlett Packard, Intel, and NetApps (A. Sali); and the Molecular Imaging and the Molecular Biology cores at the H.L. Moffitt Cancer Center.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the members of the Monteiro Lab for helpful comments; Susan Domcheck, Linda Wadum, Kiley Johnson, and Jenny Mentlick for collecting family data and samples; Carla B. Ripamonti for performing analysis on surgical specimens; and Dana Rollison for help with statistical analysis.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
M.A. Carvalho and S.M. Marsillac contributed equally to this work.
Conflict of interest: Results from this work may bear on Myriad Genetic Laboratories commercial test for mutations in BRCA1 and BRCA2. D. Goldgar receives royalties from the University of Utah through its license agreement with Myriad Genetics, Inc.
17 D. Goldgar, unpublished data.
- Received September 5, 2006.
- Revision received November 13, 2006.
- Accepted December 15, 2006.
- ©2007 American Association for Cancer Research.