Neoplastic progression is an evolutionary process characterized by genomic instability and waves of clonal expansions carrying genetic and epigenetic lesions to fixation (100% of the cell population). However, an evolutionarily neutral lesion may also reach fixation if it spreads as a hitchhiker on a selective sweep. We sought to distinguish advantageous lesions from hitchhikers in the premalignant condition Barrett’s esophagus. Patients (211) had biopsies taken at 2-cm intervals in their Barrett’s segments. Purified epithelial cells were assayed for loss of heterozygosity and microsatellite shifts on chromosomes 9 and 17, sequence mutations in CDKN2A/MTS1/INK4a (p16) and TP53 (p53), and methylation of the p16 promoter. We measured the expanse of a lesion in a Barrett’s segment as the proportion of proliferating cells that carried a lesion in that locus. We then selected the lesion having expanses >90% in the greatest number of patients as our first putative advantageous lesion. We filtered out hitchhikers by removing all expanses of other lesions that did not occur independent of the advantageous lesion. The entire process was repeated on the remaining expanses to identify additional advantageous lesions. p16 loss of heterozygosity, promoter methylation, and sequence mutations have strong, independent, advantageous effects on Barrett’s cells early in progression. Second lesions in p16 and p53 are associated with later selective sweeps. Virtually all of the other lesion expansions, including microsatellite shifts, could be explained as hitchhikers on p16 lesion clonal expansions. These techniques can be applied to any neoplasm.
Neoplastic progression has long been recognized as an evolutionary process (1) . Cells in neoplasms acquire heritable genetic and epigenetic lesions that affect their survival and reproduction, typically generating clonal heterogeneity (2, 3, 4, 5, 6) with evolution by natural selection (7) . Indeed, most of the alterations posited as hallmarks of cancer provide a selective advantage to the neoplastic cells (8) . However, fundamental principles of evolutionary biology have rarely been used to elucidate factors permissive for clonal expansion of genetic lesions in neoplasms.
Classically, the temporal evolution of somatic genetic lesions during human neoplastic progression has been inferred from frequencies of abnormalities in samples from different patients (9, 10, 11, 12) . For example, one of the best characterized molecular models, human colon cancer, was developed by comparing the frequencies of molecular lesions in adenomas of different sizes and histological grades from different patients (9) . In many conditions, such “intuitive frequency” analyses are the only way to study progression because many premalignant lesions, including colonic adenomas, are removed when they are detected. Although much has been learned about multistage progression from these studies, other model human neoplasms in which multistage evolution can be studied in the same patient spatially and temporally could improve our understanding of the evolutionary dynamics of neoplastic progression.
Barrett’s esophagus (BE) is an ideal model in which to investigate clonal evolution in human epithelial neoplasia in vivo (13) . Patients with BE present with symptoms of gastroesophageal reflux, such as heartburn, and the premalignant epithelium can be visualized at diagnostic upper endoscopy (14) . Multiple biopsies can be safely obtained at endoscopy according to standardized protocols that permit localization of genetic lesions in the premalignant epithelium, and these lesions can be followed over time because periodic endoscopic biopsy surveillance is recommended for early detection of cancer (14 , 15) . We have previously used clonal ordering, a technique in which the order of mutations can be deduced in individual Barrett’s segments, to determine the order in which genetic lesions developed in neoplastic progression (2 , 13 , 16) . We then additionally evaluated candidate lesions, such as 17p (p53) loss of heterozygosity (LOH) and flow cytometric abnormalities, in prospective cohort studies to determine whether they predict progression to esophageal adenocarcinoma (17, 18, 19) . These results indicate that neoplastic progression in BE is associated with a process of genomic instability and progressive clonal evolution similar to that hypothesized by Nowell (1) and Barrett et al. (2) . Genomic instability in BE is associated with widespread LOH, point mutations, alterations in microsatellite alleles (shifts), and epigenetic changes, including hypermethylation of promoter regions (2 , 20, 21, 22, 23, 24) . Furthermore, genetic lesions develop very early during progression in BE; lesions in the p16 (CDKN2A/MTS1/INK4a) tumor suppressor gene are found in >85% of Barrett’s segments (25) .
Clonal expansion of cells with genetic lesions in a tumor suppressor or oncogene increases the probability of progression because it generates a large field of cells in which the subsequent genetic lesions can arise (26, 27, 28, 29, 30). Mutations that increase the ratio of the rate of cell division to the rate of cell death in a clone will tend to cause the mutant clone to expand in the neoplasm (31, 32, 33). When a lesion has spread throughout an entire population, it is said to have “gone to fixation” because, with no competing alleles left, natural selection cannot change the frequency of that lesion in the population. A “selective sweep” is the phenomenon of natural selection driving an allele to fixation. Therefore, we can recognize selectively advantageous mutations in a neoplasm as those mutations that tend to expand throughout a neoplasm (7).
Not all of the mutations that have gone to fixation are advantageous. Neutral mutations do not affect the fitness of a cell, but they can spread to fixation through two mechanisms. The first is by genetic drift, a purely random process that occurs very slowly and with low probability relative to a selective sweep (31 , 34 , 35) . The second, more likely mechanism, is through linkage to a selectively advantageous lesion (Fig. 1) ⇓ . Such neutral mutations are called hitchhikers (36) . In any given neoplasm, it is impossible to distinguish whether a genetic lesion has expanded to fixation through selection or as a hitchhiker; however, it is unlikely that the same neutral mutation would co-occur (hitchhike) with a selectively advantageous mutation across multiple independent neoplasms. Thus, the consistent expansion of a mutation in many neoplasms is evidence for an advantageous mutation.
The evolution of a cancer during human neoplastic progression appears to require multiple advantageous genetic lesions (8 , 37, 38, 39) . However, some evolutionarily neutral genetic lesions, such as shifts in the allele sizes of microsatellite markers (i.e., new alleles), which are generally in noncoding regions, have been observed to undergo clonal expansion in BE and a large number of other human neoplasms as have low-frequency LOH events (2 , 22 , 40, 41, 42, 43, 44, 45, 46, 47) . Although it has been postulated that such expansions may be hitchhikers on advantageous lesions elsewhere in the genome, methods have not been available previously to evaluate this hypothesis or distinguish advantageous lesions from hitchhikers on sequential waves of selective sweeps (44 , 45 , 48 , 49) . Thus, analytic methods that distinguish subsequent advantageous lesions from hitchhikers after the initial selective sweep would be useful advances in studying multistage neoplastic progression. Although a second selective sweep will always occur in the background of the initial selective sweep, thus appearing to be a hitchhiker, it should expand even after the initial selective sweep has gone to fixation. Thus, on average, a second selective sweep should appear at fixation more frequently than a hitchhiker.
Here, we present a method for distinguishing selected lesions that can sweep through a BE segment from neutral lesions that may spread as hitchhikers on expansions of the advantageous mutations. This method is used to identify advantageous lesions that occur early during progression in BE. We also describe a method for distinguishing between early and later selective sweeps that may be tested in future prospective studies. It is not always possible to determine the order of events in any one particular BE segment, but by comparing data across many segments we are able to extract regularities in neoplastic progression of BE.
MATERIALS AND METHODS
Patients (Research Participants).
Patients with a diagnosis of BE (211) who had a baseline endoscopic biopsy evaluation as part of the Seattle Barrett’s Esophagus Research Program between January 5, 1995 and December 2, 1999 were eligible as defined by the presence of intestinalized metaplasia in biopsies from at least two levels of the esophagus, an informative (heterozygous) marker in at least one of the microsatellites adjacent to p16 evaluated in at least one sample, clear retention or LOH in one of those microsatellites (see “LOH Analysis” below), and no prior history of esophageal malignancy. The cohort included 37 females and 174 males with a median age 64.7 years (range, 30.5–87.3) at entry into the study. BE was typically a continuous segment and the median segment length, defined by the distance between the ora serrata and the distal end of the tubular esophagus, was 7 cm (range, 2–21 cm). The Seattle Barrett’s Esophagus Study was approved by the Human Subjects Division of the University of Washington in 1983 and renewed annually thereafter with reciprocity from the Institutional Review Board of the Fred Hutchinson Cancer Research Center from 1993 to 2001. Since 2001, the study has been approved annually by the Institutional Review Board of the Fred Hutchinson Cancer Research Center with reciprocity from the Human Subjects Division of the University of Washington.
Endoscopy and Biopsy.
Endoscopy and biopsy were performed as described previously (17) . Patients with a history of high-grade dysplasia (n = 53 patients) had biopsies at 1-cm intervals in the Barrett’s segment, whereas those without a history of high-grade dysplasia (n = 158 patients) had biopsies at 2-cm intervals resulting in a total of 782 biopsies for the 211 patients. Patients diagnosed as negative for dysplasia (n = 46 patients), indefinite for dysplasia (n = 67 patients), and low-grade dysplasia (n = 45 patients) were grouped together because: (a) observer variation studies have shown that low-grade dysplasia cannot be reliably distinguished from negative and indefinite for dysplasia in BE (50, 51, 52, 53) ; (b) prospective evaluation has revealed no significant differences in rates of progression to esophageal adenocarcinoma for negative, indefinite, and low-grade dysplasia (17) ; and (c) ∼70% of low-grade dysplasias regress to no dysplasia compared with only 2–3% that progress to cancer (54, 55, 56) .
Biopsies were processed by DNA content or Ki67/DNA content flow cytometry as described previously (17 , 19 , 57, 58, 59) . Cell cycle analysis (58 , 60) was performed on 1467 flow-purified epithelial cell fractions with a median of 6 sorted fractions per participant (range, 2–20 fractions) depending on the Barrett’s segment length. Cells from the diploid G1 (Ki67+), 4N, and aneuploid fractions were collected for genetic analysis and counted to determine cell population size. In the rare cases where nonproliferating G0 (Ki67−) cells were not separated from G1 cells, we inferred the fraction of G1/(G0+G1) cells from the median value of the cases where G0 cells were separated from G1 cells (16.7%).
DNA Extraction and Whole Genome Amplification.
DNA was extracted from the flow-purified cell populations using either standard phenol-chloroform or the Puregene DNA Isolation kit as recommended by the manufacturer (Gentra Systems, Inc., Minneapolis, MN). Whole genome amplification using primer extension preamplification was performed as described previously (57 , 61) for each sorted fraction and three constitutive controls per participant.
LOH Analysis.
LOH analysis was performed on the flow-purified fractions, as described previously (57, 62). Locus-specific primers were labeled with FAM, TET, or HEX from Research Genetics (Huntsville, AL). LOH data for chromosomes 17 and 9 were obtained from 211 patients (1389 and 1333 informative sorted fractions, respectively). Nineteen microsatellite loci were evaluated for LOH, including the 17p arm loci D17S1298 (3.87 Mb), D17S1537 (6.10 Mb), TP53-ALU (AAAAT)n in intron 1 (7.77 Mb), TP53 (CA)n (7.77 Mb), D17S786 (9.01 Mb), D17S974 (10.72 Mb), and D17S1303 (11.06 Mb); the chromosome 17q loci D17S1294 (28.53 Mb), D17S1293 (32.71 Mb), D17S1290 (56.81 Mb), and D17S1301 (73.28 Mb); the chromosome 9p loci D9S2169 (5.19 Mb), D9S935 (5.19 Mb), D9S925 (18.28 Mb), D9S932 (24.43 Mb), D9S1121 (25.39 Mb), and D9S1118 (31.92 Mb); and the chromosome 9q loci D9S301 (69.26 Mb) and D9S930 (110.62 Mb). Physical map locations were determined from the University of California Santa Cruz version hg16 July 2003 assembly. QLOH = (Tumor allele ratio)/(Normal allele ratio) was determined for each locus. QLOH values ≤0.4 or ≥2.5 were considered to be clearly indicative of LOH, as described previously (18, 57, 62). QLOH values between 0.80 and 1.25 were considered to be clearly indicative of retention of heterozygosity.
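The QLOH thresholds above can be illustrated with a minimal Python sketch. The function name and the "indeterminate" label for values that fall between the two clear ranges are assumptions made for illustration; the text states only which values count as clear LOH and clear retention.

```python
def classify_qloh(tumor_ratio: float, normal_ratio: float) -> str:
    """Classify one informative locus from its allele ratios.

    QLOH = (tumor allele ratio) / (normal allele ratio).
    The "indeterminate" label is a hypothetical name for values that
    are neither clear LOH nor clear retention.
    """
    q = tumor_ratio / normal_ratio
    if q <= 0.4 or q >= 2.5:
        return "LOH"
    if 0.80 <= q <= 1.25:
        return "retention"
    return "indeterminate"
```

For example, a QLOH of 0.3 or 3.0 would be scored as LOH, 1.0 as retention, and 0.6 as neither.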
DNA Methylation Analysis.
Genomic DNA from flow-purified Barrett’s epithelium was evaluated for p16 promoter methylation in 276 flow-purified fractions from 109 patients using a modification of the methods for bisulfite treatment and methylation-specific PCR as described previously (23 , 25 , 63) . Human genomic DNA treated in vitro with SssI methyltransferase (New England Biolabs, Beverly, MA) was used as the methylated control. In a subset of cases, promoter methylation was determined and/or verified by directly sequencing PCR products of bisulfite-treated genomic DNA as described previously (25) .
Genomic or primer extension preamplification DNA was sequenced using either BigDye or BigDyeV3 Terminator cycle sequencing (Applied Biosystems, Foster City, CA) on an ABI 377, 3730, or 3700 DNA sequencer. Wild-type sequences for each participant were confirmed using constitutive samples. All of the mutations were confirmed by at least two independent PCR and sequencing reactions, and in cases of ambiguity, by direct sequencing of genomic DNA. Evaluation of mutation of exons 5–9 of the p53 gene was performed on 766 flow-purified fractions from 176 patients using conditions described previously (64) . Mutation analysis of exon 2 of the p16 gene was performed on 1042 flow-purified fractions from 203 patients using an aliquot of genomic DNA that had undergone whole genome amplification (primer extension preamplification), as described previously (25) .
Calculating the Expanse of a Lesion.
We define the frequency of a lesion within a BE segment as the expanse of that lesion, to distinguish it from the frequency of a lesion across patients. Specifically, the expanse of a lesion is the frequency of that lesion among the cells sampled from a Barrett’s segment (range, 0–1), excluding the G0 diploid cells. The cells in G0 include the nonepithelial, stromal cells, and other cells that are no longer proliferating. Formally, the expanse Ei of a lesion in locus i for a given participant was defined as
$$E_i = \frac{1}{|B_i|} \sum_{j \in B_i} p_{ij}$$
where Bi is the set of biopsies taken from that segment for which there are data for locus i, and pij is the proportion of Ki67+ or nondiploid cells in biopsy j with a lesion in locus i. We divide by the total number of biopsies |Bi| from the participant to normalize for the size of the Barrett’s segment and derive a frequency value for a lesion between 0 and 1. In addition, we require that more than one sorted sample be informative for locus i (|Bi| >1) to improve the estimate of the expanse Ei. Although we have measured previously the number of levels over which lesions have spread (64), we have revised our definition of expanse because the analysis of evolution depends on measurements of allele frequencies in populations.
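As a sketch of this definition, the expanse is the mean of pij over the biopsies that are informative for locus i, and it is left undefined unless more than one biopsy is informative. The function name and the use of None for a biopsy without data are illustrative assumptions, not part of the original Perl code.

```python
from typing import Dict, Optional


def expanse(p_by_biopsy: Dict[str, Optional[float]]) -> Optional[float]:
    """Expanse E_i of a lesion at one locus in one Barrett's segment.

    p_by_biopsy maps a biopsy label to p_ij, the proportion of
    proliferating (Ki67+ or nondiploid) cells carrying the lesion,
    or to None when the biopsy has no data for this locus.
    """
    informative = [p for p in p_by_biopsy.values() if p is not None]
    if len(informative) <= 1:  # require |B_i| > 1
        return None
    return sum(informative) / len(informative)
```

With values of 1.0 and 0.5 in two informative biopsies and a third biopsy missing data, the expanse would be 0.75.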
Finding an Initial Selective Sweep of an Advantageous Lesion.
The algorithm (Fig. 2) ⇓ first measures the expanse of each lesion in each locus in each Barrett’s segment. LOH lesions in alternate (maternal versus paternal) alleles from different sorted fractions were considered to be different expanses, as were different sequence mutations and ploidy differences >0.2n. A histogram was constructed for each locus with a bin for each 0.1 range of expanses from 0 to 1. Fixations were defined to be lesions with expanses of 0.9–1 (90–100%) in the segment. The algorithm then infers that the lesion most frequently expanded to fixation in the cohort is the one most likely to be advantageous for a clone. It removes all of the expanses from the data that can be explained as hitchhikers on that lesion. If two different lesions have gone to fixation the same number of times in the cohort then the algorithm chooses the one that has the most partial expanses in the cohort.
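The selection step described above can be sketched as follows, assuming each lesion maps to a list of its expanses across the cohort. The function name and the choice of a closed threshold (expanse ≥ 0.9) are assumptions for illustration.

```python
FIXATION_THRESHOLD = 0.9  # assumption: expanses of 0.9 and above count as fixed


def pick_putative_advantageous(expanses_by_lesion):
    """Return the lesion fixed in the most segments.

    Ties are broken by the number of partial expanses, as described
    in the text. expanses_by_lesion maps a lesion name to a list of
    expanses, one per Barrett's segment in which it was observed.
    """
    def score(lesion):
        values = expanses_by_lesion[lesion]
        fixed = sum(1 for e in values if e >= FIXATION_THRESHOLD)
        partial = sum(1 for e in values if 0 < e < FIXATION_THRESHOLD)
        return (fixed, partial)

    return max(expanses_by_lesion, key=score)
```

Comparing (fixed, partial) tuples gives the tie-break directly, because Python orders tuples by their first element and then by their second.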
We identify a potential hitchhiker as a lesion that only occurs in cells that also potentially carry the putative advantageous lesion (Fig. 1) ⇓ . If a lesion occurs in a group of cells that do not carry the putative advantageous lesion, its presence in those cells must be independent of the effects of the putative advantageous lesion (Fig. 1) ⇓ . After identifying a putative advantageous lesion, the algorithm examines every expanse of every locus in every participant. If it finds a sorted fraction that has clear retention of heterozygosity (0.80 ≤ QLOH ≤1.25) in the putative advantageous locus but clear LOH (QLOH ≤0.4 or QLOH ≥2.50) in another locus, that lesion is defined as a nonhitchhiker. The same rule holds for sequence mutations and promoter methylation. Only nonhitchhiker lesions are retained, and all of the other expanses are removed from the data. This rule for retaining nonhitchhikers is robust to missing data.
Once the hitchhikers of the putative advantageous lesion have been removed, we repeat the analysis, looking for the next most frequently fixed lesion and filtering out its potential hitchhikers. The repetition of the analysis on the remaining data allows us to detect all of the lesions that may have independent advantageous effects early in progression. A flow chart for the algorithm is shown in Fig. 2.
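The nonhitchhiker rule can be sketched for a single participant as below. The encoding of each sorted fraction as a dictionary from locus to "lesion", "wildtype", or None (missing or ambiguous data) is a hypothetical representation chosen for illustration.

```python
def nonhitchhiker_loci(fractions, advantageous):
    """Loci whose lesions cannot be hitchhikers on `advantageous`.

    A lesion is retained only if some sorted fraction carries it
    while the putative advantageous locus is clearly wild-type
    in that same fraction. Missing data (None) never counts as
    evidence of independence.
    """
    keep = set()
    for fraction in fractions:
        if fraction.get(advantageous) != "wildtype":
            continue
        for locus, status in fraction.items():
            if locus != advantageous and status == "lesion":
                keep.add(locus)
    return keep
```

All lesions at loci outside the returned set would be removed for that participant before the next iteration selects the next most frequently fixed lesion.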
Finding a Subsequent Selective Sweep.
If there has been a second selective sweep in a BE segment, it would be filtered out of the data set as a potential hitchhiker by the algorithm in Fig. 2 ⇓ . The process of finding a second selective sweep is identical to that of finding the first selective sweep when the data are restricted to the expanses that occur within the first selective sweep. Thus, we can use the same algorithm but restrict it to the expanses that occur within the genetic background of the first sweep to search for lesions that have gone to fixation more frequently in this background than would be expected if they were neutral hitchhikers.
The Perl source code for the analysis of the data is freely available upon request from the corresponding author.
The frequency of fixation for a lesion at a locus was defined to be the number of fixations (Ei >0.9) divided by the number of patients that were informative at the locus. We followed the method of Barrett et al. (21), iterating a binomial regression to estimate the background fixation rate and excluding outliers. We used the general linear model in the statistics package R, with each locus weighted by the number of informative patients. The fixation rate for all of the loci combined was first estimated. Then we excluded loci with lesions that were fixed more frequently than would be expected by the background rate (P < 0.001), based on the binomial probability, given the background rate pb, of the observed number of fixations among the informative patients. We then repeated the estimation of the background fixation rate excluding the lesions that went to fixation significantly above the background rate and again excluded any new outliers based on the new estimated background fixation rate. This continued until no additional outliers could be identified.
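The iteration can be sketched in Python under a simplifying assumption: a pooled proportion (total fixations divided by total informative patients) stands in for the weighted binomial regression fitted in R. This matches an intercept-only binomial model, but it is not the authors' code. The function names are illustrative.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def background_rate(counts, alpha=0.001):
    """Iteratively estimate the background fixation rate.

    counts maps a locus to (fixations, informative patients).
    Loci fixed more often than expected at P < alpha are dropped
    and the rate is re-estimated until no new outliers appear.
    Returns the final rate and the sorted list of excluded loci.
    """
    background = dict(counts)
    while True:
        fixed = sum(k for k, _ in background.values())
        total = sum(n for _, n in background.values())
        rate = fixed / total
        outliers = [
            locus
            for locus, (k, n) in background.items()
            if binom_upper_tail(k, n, rate) < alpha
        ]
        if not outliers:
            return rate, sorted(set(counts) - set(background))
        for locus in outliers:
            del background[locus]
```

The loci excluded along the way are the candidates for advantageous lesions, and the final rate is the estimate of pb.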
Bootstrapping (65) was used to estimate the distribution of possible background fixation rates. We generated alternative data sets by sampling patients from the cohort 211 times, with replacement. The above method was used to estimate the background fixation rates for 1000 bootstrap data sets, and a 95% confidence interval was calculated from the results. The median value from the bootstrap results was finally used as an estimate of the true background fixation rate, and any lesions that were fixed more often than expected at P < 0.001 were then identified as potentially advantageous lesions. The same methods were used to evaluate putative hitchhiker fixations, which were defined as any lesion that went to fixation in a patient that also had a p16 mutation, p16 methylation, or p16 LOH (LOH in D9S925 or D9S932) lesion at fixation as described above (Fig. 2) ⇓ . p16 mutation and methylation lesions were only considered potential second hits if they expanded in a clone that also had a p16 LOH lesion.
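A minimal sketch of the patient-level bootstrap follows. The `estimate` callback stands for whatever procedure turns a resampled cohort into a background fixation rate. The percentile interval and the function name are assumptions, since the text does not state how the 95% interval was computed.

```python
import random
from statistics import median


def bootstrap_rate(patients, estimate, n_boot=1000, seed=0):
    """Median and 95% percentile interval of a bootstrapped estimate.

    Each bootstrap data set resamples the cohort with replacement,
    drawing as many patients as the cohort contains.
    """
    rng = random.Random(seed)
    stats = sorted(
        estimate([rng.choice(patients) for _ in patients])
        for _ in range(n_boot)
    )
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return median(stats), (lower, upper)
```

Resampling whole patients, rather than individual sorted fractions, keeps all of the fractions from one Barrett's segment together in each bootstrap data set.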
A permutation test (66) was performed on the microsatellite shift data by randomly permuting the shift data and testing if multiple shifts tended to occur in the same locus in the same patient in the permuted data as frequently as they do in the real data. The permuted data, by construction, have the same frequency of microsatellite shifts as the real data, as well as the same distribution of flow sorted fractions per patient. Thus, they represent the null hypothesis that microsatellite shifts are occurring randomly and independently in the different flow sorted fractions and do not represent clonal markers. A permutation was generated by first reassigning the observed shifts to loci in flow sorted fractions chosen randomly with uniform probability across all of the informative loci in our data set. If the same locus in a flow sorted fraction was chosen twice, then that choice was rejected and another random locus was chosen from our data set. The randomized shifts were then measured for the percentage of loci that included more than one shift within the same participant relative to the total number of loci that included at least one shift. This statistic was then compared with the observed frequency of loci with multiple shifts. There were 10,000 such permutations generated to estimate a P value.
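The permutation scheme can be sketched as below, assuming each informative measurement is a (patient, fraction, locus) tuple and at least one shift was observed. Sampling without replacement is equivalent to rejecting a slot that is chosen twice. The P value convention (proportion of permutations at least as extreme as the observed statistic) is an assumption.

```python
import random
from collections import Counter


def multi_shift_fraction(shift_slots):
    """Share of shifted (patient, locus) pairs with more than one shift."""
    per_locus = Counter((patient, locus) for patient, _fraction, locus in shift_slots)
    return sum(1 for c in per_locus.values() if c > 1) / len(per_locus)


def permutation_p(informative_slots, observed_shifts, n_perm=10_000, seed=0):
    """One-sided permutation P value for clustering of shifts.

    informative_slots is a list of every (patient, fraction, locus)
    with usable data; observed_shifts is the subset that is shifted.
    """
    rng = random.Random(seed)
    observed = multi_shift_fraction(observed_shifts)
    hits = sum(
        multi_shift_fraction(rng.sample(informative_slots, len(observed_shifts)))
        >= observed
        for _ in range(n_perm)
    )
    return hits / n_perm
```

A small P value here indicates that shifts recur at the same locus within a patient more often than independent random events would, which is the pattern expected of clonal markers.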
The association between shift patterns and p16 lesions was tested with a χ2 test of the 2 × 2 table counting the number of samples with the possible combinations of p16 lesion/wild-type and the presence or absence of a shift in a microsatellite marker.
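For a 2 × 2 table the χ2 statistic and its P value (1 degree of freedom) can be computed with the standard library alone, as in this sketch. No continuity correction is applied, which is an assumption because the text does not say whether one was used.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows could be p16 lesion / wild-type and columns shift / no shift.
    Returns (statistic, P value) with 1 degree of freedom. All row and
    column totals must be nonzero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p
```

The closed form for the P value holds because a χ2 variable with 1 degree of freedom is the square of a standard normal variable.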
We observed a variety of different relationships among expanses of p16 lesions, p53 lesions, and microsatellite shifts (Fig. 3). Clones with p16 lesions have gone to fixation more often than clones with any other lesion in the cohort (Fig. 3, A–G). In all but 1 case in our cohort, p53 lesions occur within or are coextensive with p16 lesion expanses.
LOH at D9S925, which is linked to p16, was the lesion that went to fixation in the most patients in the cohort. Thus, it was the first putative advantageous lesion identified by the algorithm (Fig. 2) ⇓ . Subsequent iterations of the algorithm identified methylation of the p16 promoter, p16 sequence mutations, and LOH at D9S932 (linked to p16) as advantageous lesions. The number of fixations for lesions in each locus, normalized by the number of patients informative for each locus, is shown in Fig. 4A ⇓ . The frequencies of partial expanses (<90% of the proliferating cells) are shown in Fig. 5A ⇓ . Fig. 4B ⇓ and Fig. 5B ⇓ show the frequencies of fixations and partial expanses after the algorithm has filtered out expanses that may be explained by p16 lesions (LOH in D9S925, p16 methylation, p16 mutations, and LOH in D9S932). Of all expanses, 97.5% (1113 of 1141) can be explained by p16 lesion expanses. The full data for the number of observed expanses of different sizes and the filtering steps of the algorithm appear in the appendix.
Lesions in p16 tend to expand to fixation more frequently (43.5% of the cohort) than any other locus evaluated (Table 1 ⇓ ; Fig. 4 ⇓ ). When we drop the restriction on the cohort that they must have an informative locus flanking p16, chromosome 9p (p16) LOH, mutation, and methylation remain the most frequently fixed lesions (data not shown), and so they are selected by the algorithm as the most likely to be advantageous among the loci analyzed. The only difference is that LOH at the D9S1118 locus (closely linked but not adjacent to p16) is chosen first as the lesion that most frequently went to fixation. This is probably because D9S1118 has the highest frequency of heterozygosity among the microsatellites on chromosome 9p in our cohort.
Only one Barrett’s segment has a p53 lesion at fixation without apparent fixation of a p16 lesion (Fig. 3H) ⇓ . In this case, an aneuploid clone with p53 lesions has reached fixation (94% of cells) in a Barrett’s segment that also contains a separate, small (6% of cells) diploid clone with p16 mutations. Three additional cases each show a p16 lesion in all of the sorted fractions that have a p53 lesion, but with 1 additional fraction having no p16 lesion and no data for p53. These 3 cases are thus categorized as p53 fixations but not as p16 fixations in Table 1 ⇓ , but there is no evidence that p53 lesions arose before p16 lesions in these cases. p16 and p53 fixations do not appear to be independent of each other (χ2 = 14.5; P < 0.001). If our observation that p16 lesions go to fixation in 43.5% of Barrett’s segments is representative of the true rate of p16 lesion fixation, the probability that 14 of 15 p53 lesion fixations would co-occur with p16 lesions by chance is <0.001 (Fisher’s exact test).
From our bootstrap analysis of the iterated binomial regression, we estimated the background (hitchhiker) rate of fixation as a binomial distribution with pb = 0.0113 (95% confidence interval, 0.0067–0.0168). Fixation of p16 methylation, p16 mutations, and 9p (p16) LOH occurred more often than would be expected based on the median bootstrap background rate of pb = 0.0113 (exact Binomial P < 0.001). Fixation of 17p (p53) LOH and p53 mutations also occurred more often than expected (exact Binomial P < 0.001; Table 2). There was no evidence that aneuploid or tetraploid (4N fraction >6%) clones tended to expand to fixation more often than expected (exact Binomial P > 0.05).
Microsatellite shifts were found at a frequency of 0.029 across all genotypes, and 50% (106 of 211) of patients had at least one shift among the 19 microsatellites evaluated. Of all microsatellite shifts, 70% (168 of 240) appeared in more than one sorted fraction from a patient. Shifts occurred in multiple samples in an individual at a significantly higher frequency than expected if the shifts were random (permutation test, P < 0.001). On the basis of samples for which we have complete data for p16 lesions, 94% (17 of 18) of patients have shift expanses that can be explained as hitchhikers on the expansion of p16 lesions (e.g., Fig. 3, D and E); the exception is the patient in Fig. 3H. Shift patterns were detected in association with a p16 lesion more often than would be expected at random (χ2 = 14.6; 1 degree of freedom; P < 0.001, including samples with incomplete p16 data).
Expanses removed by filtering (illustrated in Figs. 4 ⇓ and 5 ⇓ ) may be analyzed to detect lesions that have frequently gone to fixation in tandem with or in the genetic background of a p16 lesion. The algorithm identifies p53 sequence mutations as potential lesions associated with a later selective sweep. There were 9 cases in which a p53 mutation spread to fixation in the genetic background of a p16 lesion. After further filtering of expanses that may be hitchhikers on this potential late p53 mutant selective sweep, few expanses remain (see Appendix).
Among the expanses that were identified as potential hitchhikers on p16 lesions, bootstrap estimates of the binomial regression gave a background fixation rate of pb = 0.0122 (95% confidence interval, 0.0076–0.0174). This is the estimate of the probability that a lesion will hitchhike on the first selective sweep. Given the median background fixation estimate of pb = 0.0122 among the potential hitchhikers, fixations of p16 mutations, p16 methylation, p53 mutations, and 17p (p53) LOH occurred more often than would be expected (Table 3 ⇓ ; exact Binomial P < 0.01).
There is evidence that both p16+/− and p16−/− clones expand. We have 4 assays for lesions in p16, including LOH at the flanking microsatellite markers D9S932 and D9S925, sequence mutations, and methylation of the promoter. Using these 4 assays, 91 patients have a p16 lesion that has swept to fixation. Ten patients represent clear examples of p16−/− clones that have gone to fixation. Fifty-eight patients are missing data for the second allele throughout the segment although a p16 lesion in the other allele has reached fixation. In the remaining 23 patients, the p16 LOH lesion that has gone to fixation has not acquired a lesion in the second allele (p16+/−) in at least some of the sorted fractions. The rest of the Barrett’s segment in these 23 patients is either p16−/− or p16+/−; missing data makes it impossible to say if the entire segment is purely p16+/− or if an additional lesion has generated a p16−/− subclone.
A central task for evolutionary biology is to distinguish the regularities of selected advantageous genetic lesions from the historical accidents of selectively neutral lesions. By the time a neoplasm evolves malignancy, it typically has acquired a large number of genetic lesions (67, 68). Stoler et al. (67) estimate an average of >10⁴ lesions in a colorectal carcinoma cell. Allelotype studies in esophageal adenocarcinomas have detected LOH on virtually every chromosome arm (21, 22, 46, 47). How could all of these lesions undergo clonal expansion? We hypothesize that the vast majority of genetic lesions detected in cancers have little, if any, selective advantage that promotes progression to malignancy but are instead historical accidents, hitchhikers on selectively advantageous lesions that arose during progression. Because we take multiple biopsies from each segment, our algorithm effectively carries out a clonal ordering analysis (2, 13, 16) on each of the 211 Barrett’s segments. In BE, we report that p16 lesions frequently spread to fixation early in progression, and p53 lesions expand only in the background of p16 lesion expanses, with a single exception. In cases that have both p16 and p53 lesions at fixation, clonal ordering cannot determine whether one preceded the other in two selective sweeps or both lesions developed as early events in a single sweep. However, fixation of a p53 lesion does not appear to be independent of p16 lesion fixations (χ2 P < 0.001). Our data suggest that both p53 lesions and second hits in p16 are advantageous and drive selective sweeps because they occur more frequently than expected by chance. All of the other lesions we have evaluated can be explained as hitchhikers on the selective sweeps of p16 lesions. These results may be tested by future prospective investigations that track expansion of clones in patients with BE to determine the temporal dynamics of clonal evolution.
There appears to be strong selection for p16 lesions in BE because p16 lesions go to fixation more frequently than any other locus we have assessed, and previous studies have reported p16 lesions in up to 90% of Barrett’s segments of all histological grades (25, 69). Both p16+/− and p16−/− genotypes seem to be selected, and second hits in p16 go to fixation more frequently than would be expected if they were hitchhikers (P < 0.01), suggesting two selective sweeps, as illustrated in Fig. 6. Our results are consistent with other studies showing expansion of p16 lesions in other neoplastic conditions (49, 70, 71). If BE patients included subpopulations characterized by different genetic pathways to cancer, we would have observed hitchhiking expanses that could not be explained by p16 lesions. The algorithm would also have identified other selectively advantageous lesions on chromosomes 9 and 17 after it had filtered out the expanses explainable by p16 lesions. The fact that clones with p16 methylation are found at fixation, analogous to clones with p16 LOH and sequence mutations, suggests that in BE p16 methylation is acting as a clonal marker and that p14ARF is not the target of early alterations in BE. p14ARF is not hypermethylated in most cases and 91% (30 of 33) of cases with p16 point mutations left p14ARF unaltered or caused a conservative amino acid change. Although it is possible that p16 lesions are themselves hitchhikers on an as yet unidentified advantageous lesion elsewhere in the genome, this hypothesis is not well supported by the data. If p16 lesions are hitchhikers on another advantageous lesion, then either different p16 lesions would have to be so ubiquitous as a mosaic in a Barrett’s segment that the advantageous lesion nearly always arose in a p16 lesion background or the advantageous lesion nearly always acquired a p16 lesion early in progression. It seems unlikely that the p16 locus would be highly susceptible to unselected background lesions generated by three different mechanisms (mutation, methylation, and LOH). Our data suggest that during BE progression loss of each of the two p16 alleles predisposes to a selective sweep. The prediction that p16−/− clones will expand in the background of p16 heterozygous (+/−) clones can be tested in prospective studies.
Others have noted apparent selection of what appear to be neutral lesions during human neoplastic progression and have speculated that these may occur in the background of a selectively advantageous mutation (44, 45, 48, 49). However, methods have not been available previously to distinguish advantageous from neutral mutations. Evolutionarily neutral lesions arising in a p16-deficient clone will tend to spread throughout the segment as hitchhikers on the p16 selective sweep. The most obvious cases of such hitchhiking are the large expanses of microsatellite shifts. A shift may expand to fixation if it occurs before a selective sweep (e.g., Fig. 3D). Alternatively, it may occur during the sweep and so appear as a partial expanse (e.g., Fig. 3E). It is impossible to exclude the possibility that expanses of some lesions may be generated by hitchhiking on a selective sweep driven by an unmeasured lesion elsewhere in the genome. Only ∼90% of BE segments have p16 lesions, and a lesion in another as yet unidentified locus or pathway may drive selective sweeps in the remaining 10% of the cases. For example, we found a single patient with a microsatellite shift but no p16 lesions in the same sorted fraction. This is the same patient with the only p53 lesion expanse that cannot be explained by a p16 lesion. However, our data show that the vast majority of expanses in our cohort can be explained as hitchhikers on p16-mediated sweeps, without hypothesizing an unmeasured selective sweep.
p53 lesions appear to be selectively advantageous later in BE progression in the background of p16 lesions. We commonly observe p16 lesion expansions alone (e.g., Fig. 3, A–E), p53 lesion expansions within p16 lesion expansions (e.g., Fig. 3F), and the combination of both p16 lesions and p53 lesions at fixation (e.g., Fig. 3G). However, we observed p53 lesion expansion alone in only a single case, and no p16 expansions within p53 lesion expansions (Figs. 4 and 5). In the single exception with p53 expansion in a wild-type p16 background (Fig. 3H), both alleles were inactivated (p53−/−), and it is possible that, rarely, two p53 lesions or a dominant-negative mutation can develop and confer a selective advantage without a p16-mediated clonal expansion. The p53 mutation observed in the exceptional case was a 10-bp deletion that created a frameshift truncation and has not been reported previously in the p53 mutation databases (72). Alternatively, lesions elsewhere in the genome may be advantageous for clonal expansion in the 10% of Barrett’s segments without p16 lesions, and this may be an example of a less common pathway to cancer that bypassed the p16 tumor suppressor gene. Due to the stochastic nature of evolution and the potential for modifier and other selective loci, it is likely that progression in some individuals will not conform to the predictions of the model. However, we may then focus on those patients to discover alternative pathways in neoplastic progression.
One explanation of our results is that there is little or no selective pressure to lose p53 until later in progression when telomere shortening or other mechanisms of DNA damage lead to p53-mediated cell cycle arrest or apoptosis (73). Alternatively, the primary phenotype of loss of p53 in BE may be genomic instability (74, 75, 76). If this is correct, then there would typically be a delay between the loss of p53 and the generation of a selectively advantageous lesion that would carry the p53 lesion to fixation as a hitchhiker. A third hypothesis is that loss of p53 may be a selectively neutral event caused by a decrease in the length of the short telomere on the p arm of chromosome 17 (77) where p53 resides, but it is hard to imagine that loss of p53 could be entirely neutral given its roles in cell cycle control and apoptosis (73). Our data are consistent with a later selective sweep driven by p53 loss as seen in 15 cases of sweeps of p53 lesions (mutations and LOH). These sweeps occur more frequently than would be expected if they were hitchhikers (P < 0.001). One hypothesis that has been proposed for such observations is that later genetic lesions, such as those in p53, can only reach fixation in the background of an earlier, “gatekeeper” lesion, such as those in p16 (78). For example, a p53 lesion may be selectively advantageous within the stem cell compartment of a Barrett’s crypt but may not be able to spread and establish other mutant crypts in the absence of p16 lesions. In this way, clones with p53 lesions would remain too small for us to detect until they acquired a p16 lesion, and we would not observe them early in progression. In BE, prospective studies should be able to distinguish between cases where a clone with a p53 lesion is actively expanding, remaining a constant size, or perhaps even contracting. For example, we predict that clones with both p16 and p53 lesions will tend to expand in competition with p53 wild-type clones. In contrast, we predict that p53 mutant clones will not typically expand in a p16 wild-type genetic background.
BE is defined by intestinal metaplasia in the esophagus (15). However, BE also fulfills all of the criteria for the medical definition of a benign neoplasm (79) because it typically consists of large, clonal populations (25), is hyperproliferative relative to normal esophageal and gastric epithelium (58, 80, 81), and is progressive (2). The evolution of clones that we report here is also consistent with neoplasia, and recognition of BE as a benign neoplasm may improve our understanding of the basis for its precancerous potential. The study of evolution requires measurement of changes in allele frequencies. We did not assess the relationship between genotypes and dysplastic phenotypes because the two were generated on different samples that could not be directly compared quantitatively, although the relationships among p16, p53, and cytometric abnormalities and dysplasia have been published previously by us and others (2, 17, 18, 25, 69, 82, 83, 84, 85, 86).
Here, we describe a method by which advantageous mutations may be distinguished from hitchhikers during multistep human neoplastic progression. Our present analyses extend our previous work by systematically measuring regularities in the data across 211 research patients without cancer, 782 biopsies, and 1467 flow-sorted populations. This method of analysis can be generalized to many other neoplasms in which clonal fields can be evaluated spatially, including head and neck, bladder, and lung cancer among others. In fact, these techniques can be applied to any neoplastic condition in which relatively pure clones can be isolated, their expanses can be measured, and many neoplasms can be assessed. We achieve this in BE through acquisition of multiple biopsies per segment and flow cytometric purification of cells in the biopsies. If only one biopsy is available per neoplasm, it is still possible to make some inferences of advantageous lesions by identifying those lesions that are detected in more neoplasms than one would expect if the lesions were evolutionarily neutral (87, 88). Evaluation of precancerous neoplasms, rather than the more genetically abnormal cancers, can reduce the number of hitchhiking neutral mutations that must be filtered to identify advantageous lesions. Bladder cancer is another excellent example, and previous studies have assayed multiple samples from resection specimens for genome-wide LOH analysis, identifying 33 regions associated with progression and cancer (89, 90, 91). These analyses, which have extensive spatial and genomic mapping, show clonal expanses with LOH in chromosomes 9p and 17p and are consistent with our results. We have emphasized analysis of a large number of patients to facilitate the distinction between advantageous and hitchhiker lesions.
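The single-biopsy inference mentioned above can be sketched as a one-sided binomial test: given an assumed probability that a neutral lesion is detected in any one neoplasm, how surprising is the observed number of neoplasms carrying the lesion? The neutral rate, the counts, and the function name `binomial_upper_tail` are all hypothetical placeholders, and the cited approaches (87, 88) may estimate the neutral expectation quite differently.

```python
import math


def binomial_upper_tail(k, n, q):
    """Return P(X >= k) for X ~ Binomial(n, q)."""
    return sum(
        math.comb(n, i) * q**i * (1 - q) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical inputs for illustration only (NOT the study's data).
neutral_rate = 0.05   # assumed chance a neutral lesion is seen in a neoplasm
n_neoplasms = 40      # assumed number of neoplasms, one biopsy each
n_with_lesion = 8     # assumed number of neoplasms carrying the lesion

p_value = binomial_upper_tail(n_with_lesion, n_neoplasms, neutral_rate)
print(f"P(at least {n_with_lesion} of {n_neoplasms} by chance) = {p_value:.2g}")
```

The result depends heavily on the assumed neutral rate, which is why sampling many neoplasms, and preferably precancerous ones with fewer hitchhikers, makes the distinction easier.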
In summary, we have shown that p16 lesions predispose to as many as two selective sweeps within a Barrett’s segment as each allele is inactivated. We have also found evidence for subsequent sweeps by clones that have acquired p53 lesions in a p16-deficient genetic background. These observations imply that competition exists among different clones within the Barrett’s segment, and our data thus provide direct evidence for the concept articulated more than 25 years ago that evolution drives human neoplastic progression (Fig. 6; Ref. 1). The predictions of our model may be investigated in longitudinal studies to determine how frequently the regularities we inferred from these cross-sectional data also occur temporally.
We use a filtering algorithm to identify lesions that may be selectively advantageous for clones in BE and then remove expanses of other lesions that may be explained as hitchhikers on those advantageous lesions. In four iterations, the algorithm picks out LOH at both flanking microsatellites of p16 (D9S925 and D9S932), methylation of the p16 promoter, and p16 sequence mutations as advantageous lesions. The first locus identified as a potentially advantageous lesion is LOH in D9S925, which is the most closely linked microsatellite on the telomeric side of p16. This locus was found to reach fixation most frequently, as shown in the “Unfiltered” column of ⇓ . Once expanses in the other loci that could be explained as hitchhikers had been filtered out, as shown in the “D9S925” column of ⇓ , p16 promoter methylation was the next most common lesion assayed at fixation. After filtering out expanses that can be explained by p16 methylation as shown in the “p16 meth” column, the next most common lesion reaching fixation was LOH at D9S932, the most closely linked microsatellite on the centromeric side of p16. After filtering on LOH at D9S932, p16 sequence mutations (“p16mut”) were the most common lesions reaching fixation, as shown in the remaining columns of ⇓ .
We exclude any loci from the filtering that were identified as loci with advantageous lesions in previous rounds of filtering. Thus, the rows of data for loci identified in the column headers of ⇓ and ⇓ do not change in subsequent columns.
The expanses filtered out of ⇓ as potential hitchhikers may be analyzed with the same algorithm to search for lesions that frequently go to fixation in the genetic background of a p16 lesion ⇓ . The algorithm identifies p53 mutations as a potentially advantageous lesion in clones that also have a p16 lesion. The last column of ⇓ shows the expanses within p16 expanses that cannot be explained as hitchhikers on a later selective sweep of a p53 mutant clone.
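The filtering steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the expanse of a lesion is approximated as the fraction of a patient's samples carrying it (the study measures the proportion of proliferating cells), an expanse is treated as explainable by an advantageous lesion when every sample carrying it also carries that lesion, and the loop stops when no remaining locus is at fixation in any patient. All function names, locus labels, and the toy data are hypothetical.

```python
from collections import Counter

FIXATION = 0.9  # an expanse >90% is treated as fixation


def fixation_counts(data, n_samples, skip):
    """Count, per locus, the patients in whom its lesion is at fixation."""
    counts = Counter()
    for patient, lesions in data.items():
        for locus, samples in lesions.items():
            if locus not in skip and len(samples) / n_samples[patient] > FIXATION:
                counts[locus] += 1
    return counts


def filter_hitchhikers(data, driver, protected):
    """Split expanses into those kept and those explainable by `driver`.

    Assumption: an expanse is explainable when every sample carrying it
    also carries the driver lesion (set containment). Loci in `protected`
    (lesions already called advantageous) are never filtered.
    """
    kept, removed = {}, {}
    for patient, lesions in data.items():
        driver_samples = lesions.get(driver, set())
        kept[patient], removed[patient] = {}, {}
        for locus, samples in lesions.items():
            explainable = locus not in protected and samples <= driver_samples
            (removed if explainable else kept)[patient][locus] = samples
    return kept, removed


def find_advantageous(data, n_samples, max_rounds=10):
    """Repeatedly pick the locus most often at fixation, then filter."""
    drivers = []
    hitchhikers = {patient: {} for patient in data}
    for _ in range(max_rounds):
        counts = fixation_counts(data, n_samples, skip=set(drivers))
        if not counts:
            break
        # Most frequent locus at fixation; ties broken alphabetically.
        driver = min(counts, key=lambda locus: (-counts[locus], locus))
        drivers.append(driver)
        data, removed = filter_hitchhikers(data, driver, protected=set(drivers))
        for patient, lesions in removed.items():
            hitchhikers[patient].update(lesions)
    return drivers, data, hitchhikers


# Toy data (hypothetical): patient -> locus -> indices of samples with a lesion.
n_samples = {"P1": 4, "P2": 4, "P3": 2, "P4": 3, "P5": 2}
toy = {
    "P1": {"p16_LOH": {0, 1, 2, 3}, "msat_shift": {0, 1, 2, 3}},
    "P2": {"p16_LOH": {0, 1, 2, 3}, "p53_mut": {2, 3}},
    "P3": {"p16_meth": {0, 1}, "msat_shift": {0, 1}},
    "P4": {"p16_meth": {0, 1, 2}},
    "P5": {"p16_LOH": {0, 1}},
}
drivers, remaining, hitchhikers = find_advantageous(toy, n_samples)
print(drivers)
```

In this toy run the microsatellite shifts and the p53 mutation are filtered out as explainable by a p16 lesion. Re-analyzing the expanses collected in `hitchhikers`, with fixation measured relative to the p16 lesion background rather than the whole segment, would correspond to the second pass described above; that change of denominator is left out of this sketch.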
Grant support: NIH P01 CA91955, NIH R01 CA61202, NIH K01 CA89267-02, NIH K07 CA89147-03, NSF ANIR-9986555, ONR N00014-99-1-0417, DARPA AGR F30602-00-2-0584, the Intel Corporation, and the Santa Fe Institute.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Requests for reprints: Carlo Maley, Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109. Phone: (206) 667-4615; Fax: (206) 667-6132; E-mail:
3 Internet address: http://genome.cse.ucsc.edu.
4 Internet address: http://www.r-project.org.
5 T. G. Paulson, G. Longton, P. C. Galipeau, C. C. Maley, L. J. Prevo, H. Kissel, P. L. Blount, and B. J. Reid. p16 alterations allow evasion of the tumor suppressor effects of intestinal epithelium by clonal spread in Barrett’s esophagus, manuscript in preparation.
- Received October 16, 2003.
- Revision received February 19, 2004.
- Accepted March 1, 2004.
- ©2004 American Association for Cancer Research.