
[Cancer Research 63, 7094-7097, November 1, 2003]
© 2003 American Association for Cancer Research
Power Law Distribution of Chromosome Aberrations in Cancer¹
Attila Frigyesi, David Gisselsson, Felix Mitelman, and Mattias Höglund²
Centre for Mathematical Sciences, Mathematical Statistics, Lund University [A. F.], and Department of Clinical Genetics, University Hospital, Lund [D. G., F. M., M. H.], Sweden
ABSTRACT
Cancer cells are characterized by having aberrant chromosomes. The number of aberrations and the specific chromosomes affected are correlated with tumor progression. We show that for breast, colorectal, and renal cell carcinomas the distribution of the number of such aberrations per tumor follows a power law distribution with an exponent close to unity. We present two stochastic models that in simulation experiments result in power law distributions of the number of changes per tumor. The first model is based on a multiplicative fluctuation process and the second on a preferential attachment principle linked to an observation process, i.e., a tumor detection and treatment process. Because almost identical power law distributions are seen in breast, colorectal, and renal cell carcinomas, we suggest that the obtained distributions are consequences of a common mechanism operating in malignant epithelial tumors.
Introduction
A hallmark of cancer development is the accumulation of genetic lesions. Some of these may be seen in the form of chromosomal aberrations when the cells enter mitosis. During the past decades a wealth of information on chromosomal changes in cancer cells has accumulated,³ and based on the collected data two major categories of tumors may be discerned: those with simple and those with complex karyotypes. The former category is dominated by leukemias, the latter by carcinomas. Carcinomas also show a high degree of variability, both regarding the number of changes per tumor and the spectrum of recurrent imbalances (1). Only rarely are balanced primary, disease-specific aberrations seen. An acquired chromosomal instability has been suggested to generate the complex characteristics of these karyotypes (2). In recent investigations (3, 4, 5) we have used several statistical tools (6) to analyze the patterns of chromosomal changes in carcinomas. By these methods we were able to identify several general characteristics of carcinoma karyotypes, and it was found that the distributions of the number of imbalances per tumor follow monotonically decreasing functions, indicating that a common process causes most levels of karyotypic complexity. In these investigations we limited the analysis to chromosomal imbalances present above a given frequency. Although this strategy simplified the interpretations, it also affects the shape of the distributions by excluding cases with many changes, because the maximum number of imbalances present in a case is limited by the number of imbalances included in the investigation. To eliminate this inadequacy we retrieved all karyotypes of BCs,⁴ CCs, and RCCs from the Mitelman Database of Chromosome Aberrations in Cancer³ and scored the NAPT by counting the number of entries in each karyotype. We show that the distribution of the number of chromosome aberrations per tumor in all three of the tumor types follows a power law distribution with an exponent close to unity, suggesting that the obtained distributions are generated by a common mechanism.
Materials and Methods
Data, Distribution Fitting, and Simulations.
The NAPT was estimated by scoring the number of entries in each karyotype. Although some entries may represent more than one event, e.g., three-way translocations, such cases are rare, and the NAPT measure gives a good estimate of the number of changes present in each tumor. The data set included 667 BC karyotypes with a total of 4,564 aberrations, 533 CC karyotypes with a total of 3,533 aberrations, and 673 RCC karyotypes with a total of 4,574 aberrations. Lognormal, geometric, exponential, simple power law with the probability density function (p.d.f.) $a \times x^{-\alpha}$, and the Zipf-Mandelbrot distribution with the p.d.f. $a \times (x+b)^{-\alpha}$ were fitted to the BC, CC, and RCC data, respectively. Maximum likelihood estimators were calculated for the geometric, exponential, and lognormal distributions, whereas the power law and the Zipf-Mandelbrot distributions were fitted in a least squares sense. The multiplicative fluctuation process was simulated for 100,000 tumors using the model $\mathrm{NAPT}(t+1) = \lambda \times \mathrm{NAPT}(t)$ with $\lambda \sim \mathrm{Exp}(0.1) + 1$, i.e., the number of aberrations at generation $t+1$ is equal to the number of aberrations at generation $t$ multiplied by a random factor greater than 1 whose excess over 1 follows an exponentially decaying distribution. The preferential attachment process with an observational process was simulated for 100,000 tumors with $P[\mathrm{NAPT}(t+1) = \mathrm{NAPT}(t)+1] = \mathrm{NAPT}(t)/30$, i.e., a tumor progressing from generation $t$ to generation $t+1$ acquires an additional aberration with a probability directly proportional to the number of aberrations at generation $t$. An exponentially decreasing number of processes were observed and killed with each generation, simulating the clinical course of tumor detection, surgical removal, and cytogenetic analysis. The mean of this exponential process was 400.
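The multiplicative update rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the text does not say how many generations each tumor was followed, so the sketch stops each tumor after a number of generations drawn uniformly between 0 and a hypothetical `max_generations`. It also assumes that the 0.1 in Exp(0.1) is the mean of the exponential term (it could equally be the rate), and that every tumor starts at NAPT = 1.

```python
import random


def simulate_multiplicative(n_tumors, max_generations, mean_excess=0.1,
                            start=1.0, seed=0):
    """Return the final NAPT of each simulated tumor.

    At every generation NAPT is multiplied by 1 + E, where E is
    exponentially distributed, so the factor is always greater than 1.
    """
    rng = random.Random(seed)
    # Assumption: 0.1 is read as the mean of the exponential term.
    rate = 1.0 / mean_excess
    finals = []
    for _ in range(n_tumors):
        # Assumption: the stopping generation is uniform on 0..max_generations.
        n_generations = rng.randint(0, max_generations)
        napt = start
        for _ in range(n_generations):
            napt *= 1.0 + rng.expovariate(rate)
        finals.append(napt)
    return finals


if __name__ == "__main__":
    values = simulate_multiplicative(n_tumors=10_000, max_generations=60)
    print(min(values), max(values))
```

Plotting a histogram of `values` on log-log axes is one way to inspect whether the tail looks approximately linear under these assumptions.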
Temporal Analysis.
To obtain a value for the time of appearance of a chromosomal change, all of the tumors with the given change were selected and the distribution of the number of changes per tumor plotted. An aberration that is seen frequently in low-complexity karyotypes, and, thus, would be defined as early in the karyotypic evolution, will produce distributions with peak frequencies at low values of the number of changes per tumor, whereas changes occurring late in the karyotypic evolution would produce peak frequencies at higher values. Because these frequency distributions often are skewed, the mean is not a good estimate of the TO. Instead the peak values (the modes) of these distributions were used as the TO (6). TO is, thus, a function of karyotypic complexity. To estimate how tightly a given aberration is restricted to the estimated TO value, a measure of TO variability is needed. To obtain such a measure, the selected distributions were resampled with replacement (bootstrapped) 1000 times and the mode scored after each resampling. The 25th and 75th percentiles of the bootstrapped modal values were then used as a measure of TO variability.
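The bootstrap procedure for TO variability can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, ties between equally frequent values are broken by taking the smallest one, and the percentiles are computed with the default method of `statistics.quantiles`, none of which is specified in the text.

```python
import random
import statistics
from collections import Counter


def mode_smallest(values):
    """Most frequent value; ties are broken by taking the smallest (assumption)."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def bootstrap_to(napt_values, n_boot=1000, seed=0):
    """Return (TO, 25th percentile, 75th percentile) of the bootstrapped modes.

    napt_values: number of changes per tumor for all tumors carrying
    the aberration of interest.
    """
    rng = random.Random(seed)
    modes = []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = rng.choices(napt_values, k=len(napt_values))
        modes.append(mode_smallest(resample))
    q1, _median, q3 = statistics.quantiles(modes, n=4)
    return mode_smallest(napt_values), q1, q3
```

A narrow interval between the two percentiles would indicate an aberration with a distinct temporal position; a wide interval would indicate one without.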
Results and Discussion
The relative frequencies of the NAPT values are plotted for each tumor type in Fig. 1. Because the obtained distributions may be described by monotonically decreasing functions with heavy tails, the data were tested for fit to an exponential distribution, a geometric distribution, a simple power law distribution, a generalized power law distribution (the Zipf-Mandelbrot distribution), and a lognormal distribution, respectively. The observed data showed the best fit to the simple power law and the Zipf-Mandelbrot distributions. The estimated $\alpha$ values were 1.25, 1.05, and 0.86 for BC, CC, and RCC, respectively, and the corresponding SDs 0.25, 0.20, and 0.26. Hence, assuming normality, the estimated values for $\alpha$ did not differ significantly. The BC, CC, and RCC data sets were subsequently pooled and the obtained NAPT distribution used to produce a better estimate of $\alpha$. The pooled data showed the best fit to a simple power law distribution with $\alpha = 1.05$. These results strongly suggest that the distribution of the number of acquired changes in the carcinomas studied follows power law statistics, $P(\mathrm{NAPT}) \propto \mathrm{NAPT}^{-\alpha}$, where $\alpha$ is close to unity.
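Two steps in this paragraph lend themselves to a small worked example: fitting a simple power law by least squares, and comparing two exponent estimates under a normality assumption. The sketch below makes two assumptions that go beyond the text: the least-squares fit is done as a straight line on log-log axes (the text says only "in a least squares sense"), and the comparison treats the reported SDs as standard errors of independent estimates in a two-sample z-score. The function names are hypothetical.

```python
import math


def fit_power_law(xs, freqs):
    """Least-squares line through (log x, log f).

    Returns (a, alpha) for the model f = a * x**(-alpha).
    Points with non-positive x or f are skipped (log undefined).
    """
    pts = [(math.log(x), math.log(f)) for x, f in zip(xs, freqs)
           if x > 0 and f > 0]
    n = len(pts)
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pts)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_u
    return math.exp(intercept), -slope


def z_score(est1, sd1, est2, sd2):
    """z-score for the difference of two independent, normal estimates."""
    return (est1 - est2) / math.sqrt(sd1 ** 2 + sd2 ** 2)


if __name__ == "__main__":
    # Largest reported difference: BC (1.25, SD 0.25) vs RCC (0.86, SD 0.26).
    print(round(z_score(1.25, 0.25, 0.86, 0.26), 2))
```

Under these assumptions the BC versus RCC comparison gives a z-score of about 1.08, below the conventional two-sided 5% threshold of 1.96, which is consistent with the statement that the estimates did not differ significantly.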

Fig. 1. The distributions of the NAPT in (A) breast carcinomas, (B) colorectal carcinomas, (C) renal cell carcinomas, and (D) in the pooled data set. The insets show the respective data in log-log plots.
Many seemingly unrelated phenomena show power law distributions, e.g., the size distributions of earthquakes measured by the Gutenberg-Richter scale (7), species extinctions in geological time (8), and commercial firm sizes (9). To explain some of these phenomena, Bak et al. (10) put forward the hypothesis of SOC. According to this idea, complex systems evolve toward a state of maintained criticality at which they are able to propagate perturbations on all of the possible length and size scales, often described as avalanches. The idea that complex karyotypes are caused by a perturbation of a system that has reached a state of SOC through the accumulation of mutations is attractive from a tumorigenetic point of view, with the initiating perturbation causing the avalanche/genome rearrangements being a crucial genetic alteration, e.g., telomere dysfunction (11). An alternative explanation may be that cells, and evolving systems in general, are prone to show HOT (12). In this setting, systems are tuned, through selection, to highly structured and efficiently operating states within a given environment. This is partly achieved by generating barriers that reduce the effects of cascading failures caused by perturbations encountered frequently during the evolutionary process. However, rare disturbances may lead to dramatic consequences. The results of such failures show power law distributions with exponents close to unity (12). In this context, the complex karyotype would be caused by a cascading failure in a system showing HOT. However, the avalanche dynamics of the systems usually associated with SOC and HOT are rapid compared with the driving processes. That is, the system dynamics are slow but will eventually result in a state where a rare single event may, at a single time, cause a large effect, i.e., a large number of aberrations. This makes the suggested models poorly compatible with the current notion of tumor development; tumor progression rather implicates growth and the successive acquisition of changes that together form a functional pathogenic system (13, 14).
On the other hand, a variety of stochastic growth processes have been shown to converge to power law distributions (15, 16, 17). One such process is multiplicative fluctuations (18). In this setting each karyotype passes through a series of updating steps (tumor progression steps) at which the NAPT changes by a fraction ($\lambda$) of its value, i.e., $\mathrm{NAPT}(t+1) = \lambda \times \mathrm{NAPT}(t)$, where $\lambda$ is a positive definite random variable larger than one. In logarithmic space this equates to $\log[\mathrm{NAPT}(t+1)] = \log(\lambda) + \log[\mathrm{NAPT}(t)]$. This implies that for $t \to \infty$ the distribution of $\log[\mathrm{NAPT}]$ approaches a uniform distribution, and transforming back to linear space gives $P[\log(\mathrm{NAPT})]\,d[\log(\mathrm{NAPT})] = C\,\mathrm{NAPT}^{-1}\,d\mathrm{NAPT}$, where $C$ represents the normalization factor. This gives $P(\mathrm{NAPT}) \propto \mathrm{NAPT}^{-1}$ for the distribution of NAPT. Simulations from such a multiplicative fluctuation model do indeed show power law behavior with $\alpha$ close to unity (Fig. 2A).
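The change of variables behind the last step can be written out explicitly. In the sketch below, $X$ stands for NAPT and $Y = \log X$; the only input is the statement in the text that $Y$ becomes uniformly distributed, taken here to mean a constant density $C$ over its range. The symbols $X$ and $Y$ are introduced for illustration only.

```latex
\begin{aligned}
P_Y(y)\,dy &= C\,dy, \qquad y = \log x, \quad dy = \frac{dx}{x},\\
P_X(x)\,dx &= P_Y(\log x)\,\left|\frac{dy}{dx}\right|\,dx = \frac{C}{x}\,dx,\\
P_X(x) &\propto x^{-1}.
\end{aligned}
```

The factor $1/x$ comes entirely from the Jacobian of the logarithm, which is why a flat distribution in logarithmic space corresponds to an exponent of exactly one in linear space.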
An alternative process is exponential or geometrical growth that is observed in a stochastic fashion (19). One such geometrical growth process is preferential attachment (20). In the present context preferential attachment would mean that the probability to acquire a chromosomal aberration increases with the number of aberrations already present. In conjunction with tumor detection (observation), this process is expected to produce a power law distribution (19). Indeed, simulations using a linear increase in probability to acquire a new aberration and an exponentially distributed observation process do produce power law distributions (Fig. 2B).
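A minimal Python sketch of preferential attachment combined with an observation process is given below. It uses the acquisition probability NAPT(t)/30 and the mean of 400 stated in Materials and Methods, but several details are assumptions made only for illustration: each tumor starts with one aberration, the probability is capped at 1 once NAPT reaches 30, and the mean of 400 is interpreted as the mean number of generations before a tumor is observed and removed. The function names are hypothetical.

```python
import random


def grow(n_generations, rng, scale=30, start=1):
    """Follow one tumor for n_generations and return its NAPT.

    At each generation one aberration is added with probability
    NAPT / scale (capped at 1, an assumption for NAPT >= scale).
    """
    napt = start
    for _ in range(n_generations):
        if rng.random() < min(1.0, napt / scale):
            napt += 1
    return napt


def simulate_observed(n_tumors, mean_observation=400.0, scale=30, seed=0):
    """NAPT of each tumor at its (exponentially distributed) observation time."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_tumors):
        # Assumption: observation time in generations ~ Exp(mean 400), truncated to an integer.
        t_obs = int(rng.expovariate(1.0 / mean_observation))
        results.append(grow(t_obs, rng, scale))
    return results


if __name__ == "__main__":
    napts = simulate_observed(n_tumors=2_000)
    print(min(napts), max(napts))
```

The design choice worth noting is the separation of the growth rule (`grow`) from the observation rule (`simulate_observed`), which makes it easy to try other readings of the observation process without touching the attachment step.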
There are at least two possible interpretations of preferential attachment in the context of karyotype evolution. One interpretation implies that the frequency by which the tumor cell is presented with new aberrations increases with the number of changes already present but that the probability to accept a given aberration is constant, and, thus, entails an increasing genomic instability. Biologically this interpretation would correspond to a mutator phenotype (21) with an increasing mutation rate during tumor progression. The second interpretation implies that the frequency by which the cell is presented with new aberrations is constant, but that the probability that a given aberration is accepted as a part of the karyotype increases with the number of changes already present, i.e., that an increasing number of alternative changes may be accepted as a part of the karyotype. This interpretation does not necessarily include the notion of a mutator phenotype (22) but rather an evolving and an increasingly permissive tumor microenvironment.
The two alternatives of preferential attachment both include the concept of a stepwise addition of chromosomal changes. Given this process, it would be possible to determine whether a specified aberration is early or late in the karyotypic evolution by investigating its frequency in karyotypes with different numbers of aberrations. In Fig. 3A we have investigated the lateness of the loss of chromosome 9 (-9) in nonpapillary RCC by producing the distribution of the number of changes per tumor of the cases with this imbalance. We have previously used the modal values of such distributions as an approximation of the TO of that particular change (3, 4, 5, 6). By estimating the TO for all of the changes seen in a tumor type it is possible to model the temporal order by which the aberrations are acquired (3, 6). To obtain more reliable estimates of TO, the original distributions were resampled 1000 times with replacement and the TO calculated after each resampling. The 25th and 75th percentiles of the bootstrapped TO values were then used to determine at which karyotypic progression steps a change is most likely to occur (Fig. 3B). The temporal orders established in this way have been shown to correlate well with histopathological data and are, thus, biologically relevant (4, 5). TO values that overlap in Fig. 3B signify that the respective changes may occur at the same progression steps. By examining the number of alternative changes that may occur at each step in nonpapillary RCC (Fig. 3B) it can be concluded that the increase from one to six abnormalities involves 0, 1, 3, 5, 9, and 10 alternative changes at each subsequent step. In Fig. 3C these data have been plotted together with data obtained in a similar way for the cytogenetic pathway in papillary RCC (3), the major pathway operating in BC (4), and the two cytogenetic pathways operating in CC (5). It may be seen that an increasing number of alternative changes may occur at each subsequent step, supporting the concept that the frequency by which the cell is challenged with a new aberration is constant, but that the criteria for keeping any given aberration become less stringent when the number of imbalances increases.
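Counting how many changes may occur at a given progression step amounts to counting overlapping TO intervals. The Python sketch below shows one way to do this. The exact counting convention used for Fig. 3B is not spelled out in the text, so the rule here (an aberration is a candidate at a step if the step lies within its 25th to 75th percentile interval, endpoints included) is an assumption, and the interval values in the usage example are invented placeholders, not data from the figure.

```python
def alternatives_per_step(to_intervals, max_step):
    """Count, for each progression step, the aberrations whose TO interval covers it.

    to_intervals: dict mapping an aberration label to (low, high), the
    25th and 75th percentiles of its bootstrapped TO values.
    """
    counts = {}
    for step in range(1, max_step + 1):
        counts[step] = sum(
            1 for low, high in to_intervals.values() if low <= step <= high
        )
    return counts


if __name__ == "__main__":
    # Hypothetical intervals for three placeholder aberrations.
    example = {"change_A": (1, 2), "change_B": (2, 4), "change_C": (4, 4)}
    print(alternatives_per_step(example, max_step=4))
```

With real TO intervals, a count that rises with the step number would correspond to the increasing number of alternative changes described in the text.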

Fig. 3. Temporal analyses. A, the distribution of the number of changes per tumor in nonpapillary RCCs that contain the imbalance -9. The change -9 was chosen because it occurs approximately halfway in the karyotypic evolution in nonpapillary RCC.⁵ From the distribution it may be seen that -9 is rarely seen in tumors with fewer than 4 changes, and -9 is, thus, not an early change. The modal value, and, thus, TO, is in this case 6. B, the temporal analysis of the chromosomal changes seen in the major cytogenetic pathway in nonpapillary RCC (3). The 25th and 75th percentiles are given on the X-axis, the imbalances on the Y-axis. The given imbalance occurs preferentially at tumor progression steps in the region between the 25th and 75th percentile. As the 25th-75th percentile interval for 3p- is very small and the TO values very low, 3p- is a distinct early event in the RCC karyotypic evolution; similarly, 1p- is a distinct late event, whereas -21 does not have a distinct temporal position in the karyotypic evolution. Also, notice that the TO intervals overlap for many of the late imbalances. C, the number of alternative imbalances at each updating step in BC (4), in the two cytogenetic pathways in RCC, the papillary (RCC1) and the nonpapillary (RCC2) pathways (3), and the two cytogenetic pathways in CC, the hyperdiploid (CC1) and the hypodiploid (CC2) pathways (5). Mean, the mean number of alternative changes calculated for each tumor progression step.
The presented results point to an important aspect of carcinogenesis, the power law distribution of the number of chromosomal changes per tumor. We suggest that the process generating this distribution is based on a common mechanism as it may be seen in more than one tumor type. Furthermore, we have presented two simple stochastic models, with biologically plausible interpretations, that may account for the described scale-free behavior of the number of acquired chromosome changes in cancer.
ACKNOWLEDGMENTS
We thank Prof. Nils Mandahl for valuable comments on this article.
FOOTNOTES
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 Supported by the Swedish Cancer Society, the Crafoord Foundation, and the Nilsson Family Foundation. 
2 To whom requests for reprints should be addressed, at Department of Clinical Genetics, University Hospital, SE-221 85 Lund, Sweden. Phone: 46-26-173739; Fax: 46-46-131061; E-mail: mattias.hoglund{at}klingen.lu.se 
3 Internet address: http://cgap.nci.nih.gov/Chromosomes/Mitelman. 
4 The abbreviations used are: BC, breast cancer; CC, colorectal cancer; RCC, renal cell cancer; NAPT, number of aberrations per tumor; TO, time of occurrence; SOC, self-organized criticality; HOT, highly optimized tolerance. 
5 M. Höglund, unpublished observations. 
Received 7/3/03.
Revised 8/27/03.
Accepted 9/4/03.
REFERENCES
- Mertens F., Johansson B., Höglund M., Mitelman F. Chromosomal imbalance maps of malignant solid tumors: a cytogenetic survey of 3185 neoplasms. Cancer Res., 57: 2765-2780, 1997.
- Lengauer C., Kinzler K. W., Vogelstein B. Genetic instability in colorectal cancers. Nature (Lond.), 396: 643-649, 1998.
- Höglund M., Gisselsson D., Mandahl N., Johansson B., Mertens F., Mitelman F., Säll T. Multivariate analyses of genomic imbalances in solid tumors reveal distinct and converging pathways of karyotypic evolution. Genes Chromosomes Cancer, 31: 156-171, 2001.
- Höglund M., Gisselsson D., Hansen G. B., Säll T., Mitelman F. Multivariate analysis of chromosomal imbalances in breast cancer delineates cytogenetic pathways and reveals complex relationships among imbalances. Cancer Res., 62: 2675-2680, 2002.
- Höglund M., Gisselsson D., Hansen G. B., Säll T., Mitelman F., Nilbert M. Dissecting karyotypic patterns in colorectal tumors: Two distinct but overlapping pathways in the adenoma-carcinoma transition. Cancer Res., 62: 5939-5946, 2002.
- Höglund M., Gisselsson D., Säll T., Mitelman F. Coping with complexity: Multivariate analysis of tumor karyotypes. Cancer Genet. Cytogenet., 135: 103-109, 2002.
- Gutenberg B., Richter C. F. Frequency of earthquakes in California. Bull. Seismol. Soc. Am., 34: 185-188, 1944.
- Raup D. M. Biological extinction in earth history. Science (Wash. DC), 231: 1528, 1986.
- Axtell R. L. Zipf distribution of U. S. firm sizes. Science (Wash. DC), 293: 1818-1820, 2001.
- Bak P., Tang C., Wiesenfeld K. Self-organized criticality. Phys. Rev. A, 38: 364-374, 1988.
- Maser R. S., DePinho R. A. Connecting chromosomes, crisis, and cancer. Science (Wash. DC), 297: 565-569, 2002.
- Carlson J. M., Doyle J. Highly optimized tolerance: A mechanism for power laws in designed systems. Phys. Rev. E, 60: 1412-1427, 1999.
- Fearon E., Vogelstein B. A genetic model for colorectal tumorigenesis. Cell, 61: 759-767, 1990.
- Cahill D. P., Kinzler K. W., Vogelstein B., Lengauer C. Genetic instability and darwinian selection in tumours. Trends Cell. Biol., 12: M57-M60, 1999.
- Niyogi P., Berwick R. C. A note on Zipf's Law, natural languages, and noncoding DNA, http://arxiv.org/PS_cache/cmp-lg/pdf/9503/9503012.pdf, 1995.
- Simon H. On a class of skew probability distributions. Biometrika, 42: 425-440, 1955.
- Marsili M., Zhang Y. C. Interacting individuals leading to Zipf's Law. Phys. Rev. Lett., 80: 2741-2744, 1998.
- Pietronero L., Tosatti E., Tosatti V., Vespignani A. Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf. Physica A, 293: 297-304, 2001.
- Reed W., Hughes B. From gene families and genera to incomes and Internet file sizes: why power laws are so common in nature. Phys. Rev. E Stat. Nonlin. Soft Matter Phys., 66: (62) 067103, 2002.
- Albert R., Barabasi A. L. Statistical mechanics of complex networks. Rev. Mod. Phys., 74: 47-97, 2002.
- Loeb L. A., Loeb K. R., Anderson J. P. Multiple mutations and cancer. Proc. Natl. Acad. Sci. USA, 100: 776-781, 2003.
- Tomlinson I., Bodmer W. Selection, the mutation rate and cancer: ensuring that the tail does not wag the dog. Nat. Med., 5: 11-12, 1999.