Cancers of the anogenital tract as well as some head and neck cancers are caused by persistent infections with high-risk type human papillomaviruses (HPVs). Two viral oncogenes, E6 and E7, induce severe chromosomal instability associated with centrosome aberrations, anaphase bridges, chromosome lagging, and breaking. This occurs early in preneoplastic lesions, when the viral genome still persists in an episomal state. In most invasive cancers and also in a few high-grade dysplastic lesions, however, integration of high-risk HPV genomes into the host genome is observed. Integration seems to be a direct consequence of chromosomal instability and an important molecular event in the progression of preneoplastic lesions. Disruption or deregulation of defined critical cellular gene functions by insertional mutagenesis by integrated HPV genome fragments has been hypothesized as one major promoting factor in the pathogenesis of HPV-associated cancers. This hypothesis was based on the detection of HPV integration events in the area of tumor-relevant genes in few cases. Here, we reviewed >190 reported integration loci with respect to changes in the viral structure and the targeted genomic locus. This analysis confirms that HPV integration sites are randomly distributed over the whole genome with a clear predilection for genomic fragile sites. No evidence for targeted disruption or functional alteration of critical cellular genes by the integrated viral sequences could be found.
Persistent infections with high-risk (HR) human papillomaviruses (HPVs), e.g., HPV-16, HPV-18, HPV-31, HPV-33, and HPV-45, have been identified as an essential although not sufficient factor in the pathogenesis of anogenital and other epithelial carcinomas (1). HR-HPV genomes encode two proteins, E6 and E7, that interfere with important cellular control mechanisms of the cell cycle, apoptosis, and maintenance of chromosomal stability. The effects of E6 and E7 on p53 and pRB as well as on many other cellular proteins have been extensively investigated, and significant alterations of cell cycle regulation could be attributed to the biochemical interaction of the two viral oncoproteins with their respective cellular binding partners (2, 3). Moreover, recent studies demonstrated that the two viral oncoproteins cooperatively disturb the mechanisms of chromosome duplication and segregation during mitosis and thereby induce severe chromosomal instability (4).
HR-HPV genomes replicate as episomal molecules in the normal viral life cycle. Although the HPV genome is consistently retained in an episomal state in early dysplastic low-grade lesions, the whole viral genome or fragments thereof are covalently integrated into the chromosomal DNA of the host cell in some advanced HPV-associated precancers and in the majority of HR-HPV-associated carcinomas (5, 6, 7, 8). These observations suggest that integration of viral genes in severe dysplastic lesions strongly enhances neoplastic progression to invasive carcinomas. A possible reason for the progression toward malignant lesions after HR-HPV integration might be structural changes of the viral genome that allow enhanced and deregulated expression of the viral oncogenes and thereby confer additional neoplastic selective pressure. In addition, it has been speculated that critical cellular genes are affected by integration of viral genome fragments and that interference of viral sequences with critical cellular sequences contributes essentially to the enhanced risk of progression of HPV-induced preneoplasia into neoplastic lesions (9, 10, 11, 12).
It was shown that HPV E6- and E7-encoding cDNAs derived from integrated viral oncogene transcripts confer a much stronger transforming capacity in primary cells than cDNAs derived from episomal transcripts. This was attributed to the longer half-life of transcripts derived from integrated HPV DNA, mediated by 3′-cellular sequences of the fusion transcripts (13). The relative expression levels of the viral oncogenes and their corresponding gene products appear to be directly influenced by the sequence context of individual integration sites. In addition, cis-acting regulatory sequences were shown to exert a strong influence on the expression level and regulation of the integrated viral oncogenes (14). Additional work demonstrated that in specific cervical cancer cell lines only one or a few integrated genomes are transcribed, whereas many others within the same cells are transcriptionally silenced (15). In contrast, clinical samples harbor only a few integration sites, the majority of which are actively transcribed (16). Taken together, these observations suggest that integration of the viral genome renders viral gene expression independent of viral control mechanisms and allows selection of cell clones with deregulated viral oncogene expression, favoring the outgrowth of neoplastic cell clones. Thus, the current evidence clearly points to an important impact of cellular sequences on the integrated viral genomes; however, it remains unclear whether the influence of viral sequences on defined cellular genes similarly contributes to the progression of HPV-induced dysplasia.
Various murine, avian, and feline retroviruses can transform cells by affecting the regulation and/or disrupting the structure of tumor-suppressing or oncogenic cellular genes (transforming retroviruses; Ref. 17). This process of insertional mutagenesis is random, usually affects many different genomic loci, and is highly inefficient with regard to the transformation efficacy of single infection events, features that seem to be shared by oncogenic HPVs. In some cases, HPV integration has been found to occur in or close to potentially tumor-relevant genes, especially within or close to the MYC gene locus (9, 10, 18).
Schwarz et al. (19) identified an integrated genome copy of HPV-68 in the cervical carcinoma cell line ME180. The viral genome disrupts one allele of a novel tumor suppressor gene, APM-1. The nonaffected allele was lost in these cells, suggesting that lack of APM-1 function contributed to the pathogenesis of this particular cancer cell clone. In a recent report, Ferber et al. (11) described three cervical carcinoma cases in which integration was observed in the area of the telomerase gene. Strong up-regulation of hTERT expression was observed in one of these samples.
Many different assays have been applied to analyze genomic HPV integration sites. In situ hybridization using HPV-specific probes has given a rough estimate of the distribution of integrated HPV genomes in cell lines. Although an accumulation of integrated HPV genomes was observed at a few loci, a general integration hot spot could not be identified (12, 20, 21, 22). Several PCR-based protocols have been developed that enable the analysis of HPV integration in clinical samples on a larger scale. In direct methods for monitoring integrated HPV genome copies, the HPV sequences are coamplified together with flanking cellular sequences using either enzyme digestion and adaptor ligation [detection of integrated papillomavirus sequences (DIPS) by ligation-mediated PCR (23)] or religation followed by inverse PCR (24). In another protocol, fusion regions are amplified using HPV-specific primers and primers that bind to distinct restriction enzyme recognition sites [restriction-PCR (25)]. An additional method is referred to as the amplification of papillomavirus oncogene transcripts (APOT) assay (7). Here, a modified 3′-rapid amplification of cDNA ends PCR using upstream HPV E7-specific and downstream oligo(dT) adaptor primers is applied to amplify HPV E7-specific transcripts derived from either integrated or episomal viral genome copies.
Up to now, in total, 192 individual HPV integration sites have been described in primary tumor samples and cell lines. Here, we summarize all available data on chromosomal HPV integration sites. The data suggest that integration of HR-HPV genomes occurs relatively late in the progression of high-grade cervical dysplasia. It appears that integration of HR-HPV genomes is a consequence of an overall destabilization process of the chromosomal integrity in replicating epithelial stem cells that express the viral E6-E7 oncogenes. The consequences of the structural alterations of the viral genome and the impact of cellular sequences on its transcriptional regulation seem to be more important than functional alteration of specific cellular genes by the integrated viral sequences.
MATERIALS AND METHODS
To collect data on chromosomal loci that are affected by integration of HR-HPV genomes, an extensive PubMed search was performed. All articles were included that presented data on either chromosomal localizations or exact nucleotide sequences of integration sites or viral-cellular fusion regions. Several larger clinical studies examined only the HPV integration status and provided neither locus nor sequence information (5, 6, 7, 8). These studies were not included in the analysis.
Methods used to detect chromosomal loci hit by integration of HR-HPV genomes range from fluorescence in situ hybridization and genomic and RNA library techniques to PCR-based amplification of viral-cellular genomic fusions or fusion transcripts. Only integration events that could be clearly mapped to a specific human sequence were included in the study, thereby omitting many integration events into repetitive genomic areas.
Database Analysis of Integration Sites.
Where sequence data were available, a BLASTN (26) database comparison of cellular sequences with the most recent update of the human genome sequence was performed (footnote 1). All mapping data were updated when necessary. Several integration sites were reassigned to chromosomal bands different from those initially described.
Integration Database Internet Resource.
RESULTS AND DISCUSSION
Overview of Studies and Samples.
In total, 25 studies were included in the analysis, covering 192 individual integration events (Table 1). Eight studies used fluorescence in situ hybridization to map integration sites; the remaining 17 studies used different PCR-based protocols to generate sequence information of the respective loci. Twenty-one integration sites were derived from cell lines and 171 from clinical samples. The majority of integration events (157) have been described for cervical lesions, mainly cervical carcinomas, but also for CIN3 (9) and CIN2 (1) lesions. Six published integration sites were discovered in vulvar lesions, 4 in vaginal lesions, and 1 in a penile carcinoma.
Furthermore, HPV integration loci of three head and neck as well as tonsillar cancer samples have been published. The prevailing HPV type in the studies is HPV-16 with 119 integrations, followed by HPV-18 (64), HPV-45 (3), HPV-33 (2), and HPV-6a, HPV-1, HPV-67, and HPV-68 (one each). However, these numbers do not reflect the real distribution of HR-HPV in the respective lesions because most PCR-based methods have only been established for HPV-16 and HPV-18.
Twenty-three integration sites were mapped with in situ hybridization, and 169 localizations were derived from PCR-based methods providing sequence information; some of these samples have additionally been analyzed with fluorescence in situ hybridization techniques. The most frequently used PCR-based methods were the amplification of papillomavirus oncogene transcripts (APOT) assay (57), the restriction PCR method (47), and the DIPS assay (32), together accounting for 136 mapped integration sites. Apart from that, PCR techniques involving enzyme digestion and religation (inverse PCR), ALU-PCR, and randomly primed PCR, as well as genomic and mRNA library techniques, were used on a small number of samples (Table 2).
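The tallies reported in this overview can be cross-checked with a few lines of Python. The following is only a minimal sketch for the reader, not part of the original analysis; the variable and function names are illustrative, and the numbers are copied from the text above.

```python
# Counts copied from the overview text; all names are illustrative only.
TOTAL_SITES = 192

hpv_type_counts = {
    "HPV-16": 119, "HPV-18": 64, "HPV-45": 3, "HPV-33": 2,
    "HPV-6a": 1, "HPV-1": 1, "HPV-67": 1, "HPV-68": 1,
}
sample_source_counts = {"cell lines": 21, "clinical samples": 171}
mapping_method_counts = {"in situ hybridization": 23, "PCR-based": 169}
major_pcr_assays = {"APOT": 57, "restriction PCR": 47, "DIPS": 32}


def sums_to(counts, expected):
    """Return True if the values of a count table add up to `expected`."""
    return sum(counts.values()) == expected


# Each breakdown should partition the same 192 integration events.
totals_ok = all(
    sums_to(table, TOTAL_SITES)
    for table in (hpv_type_counts, sample_source_counts, mapping_method_counts)
)

# The three most frequently used PCR-based assays taken together.
major_assay_total = sum(major_pcr_assays.values())
```

Under these figures, each of the three breakdowns (by HPV type, by sample source, and by mapping method) adds up to the same total of 192 events.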
Sequences at Integration Sites.
Several studies have looked at the exact integration site and analyzed the transition sequence between the viral and cellular genomes. With respect to the nucleotide sequence, all integration sites are different. No specific cellular sequence motif has been observed, nor have recurrent integrations in a specific area occurred at the same nucleotides. Likewise, there is no constant disruption site in the viral genome, and the transitions from viral to cellular sequences can be found anywhere from early E1 to the late genes. Homologous recombination involving larger areas of similar nucleotide sequences does not seem to play a role in HPV integration. Geisen et al. (27) have described a human sequence on chromosome 7p13 that shows a mild degree of similarity to HPV E5, but it has thus far not been reported to be an HPV integration target.
A sequence analysis of the fusion region between the viral and the cellular genome was possible in 40 cases (Table 2). Seven integrated HPV genomes were characterized on both fusion sides; for the remaining 33, only sequences from either the 5′- or the 3′-fusion region were available. In 27 cases, short overlapping sequences of between one and six nucleotides could be found. Six samples showed a direct transition from viral to cellular sequence, and in seven other cases, filler sequences were found at the fusion site that derived neither from viral nor from cellular sequences at the respective locus. Short identities in the fusion region seem to facilitate integration. In some cases, major chromosomal changes must have occurred upon integration, probably involving DNA loops that lead to the transfer of distant sequences to the integration site.
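The three junction types described above (short overlap, direct transition, filler sequence) can be illustrated with a small Python sketch. This is a hypothetical illustration, not the procedure used in the reviewed studies. It assumes that the fusion read starts in viral sequence and ends in cellular sequence, that `viral_ref` is the viral reference beginning at the read start, and that `cell_ref` is the cellular reference ending at the read end. The function names and the toy sequences are invented for the example.

```python
def common_prefix_len(a, b):
    """Length of the longest shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(read, viral_ref, cell_ref):
    """Classify a viral-cellular fusion read (hypothetical helper).

    Assumes the read starts aligned with viral_ref and ends aligned
    with cell_ref. Returns a (junction type, length in nt) tuple.
    """
    viral_len = common_prefix_len(read, viral_ref)            # matches virus from the left
    cell_len = common_prefix_len(read[::-1], cell_ref[::-1])  # matches host from the right
    gap = len(read) - viral_len - cell_len
    if gap < 0:
        return ("overlap", -gap)   # bases shared by both references
    if gap == 0:
        return ("direct", 0)       # viral sequence abuts cellular sequence
    return ("filler", gap)         # bases matching neither reference


# Toy sequences, invented for illustration only.
overlap_case = classify_junction("ACGTACGGATTTCCA", "ACGTACGGACCCCCC", "TTTTTTGGATTTCCA")
direct_case = classify_junction("ACGTACTTTCCA", "ACGTACGGGGGG", "AAAAAATTTCCA")
filler_case = classify_junction("ACGTACGGTTTCCA", "ACGTACTTTTTTTT", "AAAAAAAATTTCCA")
```

In the first toy case the three bases GGA are compatible with both references, which corresponds to the short overlapping sequences of one to six nucleotides reported for 27 of the 40 junctions.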
Deregulation of Cellular Genes by HPV Integration.
In several cases, HPV integration has occurred in or close to known genes, most frequently in intronic regions. Although coding regions are only rarely hit by HPV, gene expression and mRNA structure can be severely altered by insertion of the strong HPV promoter as well as additional splice donor and acceptor sites located on the HPV genome. The expression analyses of integrated HPV DNA have shown that transcribed and coding regions of genes are frequently cotranscribed with HPV E6 and E7 oncogenes.
Some of the genes disrupted by HPV integration are known to be involved in tumor development in other cancer entities, e.g., MYC, TP63, NR4A2, APM-1, FANCC, TNFAIP2, and hTERT. However, only a few examples exist where a direct link between HPV integration and gene alteration was shown by functional data. In the case of the cell line ME180, HPV-68 integration was found in a novel tumor suppressor gene, APM-1 (19). It could be shown that the unaffected allele was lost in that cell line and that APM-1 levels were reduced as compared with other cell lines. Transfection of HeLa and Caski cell lines with APM-1 led to reduced growth rates in colony-forming assays. Repeated integration in the area of a specific tumor-relevant gene is rare; accumulation of integrated HPVs has been found in the greater area of the MYC locus; apart from that, integration in or close to FANCC, hTERT, and CEACAM5 has been described in two or three independent samples from different studies (10, 11, 18).
The highest number of integration events was observed at 8q24, a large chromosome band that covers ∼30 Mb and, among others, harbors the MYC gene. 8q24 integration was observed in 12 clinical samples and the cervical cancer cell line HeLa (9, 10, 18). The integration sites are distributed over >500 kb around the MYC gene. Thus far, MYC expression analyses of the clinical samples that showed HPV integration in the MYC area have not been published. For HeLa, increased MYC-RNA levels were demonstrated by Northern blotting (9). We have previously isolated a fusion transcript from HeLa encompassing viral and cellular sequences derived from the 5′-noncoding MYC region (18). However, in contrast to Burkitt’s lymphoma, where structural aberrations of the MYC locus were clearly shown to be associated with the induction and maintenance of a malignant phenotype, this has not been demonstrated for HPV-induced carcinogenesis.
For FANCC, a transcript was identified in a clinical sample that showed FANCC exon 6 fused to the HPV E6/E7 sequences (18). However, there is no functional evidence that integration at this locus has a major impact on the transformation process. One of three cases with integration in the hTERT locus indeed showed strong up-regulation of hTERT transcription (11). However, given the frequently observed telomerase activation in cervical cancer independent of the integration status, one cannot exclude a coincidence of telomerase activation and HPV integration at that locus.
Taken together, this comprehensive set of data does not support the hypothesis that targeted modification of critical cellular genes plays a major role in the progression of HPV-induced preneoplasia. In contrast to the well-documented impact of E6 and E7 expression on HPV-induced transformation (28, 29, 30, 31, 32, 33), it has not been shown in a single case that the malignant phenotype of cells relies on the potentially critical changes induced by HPV integration.
Integration of HPV DNA in Fragile Sites.
Fragile sites are genomic regions prone to chromosome breaks that facilitate foreign DNA integration. Although some specific sequence motifs were identified for rare fragile sites, common fragile sites do not seem to be linked to a specific genomic sequence and span very large genomic areas (34).
Looking at a larger scale, there seems to be an equal distribution of HPV integration sites in the human genome. All chromosomes were found to harbor integrated HPV genomes at various chromosomal bandings; however, several weakly preferred chromosomal regions were recognized, including 1q21, 2q22, 2q33, 3p21, 3p14, 3q28, 4q21, 5p15, 6p24, 8q24, 9q34, 12q13, 13q21-22, 14q24, and 17q23. They all encompass known fragile sites except for 3p21, 4q21, 5p15, 6p24, and 9q34.
Some studies have directly visualized the coincidence of fragile sites and HPV integration sites (10, 25, 35). In other studies, exact chromosomal localizations were compared with mapped fragile sites in the database (18). Here, we reanalyzed all published loci for mapped fragile sites. A limitation of this approach is the rather imprecise mapping of fragile sites. Taking all data together, there is a high correlation between fragile sites and HPV integration sites. In 38% of the 192 integration sites, fragile sites are hit by HPV integration, and the number is probably much higher because some studies did not provide sufficient sequence information and not all of the fragile sites have been mapped thus far. Ten of 15 regions with at least three independent HPV integration events harbor known fragile sites, including the frequently targeted MYC locus (8q24). Matzner et al. (36) have analyzed integration of vector DNA containing a multidrug resistance gene in a breast cancer cell line under chemotherapy treatment. Here, cell clones grow out after random integration because an external gene confers the drug resistance. A significant overlap of multidrug resistance integration sites, fragile sites, and the clustered HPV integration sites (1p36, 1q21, 6q21, 9q34, 12q13, and 13q22) reviewed here was observed. In total, 62 of the 192 HPV integration loci correlated with the multidrug resistance integration loci described by Matzner et al. (36).
In conclusion, the clustering of HPV integration sites seems to be related to the accessibility of these fragile genomic areas rather than to a selection of clones that harbor integrated HPV in regions with tumor-relevant genes.
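The proportions quoted in this section can be reproduced from the region lists given in the text. The sketch below is illustrative only: the names are invented, the band labels are copied from the list of weakly preferred regions, and the number of sites corresponding to 38% is a rounded estimate rather than a figure reported in the source.

```python
# Weakly preferred chromosomal regions, copied from the text.
preferred_regions = [
    "1q21", "2q22", "2q33", "3p21", "3p14", "3q28", "4q21", "5p15",
    "6p24", "8q24", "9q34", "12q13", "13q21-22", "14q24", "17q23",
]
# Regions stated in the text to lack a known fragile site.
without_known_fragile_site = {"3p21", "4q21", "5p15", "6p24", "9q34"}

with_fragile_site = [
    region for region in preferred_regions
    if region not in without_known_fragile_site
]

TOTAL_SITES = 192

# 38% of 192 sites; rounded estimate, not a count given in the source.
approx_fragile_hits = round(0.38 * TOTAL_SITES)

# Share of HPV loci that correlate with the multidrug resistance loci (62 of 192).
mdr_overlap_fraction = 62 / TOTAL_SITES
```

With these lists, 10 of the 15 preferred regions encompass a known fragile site, and the 62 loci that correlate with the multidrug resistance integration loci correspond to roughly one third of all sites.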
Role of Integration in HPV-Mediated Transformation.
The progression of HPV-induced lesions toward cancer reflects a classical selection scenario in which certain events lead to the clonal outgrowth of single cells in a heterogeneous cell population (Fig. 1). Deregulated expression of the HPV E6 and E7 genes in epithelial stem cells leads to major chromosomal instability in the respective cells at early stages of dysplasia. This instability becomes manifest in numerical centrosome aberrations, anaphase bridges, and chromosome breaks that over time result in aneuploidy (4, 37). Hopman et al. (38) have analyzed DNA ploidy and HPV integration by fluorescence in situ hybridization in a number of clinical samples and observed a high correlation of aneuploid cells with integrated HPV genomes in high-grade dysplastic lesions. Obviously, HPV integration is facilitated by repair processes activated in these chromosomally unstable cells. The association of DNA repair and viral integration has also been described for retroviruses (39). Recently, it has been shown in a large series of clinical samples that DNA aneuploidy clearly precedes HPV integration in the progression of HPV-associated cervical precancers (footnote 3), supporting the concept that integration occurs as a result of chromosomal repair mechanisms. Accordingly, HPV integration is most frequently observed in unstable areas of the genome that are also targeted by integration of foreign DNA molecules in other scenarios. Along with integration of HPV genomes or fragments thereof, which presumably occurs in parallel in multiple cell clones, selection processes seem to be initiated that finally result in preferred outgrowth of only one or a few cell clones with optimized expression of the HR-HPV oncogenes (16). Finally, a malignant cell clone might emerge that accounts for the majority of the evolving tumor mass and that can also be found in local recurrences and distant metastases (footnote 4).
Therefore, the detection of HPV integration points to progressing lesions and might be useful in various clinical applications: it can be a valuable individual tumor and recurrence marker. Detection of specific integration sites in biopsies, e.g., from lymph nodes or potential distant metastases, can be used as a tumor staging tool. Posttreatment detection of residual cells with the same integration patterns as the primary tumor indicates residual disease and might significantly influence therapeutic decision making.
Requests for reprints: Magnus von Knebel Doeberitz, Institute of Molecular Pathology, Department of Pathology, Im Neuenheimer Feld 220, Heidelberg D-69120, Germany. Phone: 49-6221-56-28-76; Fax: 49-6221-56-59-81; E-mail:
1 Internet address: http://www.ncbi.nlm.nih.gov/BLAST/.
2 Internet address: http://www.med.uni-heidelberg.de/patho/pathomol/AG_onkogene_Papillomvirusinfektionen.html.
3 P. Melsheimer. DNA aneuploidy precedes integration of HPV16 E6/E7 oncogenes in intraepithelial neoplasia and invasive squamous cell carcinoma of the cervix uteri, submitted for publication.
4 S. Vinokurova. Clonal composition of HR-HPV induced high grade cervical dysplasia and invasive carcinomas, manuscript in preparation.
- Received January 4, 2004.
- Revision received March 4, 2004.
- Accepted March 9, 2004.
- ©2004 American Association for Cancer Research.