Integration of human papillomavirus (HPV) DNA into the host genome can be a driver mutation in cervical carcinoma. Identifying HPV integration sites at base resolution has been a longstanding technical challenge, largely because HPV DNA in episomal or concatenated forms masks integrated copies and reduces detection sensitivity. This study aimed to improve understanding of the precise localization of HPV integration sites using an innovative strategy. Using HPV capture technology combined with next-generation sequencing, we investigated HPV prevalence and the exact integration sites of HPV DNA in 47 primary cervical cancer samples and 2 cell lines. A total of 117 unique HPV integration sites were identified, comprising HPV16 (n = 101), HPV18 (n = 7), and HPV58 (n = 9). The HPV16 integration sites were broadly distributed across the whole viral genome. In addition, both single and multiple integration events occurred frequently for HPV16, ranging from 1 to 19 per sample. The viral integration sites were distributed across almost all human chromosomes, with the exception of chromosome 22. All cervical cancer cases harboring more than four HPV16 integration sites had a clinical diagnosis of stage III carcinoma. Nucleotides shared between the human genome and the HPV genome at integration breakpoints were significantly enriched, indicating that such sequence overlap may play an important role in the HPV integration process. These results expand on previous findings on HPV16 and HPV18 integration sites and allow a better understanding of the molecular basis of the pathogenesis of cervical carcinoma.
Keywords: HPV capture; HPV integration; cervical carcinoma; next generation sequencing.