A virome-wide clonal integration analysis platform for discovering cancer viral etiology

Genome Res. 2019 May;29(5):819-830. doi: 10.1101/gr.242529.118. Epub 2019 Mar 14.

Abstract

Oncoviral infection is responsible for 12%-15% of cancers in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform that uses whole-genome sequencing (WGS) data to identify viral integrations that are derived from any characterized virus and shared by a large proportion of tumor cells. Its sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching for clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
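The idea of an integration "shared by a large proportion of tumor cells" can be illustrated with a small calculation. The sketch below is not VIcaller's implementation, and the abstract does not describe its clonality test. It assumes read counts at a candidate virus-human junction are already available, that the locus is diploid, and that the integration sits on one allele. The class `IntegrationCandidate`, the function names, the thresholds, and the example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IntegrationCandidate:
    """Read support at one candidate virus-human junction (hypothetical schema)."""
    virus: str
    chrom: str
    position: int
    chimeric_reads: int   # reads supporting the virus-human junction
    reference_reads: int  # reads spanning the site with no junction


def integration_allele_fraction(c: IntegrationCandidate) -> float:
    """Fraction of reads at the site that support the integration."""
    total = c.chimeric_reads + c.reference_reads
    if total == 0:
        return 0.0
    return c.chimeric_reads / total


def estimated_cell_fraction(c: IntegrationCandidate, purity: float = 1.0) -> float:
    """Rough fraction of tumor cells carrying the integration.

    Assumption (not from the source): diploid locus, integration on one
    allele, so cell fraction is about 2 * allele fraction / tumor purity.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return min(1.0, 2.0 * integration_allele_fraction(c) / purity)


def is_clonal(c: IntegrationCandidate,
              min_cell_fraction: float = 0.5,
              min_chimeric_reads: int = 3,
              purity: float = 1.0) -> bool:
    """Flag a candidate as clonal under illustrative, arbitrary thresholds."""
    if c.chimeric_reads < min_chimeric_reads:
        return False
    return estimated_cell_fraction(c, purity) >= min_cell_fraction


# Made-up counts and coordinates for illustration only; not data from the paper.
well_supported = IntegrationCandidate("HBV", "chr5", 1_000_000, 20, 30)
weakly_supported = IntegrationCandidate("HBV", "chr5", 2_000_000, 2, 98)

for cand in (well_supported, weakly_supported):
    print(cand.virus, cand.chrom, cand.position,
          round(estimated_cell_fraction(cand), 2), is_clonal(cand))
```

A real pipeline would also need to account for copy number changes, mapping bias at junction reads, and contamination, none of which this sketch models.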

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • BK Virus / genetics
  • BK Virus / pathogenicity
  • Carcinogenesis / genetics
  • Cell Transformation, Neoplastic
  • DNA, Viral
  • DNA-Binding Proteins / genetics
  • Hepatitis B virus / genetics
  • Hepatitis B virus / pathogenicity
  • Herpesvirus 4, Human / genetics
  • Herpesvirus 4, Human / pathogenicity
  • Histone-Lysine N-Methyltransferase
  • Humans
  • Liver Neoplasms / genetics
  • Liver Neoplasms / virology
  • Lymphoma, Non-Hodgkin / genetics
  • Lymphoma, Non-Hodgkin / virology
  • Neoplasms / genetics
  • Neoplasms / virology*
  • Papillomaviridae / genetics
  • Papillomaviridae / pathogenicity
  • Software
  • Urinary Bladder Neoplasms / genetics
  • Urinary Bladder Neoplasms / virology
  • Virus Integration / genetics*
  • Whole Genome Sequencing*

Substances

  • DNA, Viral
  • DNA-Binding Proteins
  • Histone-Lysine N-Methyltransferase
  • MLL4 protein, human