Evaluation of somatic copy number variation detection by NGS technologies and bioinformatics tools on a hyper-diploid cancer genome

Genome Biol. 2024 Jun 20;25(1):163. doi: 10.1186/s13059-024-03294-8.

Abstract

Background: Copy number variation (CNV) is a key genetic characteristic for cancer diagnostics and can be used as a biomarker for the selection of therapeutic treatments. Using data sets established in our previous study, we benchmark six recent and commonly used software tools for cancer CNV calling on their detection accuracy, sensitivity, and reproducibility. By comparison with orthogonal methods, such as microarray and Bionano, we also explore the consistency of CNV calling across different technologies on a challenging genome.

Results: While consistent results are observed for copy gain, loss, and loss of heterozygosity (LOH) calls across sequencing centers, CNV callers, and different technologies, variation in CNV calls is mostly driven by the determination of genome ploidy. Using consensus results from six CNV callers and confirmation from three orthogonal methods, we establish a high-confidence CNV call set for the reference cancer cell line (HCC1395).

Conclusions: NGS technologies and current bioinformatics tools can offer reliable results for detection of copy gain, loss, and LOH. However, when working with a hyper-diploid genome, some software tools can call excessive copy gains or losses due to inaccurate assessment of genome ploidy. With performance matrices on various experimental conditions, this study raises awareness within the cancer research community regarding the selection of sequencing platforms, sample preparation, sequencing coverage, and the choice of CNV detection tools.
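
The dependence of gain and loss calls on the ploidy estimate can be illustrated with a minimal sketch. It assumes one common convention, in which a segment is labeled by comparing its absolute copy number with the baseline ploidy; the function name `classify_segments` and the segment values are hypothetical and are not taken from the study or from any of the evaluated tools.

```python
from typing import List


def classify_segments(copy_numbers: List[int], baseline_ploidy: int) -> List[str]:
    """Label each segment relative to an assumed baseline ploidy.

    This is an illustrative convention only, not the method of any
    specific CNV caller.
    """
    labels = []
    for cn in copy_numbers:
        if cn > baseline_ploidy:
            labels.append("gain")
        elif cn < baseline_ploidy:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


# Hypothetical absolute copy numbers for five segments (illustrative values).
segments = [2, 3, 3, 4, 1]

# The same segments, interpreted against two different ploidy estimates.
calls_diploid = classify_segments(segments, baseline_ploidy=2)
calls_triploid = classify_segments(segments, baseline_ploidy=3)

for cn, d, t in zip(segments, calls_diploid, calls_triploid):
    print(f"CN={cn}: ploidy 2 -> {d:8s} ploidy 3 -> {t}")
```

In this toy example, the two segments with three copies are reported as gains under a diploid baseline but as neutral under a triploid one, and the two-copy segment changes from neutral to loss, which is the kind of shift an inaccurate ploidy estimate could introduce.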

Keywords: Accuracy; Bioinformatics tools; Cancer genome; Consistency; Copy number variation; Detection sensitivity; Genome ploidy; Next-generation sequencing; Reproducibility.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Intramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Cell Line, Tumor
  • Computational Biology* / methods
  • DNA Copy Number Variations*
  • Diploidy
  • Genome, Human
  • High-Throughput Nucleotide Sequencing* / methods
  • Humans
  • Loss of Heterozygosity*
  • Neoplasms* / genetics
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods
  • Software*