A deep learning-based method enables the automatic and accurate assembly of chromosome-level genomes

Nucleic Acids Res. 2024 Oct 28;52(19):e92. doi: 10.1093/nar/gkae789.

Abstract

High-throughput chromosome conformation capture (Hi-C) technology enables the construction of chromosome-level assemblies. However, correcting assembly errors and anchoring sequences to chromosomes remain significant challenges. In this study, we developed a deep learning-based method, AutoHiC, that addresses these challenges by improving the contiguity and accuracy of chromosome-level genome assemblies. Conventional Hi-C-aided scaffolding often requires manual refinement, whereas AutoHiC uses Hi-C data to drive an automated workflow with iterative error correction. Trained on data from more than 300 species, AutoHiC achieved an average error detection accuracy exceeding 90%. Benchmarking results confirmed that it substantially improves genome contiguity and error correction. The innovative approach and comprehensive results of AutoHiC constitute a breakthrough in automated error detection, promising more accurate genome assemblies for advancing genomics research.
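
The abstract does not say which statistic was used to measure contiguity. As an illustration only, the sketch below computes N50, a commonly used contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the example lengths are hypothetical and are not taken from AutoHiC or its benchmarks.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the length
    of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths (arbitrary units), invented for illustration.
# Both lists sum to 120; the second mimics the same sequence content
# after fragments have been joined into longer scaffolds.
fragmented = [40, 30, 20, 10, 10, 10]
scaffolded = [70, 30, 20]

print("N50 before joining:", n50(fragmented))
print("N50 after joining: ", n50(scaffolded))
```

A higher N50 at the same total length indicates a more contiguous assembly, although N50 alone says nothing about correctness: joining sequences in the wrong order or orientation also raises it, which is why contiguity is reported alongside error correction.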

MeSH terms

  • Animals
  • Chromosomes* / genetics
  • Deep Learning*
  • Genome / genetics
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Software