Removal of sequencing adapter contamination improves microbial genome databases

BMC Genomics. 2024 Nov 4;25(1):1033. doi: 10.1186/s12864-024-10956-1.

Abstract

Advances in assembling microbial genomes have led to the growth of reference genome databases, which have been transformative for applied and basic microbiome research. Here we show that published microbial genome databases from humans, mice, cows, pigs, fish, honeybees, and marine environments contain significant sequencing-adapter contamination that systematically reduces assembly accuracy and contiguity. By removing the adapter-contaminated ends of contiguous sequences and reassembling MGnify reference genomes, we improve the quality of assemblies in these databases.
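
As a rough illustration of what removing adapter-contaminated contig ends can look like, the sketch below trims an adapter match found near either end of a contig. It is not the authors' pipeline: the function names, the 100-base search window, and the exact-match search are assumptions made for illustration, and the adapter string is the commonly cited Illumina adapter prefix, which may differ from the sequences screened in the paper.

```python
# Hypothetical sketch, not the method used in the paper.
# Assumption: a commonly cited Illumina adapter prefix.
ADAPTER = "AGATCGGAAGAGC"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def trim_contig_ends(contig, adapter=ADAPTER, window=100):
    """Trim exact adapter matches found within `window` bases of either end.

    `window` is an illustrative parameter; contigs shorter than the window
    are searched in full from both ends.
    """
    motifs = (adapter, reverse_complement(adapter))
    seq = contig

    # 3' end: cut from the earliest motif start inside the final window.
    tail_start = max(0, len(seq) - window)
    hits = [i for m in motifs if (i := seq.find(m, tail_start)) != -1]
    if hits:
        seq = seq[:min(hits)]

    # 5' end: cut through the latest motif end inside the first window.
    head = seq[:window]
    ends = [i + len(m) for m in motifs if (i := head.rfind(m)) != -1]
    if ends:
        seq = seq[max(ends):]

    return seq
```

An exact-match search keeps the sketch short; real adapter screening usually tolerates mismatches and partial adapters, which is better handled by a dedicated trimming tool.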

Keywords: Bacterial genomes; Metagenome; Microbial ecology; Microbiota; Symbiosis.

MeSH terms

  • Animals
  • Cattle
  • DNA Contamination
  • Databases, Genetic*
  • Genome, Microbial
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Mice
  • Sequence Analysis, DNA / methods
  • Swine