refMLST: reference-based multilocus sequence typing enables universal bacterial typing

BMC Bioinformatics. 2024 Aug 27;25(1):280. doi: 10.1186/s12859-024-05913-4.

Abstract

Background: Commonly used approaches for genomic investigation of bacterial outbreaks, including SNP and gene-by-gene approaches, are limited by the requirement for background genomes and curated allele schemes, respectively. As a result, they only work on a select subset of known organisms and fail on novel or less-studied pathogens. We introduce refMLST, a gene-by-gene approach that uses the reference genome of a bacterium to provide a scalable, reproducible and robust method for outbreak investigation.
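To make the gene-by-gene idea concrete, the following is a minimal sketch of how allele profiles keyed on reference-genome loci could be compared without a curated allele scheme. It is an illustration under stated assumptions, not the refMLST implementation: the hash-based allele identifiers, the rule of skipping loci missing from either isolate, and the names `allele_id`, `build_profile` and `allele_distance` are all hypothetical.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical representation: locus name -> allele identifier,
# or None when the locus could not be called in that isolate.
Profile = Dict[str, Optional[str]]


def allele_id(sequence: str) -> str:
    """Derive an identifier from the allele sequence itself (assumption:
    hashing stands in for a curated allele numbering scheme)."""
    return hashlib.sha256(sequence.upper().encode("ascii")).hexdigest()[:16]


def build_profile(alleles: Dict[str, Optional[str]]) -> Profile:
    """Turn per-locus allele sequences into a profile of allele identifiers."""
    return {
        locus: (allele_id(seq) if seq else None)
        for locus, seq in alleles.items()
    }


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both isolates whose alleles differ.
    Loci missing from either isolate are ignored (an assumption)."""
    distance = 0
    for locus, allele_a in a.items():
        allele_b = b.get(locus)
        if allele_a is None or allele_b is None:
            continue
        if allele_a != allele_b:
            distance += 1
    return distance


if __name__ == "__main__":
    # Toy loci taken from a hypothetical reference genome annotation.
    isolate_1 = build_profile(
        {"geneA": "ATGAAA", "geneB": "ATGCCC", "geneC": "ATGGGG"}
    )
    isolate_2 = build_profile(
        {"geneA": "ATGAAA", "geneB": "ATGCCT", "geneC": None}
    )
    print(allele_distance(isolate_1, isolate_2))  # 1 (only geneB differs)
```

In this sketch, isolates whose pairwise distance falls under a chosen threshold would be grouped into the same cluster; the threshold and the clustering method are left open because the abstract does not specify them.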

Results: When applied to multiple outbreak-causing bacteria, including 1263 Salmonella enterica, 331 Yersinia enterocolitica and 6526 Campylobacter jejuni genomes, refMLST enabled consistent clustering, improved resolution, and faster processing compared with commonly used tools such as chewieSnake.

Conclusions: refMLST is a novel multilocus sequence typing approach that is applicable to any bacterial species with a public reference genome, does not require a curated scheme, and automatically accounts for genetic recombination.

Availability and implementation: refMLST is freely available for academic use at https://bugseq.com/academic.

Keywords: Epidemiology; Genomic; Multilocus sequence typing; Reference genome.

MeSH terms

  • Bacterial Typing Techniques* / methods
  • Campylobacter jejuni / classification
  • Campylobacter jejuni / genetics
  • Disease Outbreaks
  • Genome, Bacterial / genetics
  • Multilocus Sequence Typing* / methods
  • Salmonella enterica / classification
  • Salmonella enterica / genetics
  • Software
  • Yersinia enterocolitica / classification
  • Yersinia enterocolitica / genetics