Population structure, biogeography and transmissibility of Mycobacterium tuberculosis

Nat Commun. 2021 Oct 20;12(1):6099. doi: 10.1038/s41467-021-26248-1.

Abstract

Mycobacterium tuberculosis is a clonal pathogen proposed to have co-evolved with its human host for millennia, yet our understanding of its genomic diversity and biogeography remains incomplete. Here we use a combination of phylogenetics and dimensionality reduction to reevaluate the population structure of M. tuberculosis, providing an in-depth analysis of the ancient Indo-Oceanic Lineage 1 and the modern Central Asian Lineage 3, and expanding our understanding of Lineages 2 and 4. We assess sub-lineages using genomic sequences from 4939 pan-susceptible strains and find 30 new genetically distinct clades that we validate in a dataset of 4645 independent isolates. We find a consistent geographically restricted or unrestricted pattern for 20 groups, including three groups of Lineage 1. The distribution of terminal branch lengths across the M. tuberculosis phylogeny supports the hypothesis of a higher transmissibility of Lineages 2 and 4, in comparison with Lineages 3 and 1, on a global scale. We define an expanded barcode of 95 single nucleotide substitutions that allows rapid identification of 69 M. tuberculosis sub-lineages and 26 additional internal groups. Our results paint a higher-resolution picture of the M. tuberculosis phylogeny and biogeography.
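
To illustrate how a barcode of single nucleotide substitutions could assign an isolate to a sub-lineage, the following is a minimal sketch, not the authors' implementation. It assumes each marker is a (genome position, allele) pair that maps to one group, and that deeper groups carry more dot-separated levels in their label. The positions, alleles, group names, and the function names call_sublineages and most_specific are all hypothetical placeholders; none of them come from the published barcode.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Marker = Tuple[int, str]  # (genome position, observed allele)

# Hypothetical barcode: positions, alleles and group labels are placeholders,
# NOT the 95 substitutions defined in the paper.
BARCODE: Dict[Marker, str] = {
    (1000, "A"): "groupA",
    (2000, "G"): "groupA.1",
    (3000, "T"): "groupB",
}


def call_sublineages(variants: Set[Marker],
                     barcode: Dict[Marker, str] = BARCODE) -> List[str]:
    """Return the sorted labels of every group whose marker is present."""
    return sorted(label for marker, label in barcode.items()
                  if marker in variants)


def most_specific(labels: Iterable[str]) -> Optional[str]:
    """Pick the deepest label, judged by its number of dot-separated levels."""
    return max(labels, key=lambda label: label.count("."), default=None)


# An isolate carrying both placeholder markers of groupA and groupA.1,
# plus one variant that is not part of the barcode.
isolate = {(1000, "A"), (2000, "G"), (5000, "C")}
calls = call_sublineages(isolate)
print(calls, most_specific(calls))
```

Under these assumptions an isolate can match several nested groups at once, so the sketch reports all matches and then selects the deepest one. A real classifier would also need to handle missing calls and conflicting markers, which are omitted here.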

MeSH terms

  • DNA Barcoding, Taxonomic
  • Evolution, Molecular
  • Genome, Bacterial / genetics
  • Humans
  • Mycobacterium tuberculosis / classification*
  • Mycobacterium tuberculosis / genetics
  • Mycobacterium tuberculosis / isolation & purification
  • Phylogeny*
  • Phylogeography
  • Polymorphism, Single Nucleotide
  • Software
  • Tuberculosis / microbiology
  • Tuberculosis / transmission*