Genome Analysis of a Newly Sequenced B. subtilis SRCM117797 and Multiple Public B. subtilis Genomes Unveils Insights into Strain Diversification and Biased Core Gene Distribution

Curr Microbiol. 2024 Aug 12;81(10):305. doi: 10.1007/s00284-024-03819-1.

Abstract

The Gram-positive bacterium Bacillus subtilis is a widely used model organism and industrial workhorse. In this study, we report the analysis of the newly sequenced complete genome of B. subtilis strain SRCM117797, along with a comparative genomic analysis of a large collection of B. subtilis genomes. B. subtilis strain SRCM117797 has a 4,255,638-bp chromosome with 43.4% GC content and a high proportion of coding sequences associated with macromolecules, metabolism, and phage genes. Genomic diversity analysis of 232 B. subtilis strains identified eight clusters and three singletons. Of the 147 B. subtilis strains included, 89.12% had strain-specific genes, of which 6.75% encoded strain-specific insertion sequence family transposases. Our analysis indicated a potential role for strain-specific insertion sequence family transposases in the intracellular accumulation of strain-specific genes. Furthermore, the chromosomal layout of the core genes was biased: they were overrepresented on the upper half of the chromosome (closer to the origin of replication), which may explain the fast-growing characteristics of B. subtilis. Overall, the study provides a complete genome sequence of B. subtilis strain SRCM117797, shows extensive genomic diversity among B. subtilis strains, and offers insights into the mechanism of strain diversification and the non-random chromosomal layout of core genes.
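The core-gene bias described above can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not state how the bias was tested, so the one-sided binomial test, the function names, and the reading of "upper half" as "within a quarter of the chromosome length of the origin, on either side" are all assumptions made here for illustration. Only the chromosome length comes from the abstract; the gene positions in the usage lines are hypothetical toy values.

```python
from math import comb

CHROMOSOME_LENGTH = 4_255_638  # bp, SRCM117797 chromosome length from the abstract


def in_origin_proximal_half(position, origin=0, length=CHROMOSOME_LENGTH):
    """Return True if a position lies in the half of a circular chromosome
    nearest the origin (assumed definition: circular distance <= length / 4)."""
    offset = (position - origin) % length
    distance = min(offset, length - offset)  # shortest way around the circle
    return distance <= length / 4


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least
    k origin-proximal genes if genes were placed without any bias."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def core_gene_bias(positions, origin=0, length=CHROMOSOME_LENGTH):
    """Count origin-proximal genes and return (count, total, p_value)."""
    near = sum(in_origin_proximal_half(pos, origin, length) for pos in positions)
    return near, len(positions), binomial_upper_tail(near, len(positions))


# Hypothetical gene start coordinates (toy values, not data from the study).
toy_positions = [10_000, 250_000, 900_000, 2_100_000, 3_500_000, 4_200_000]
near, total, p_value = core_gene_bias(toy_positions)
print(f"{near}/{total} genes in the origin-proximal half, p = {p_value:.3f}")
```

With real data, the positions would be core-gene start coordinates taken from a genome annotation, and the origin coordinate would be set to the annotated origin of replication rather than the default of 0.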

MeSH terms

  • Bacillus subtilis* / genetics
  • Base Composition
  • Chromosomes, Bacterial / genetics
  • Genetic Variation
  • Genome, Bacterial*
  • Genomics
  • Phylogeny
  • Sequence Analysis, DNA