First gene-ontology enrichment analysis based on bacterial coregenome variants: insights into adaptations of Salmonella serovars to mammalian- and avian-hosts

BMC Microbiol. 2017 Nov 28;17(1):222. doi: 10.1186/s12866-017-1132-1.

Abstract

Background: Many bacterial genomic studies exploring the evolutionary processes of host adaptation focus on the accessory genome, describing how gene gains and losses can explain the colonization of new habitats. In contrast, we developed a new approach focusing on the coregenome to describe the host adaptation of Salmonella serovars.

Methods: In the present work, we propose bioinformatic tools allowing (i) robust phylogenetic inference based on SNPs and recombination events, (ii) identification of fixed SNPs and InDels distinguishing homoplastic and non-homoplastic coregenome variants, and (iii) gene-ontology enrichment analyses to describe metabolic processes involved in adaptation of Salmonella enterica subsp. enterica to mammalian- (S. Dublin), multi- (S. Enteritidis), and avian- (S. Pullorum and S. Gallinarum) hosts.
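
As an illustration of step (ii), the sketch below shows one simple way to flag variants that are fixed in a clade: every genome in the clade carries the same allele, and no genome outside the clade carries it. This is a minimal sketch under stated assumptions, not the authors' 'phyloFixedVar' or 'FixedVar' implementation. The function name, the input layout (position -> sample -> allele), and the toy sample names are hypothetical, and homoplasy detection is not covered.

```python
def fixed_variants(genotypes, clade):
    """Return {position: allele} for variants fixed in `clade`.

    genotypes: dict mapping position -> {sample name: allele} (assumed layout).
    clade: set of sample names forming the clade of interest.
    A variant counts as fixed when all clade members share one allele
    and that allele is absent from every sample outside the clade.
    """
    fixed = {}
    for pos, calls in genotypes.items():
        inside = {allele for sample, allele in calls.items() if sample in clade}
        outside = {allele for sample, allele in calls.items() if sample not in clade}
        if len(inside) == 1 and not (inside & outside):
            fixed[pos] = next(iter(inside))
    return fixed


# Toy data (invented for illustration): D1 and D2 form the clade of interest.
toy_genotypes = {
    101: {"D1": "A", "D2": "A", "E1": "G", "E2": "G"},  # fixed in the clade
    202: {"D1": "C", "D2": "T", "E1": "C", "E2": "C"},  # polymorphic inside the clade
    303: {"D1": "T", "D2": "T", "E1": "T", "E2": "C"},  # allele shared with an outsider
}
toy_fixed = fixed_variants(toy_genotypes, {"D1", "D2"})
print(toy_fixed)
```

Only position 101 qualifies in this toy input, because the other two positions either vary within the clade or share the clade allele with a genome outside it.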

Results: The 'VARCall' workflow produced a robust phylogenetic inference confirming that the monophyletic clade S. Dublin diverged from the polyphyletic clade S. Enteritidis which includes the divergent clades S. Pullorum and S. Gallinarum (i). The scripts 'phyloFixedVar' and 'FixedVar' detected non-synonymous and non-homoplastic fixed variants supporting the phylogenetic reconstruction (ii). The scripts 'GetGOxML' and 'EveryGO' identified representative metabolic pathways related to host adaptation using the first gene-ontology enrichment analysis based on bacterial coregenome variants (iii).
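
For step (iii), gene-ontology enrichment is commonly assessed with a one-sided hypergeometric test: given a background of annotated genes, it asks whether the genes carrying fixed variants are annotated with a given term more often than expected by chance. The abstract does not state which test the 'GetGOxML' and 'EveryGO' scripts use, so the following is only a minimal sketch under that assumption. The function names, the gene-to-term mapping, and the gene and term identifiers are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def go_enrichment(gene2go, study_genes):
    """Return {term: p-value} for the study genes against all annotated genes.

    gene2go: dict mapping gene -> set of GO terms (assumed layout).
    study_genes: iterable of genes of interest, e.g. genes with fixed variants.
    """
    population = set(gene2go)
    study = set(study_genes) & population
    all_terms = {term for terms in gene2go.values() for term in terms}
    results = {}
    for term in all_terms:
        annotated = {gene for gene in population if term in gene2go[gene]}
        overlap = len(annotated & study)
        results[term] = hypergeom_upper_tail(
            overlap, len(population), len(annotated), len(study)
        )
    return results


# Toy annotation (invented identifiers): two genes carry term_A, four carry term_B.
toy_gene2go = {
    "g1": {"term_A"}, "g2": {"term_A"},
    "g3": {"term_B"}, "g4": {"term_B"}, "g5": {"term_B"}, "g6": {"term_B"},
}
toy_pvalues = go_enrichment(toy_gene2go, {"g1", "g2"})
for term, p in sorted(toy_pvalues.items()):
    print(term, round(p, 4))
```

In a real analysis the p-values would normally be adjusted for the number of terms tested before any term is reported as enriched.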

Conclusions: In the present manuscript, we propose a new coregenome approach coupling the identification of fixed SNPs and InDels with regard to inferred phylogenetic clades with gene-ontology enrichment analysis, in order to describe the adaptation of Salmonella serovars Dublin (i.e. mammalian-hosts), Enteritidis (i.e. multi-hosts), Pullorum (i.e. avian-hosts) and Gallinarum (i.e. avian-hosts) at the coregenome scale. All of these polyvalent bioinformatic tools can be applied to other bacterial genera without additional development.

Keywords: Bacterial fixed variants; Bacterial genomics; Gene-ontology enrichment analysis.

MeSH terms

  • Adaptation, Physiological / genetics*
  • Animals
  • Birds / microbiology*
  • Birds / physiology
  • Evolution, Molecular
  • Gene Ontology
  • Genome, Bacterial / genetics*
  • Host Specificity
  • INDEL Mutation
  • Mammals / microbiology*
  • Mammals / physiology
  • Phylogeny*
  • Polymorphism, Single Nucleotide
  • Recombination, Genetic
  • Salmonella / classification*
  • Salmonella / genetics*
  • Salmonella / physiology
  • Serogroup

Grants and funding