ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates

BMC Genomics. 2019 Dec 16;20(1):981. doi: 10.1186/s12864-019-6195-y.

Abstract

Background: Reverse vaccinology accelerates the discovery of potential vaccine candidates (PVCs) prior to experimental validation. Current programs typically use one bacterial proteome to identify PVCs, either through a filtering architecture built on feature prediction programs or through a machine learning approach. Filtering approaches may eliminate potential antigens because of limitations in the accuracy of the prediction tools used. Machine learning approaches depend heavily on the selection of training datasets of experimentally validated antigens (positive control) and non-protective antigens (negative control). Using one or a few bacterial proteomes does not assess PVC conservation among strains, an important feature of vaccine antigens.

Results: We present ReVac, which both runs a panoply of feature prediction programs without filtering out proteins and scores candidates based on predictions made on curated positive and negative control PVC datasets. ReVac surveys several genomes, assessing protein conservation as well as DNA and protein repeats, which may result in variable expression of PVCs. ReVac's orthologous clustering of conserved genes identifies core and dispensable genome components, which is useful for determining the degree of conservation of PVCs among the population of isolates for a given pathogen. Potential vaccine candidates are then prioritized based on conservation and overall feature-based scoring. We applied ReVac to 69 Moraxella catarrhalis and 270 non-typeable Haemophilus influenzae genomes, prioritizing 64 and 29 proteins as PVCs, respectively.
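To illustrate how ortholog clusters can be labeled as core or dispensable from their conservation across genomes, here is a minimal Python sketch. It is not ReVac's implementation: the function name `classify_clusters`, the `core_fraction` threshold, and the toy cluster data are all assumptions made for illustration.

```python
from typing import Dict, Set


def classify_clusters(
    clusters: Dict[str, Set[str]],
    n_genomes: int,
    core_fraction: float = 1.0,  # assumed threshold; not taken from the paper
) -> Dict[str, str]:
    """Label each ortholog cluster by the fraction of genomes it occurs in."""
    labels = {}
    for cluster_id, genomes in clusters.items():
        conservation = len(genomes) / n_genomes
        labels[cluster_id] = "core" if conservation >= core_fraction else "dispensable"
    return labels


# Hypothetical clusters: cluster id -> set of genomes carrying a member gene.
clusters = {
    "cluster_A": {"g1", "g2", "g3", "g4"},
    "cluster_B": {"g1", "g3"},
}
labels = classify_clusters(clusters, n_genomes=4)
```

With four genomes, `cluster_A` is present in all of them and is labeled core, while `cluster_B` is present in half and is labeled dispensable. Lowering `core_fraction` would tolerate genes missing from a few isolates.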

Conclusion: ReVac's scoring scheme ranks PVCs for subsequent experimental testing. It takes a redundancy-based approach, predicting each feature with several prediction tools. Each protein's features are collated, and the protein is ranked according to the scoring scheme. The multi-genome analyses performed in ReVac give a comprehensive overview of PVCs from a pan-genome perspective, an essential prerequisite for any bacterial subunit vaccine design. ReVac prioritized PVCs of two human respiratory pathogens, identifying both novel and previously validated PVCs.
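The idea of collating redundant predictions into a score, without filtering any protein out, might be sketched as follows. This is an illustration under stated assumptions, not ReVac's actual scoring scheme: the `Candidate` class, the feature names, the weights, and the sort order (conservation first, then feature score) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    conservation: float  # fraction of genomes carrying an ortholog
    # feature name -> yes/no calls from several independent tools
    predictions: Dict[str, List[bool]] = field(default_factory=dict)


def feature_score(c: Candidate, weights: Dict[str, float]) -> float:
    """Weight each feature by the fraction of tools that predicted it."""
    total = 0.0
    for feature, calls in c.predictions.items():
        if calls:
            total += weights.get(feature, 0.0) * sum(calls) / len(calls)
    return total


def rank(cands: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Keep every candidate; order by conservation, then by feature score."""
    return sorted(
        cands,
        key=lambda c: (c.conservation, feature_score(c, weights)),
        reverse=True,
    )


# Hypothetical weights; a negative weight penalizes an undesirable feature.
weights = {"surface_exposed": 2.0, "adhesin": 1.0, "human_homolog": -2.0}
cands = [
    Candidate("p3", 0.5, {"surface_exposed": [True, True, True], "adhesin": [True, True]}),
    Candidate("p2", 1.0, {"surface_exposed": [True, True, True], "human_homolog": [True, True]}),
    Candidate("p1", 1.0, {"surface_exposed": [True, True, False], "adhesin": [True, False]}),
]
ranked = rank(cands, weights)
```

Because a single tool's miss only lowers a feature's contribution and never removes the protein, a candidate such as `p1` still ranks well when one of three tools disagrees.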

Keywords: Antigen scoring; Bacterial; Core genome; Orthology; Pan-genome; Reverse vaccinology; Vaccines.

MeSH terms

  • Bacteria / genetics*
  • Bacteria / immunology
  • Bacterial Proteins / genetics
  • Bacterial Proteins / immunology*
  • Bacterial Vaccines / genetics
  • Bacterial Vaccines / immunology
  • Computational Biology / methods*
  • Humans
  • Machine Learning
  • Software
  • Vaccines, Subunit / genetics
  • Vaccines, Subunit / immunology
  • Vaccinology / methods*

Substances

  • Bacterial Proteins
  • Bacterial Vaccines
  • Vaccines, Subunit