The development of a new vaccine is a challenging exercise involving several steps, including computational studies, experimental work, and animal studies, followed by clinical studies. To accelerate the process, in silico screening is frequently used for antigen identification. Here, we present Vaxi-DL, a web-based deep learning (DL) software that evaluates the potential of protein sequences to serve as vaccine target antigens. Four different DL pathogen models were trained to predict target antigens in bacteria, protozoa, fungi, and viruses that cause infectious diseases in humans. Datasets containing antigenic and non-antigenic sequences were derived from known vaccine candidates and the Protegen database. Biological and physicochemical properties were computed for the datasets using publicly available bioinformatics tools. For each of the four pathogen models, the dataset was divided into training, validation, and testing subsets and then scaled and normalised. The models were constructed using Fully Connected Layers (FCLs), tuned for hyperparameters, and trained using the training subset. Accuracy, sensitivity (recall), specificity, precision, and AUC (area under the curve) were used as metrics to assess the performance of these models. The models were benchmarked against other prediction tools, such as VaxiJen and Vaxign-ML, using independent datasets of known target antigens. We also tested Vaxi-DL on 219 known potential vaccine candidates (PVCs) from 37 different pathogens, of which it correctly predicted 175 (about 80%). Vaxi-DL was further tested on different datasets obtained from multiple resources. Our tool has demonstrated an average sensitivity of 93% and will thus be a useful tool for prioritising PVCs for preclinical studies.
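The evaluation metrics named above can be illustrated with a minimal sketch. It is not the Vaxi-DL implementation: the function names (`confusion`, `metrics`, `auc`) and the toy labels and scores are hypothetical, and only the Python standard library is used. It assumes binary labels (1 = antigen, 0 = non-antigen) and computes AUC as the fraction of positive/negative pairs ranked correctly, counting ties as half.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # antigens predicted as antigens
    fp: int  # non-antigens predicted as antigens
    tn: int  # non-antigens predicted as non-antigens
    fn: int  # antigens predicted as non-antigens


def confusion(y_true, y_pred):
    """Count the four outcomes for binary labels (1 = antigen)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def metrics(c):
    """Threshold-based metrics; a zero denominator yields NaN."""
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # identical to recall
        "specificity": ratio(c.tn, c.tn + c.fp),
        "precision": ratio(c.tp, c.tp + c.fp),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy = metrics(confusion(y_true, y_pred))
toy_auc = auc(y_true, scores)

# The PVC test from the abstract: 175 of 219 known candidates recovered.
# Assumption: all 219 sequences are true antigens, so the hit rate is a
# sensitivity on that set.
pvc = metrics(Confusion(tp=175, fp=0, tn=0, fn=219 - 175))
print(round(pvc["sensitivity"], 3))  # 0.799
```

On a set made up only of known antigens, as in the PVC test, specificity is undefined because there are no negatives, which is why the sketch returns NaN for it rather than a number.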
Keywords: Antigen prediction; Artificial intelligence; COVID-19; Coronavirus; Deep learning; In silico vaccine development; Machine learning; SARS-CoV-2; Vaccine; Vaccine design; Vaxi-DL server; mRNA vaccines.
Copyright © 2022 Elsevier Ltd. All rights reserved.