VirDetect-AI: a residual and convolutional neural network-based metagenomic tool for eukaryotic viral protein identification

Brief Bioinform. 2024 Nov 22;26(1):bbaf001. doi: 10.1093/bib/bbaf001.

Abstract

This study addresses the challenging task of identifying viruses in metagenomic data drawn from a broad array of biological samples, including animal reservoirs, environmental sources, and the human body. Traditional methods for virus identification are often limited by the diversity and rapid evolution of viral genomes. In response, recent efforts have applied artificial intelligence (AI) techniques to improve the accuracy and efficiency of virus detection. However, existing AI-based approaches are primarily binary classifiers that rely on nucleotide sequences and cannot specify the type of virus detected. To address these limitations, VirDetect-AI, a novel tool designed specifically to identify eukaryotic viruses in metagenomic datasets, is introduced. The VirDetect-AI model combines convolutional neural networks and residual neural networks to extract hierarchical features and detailed patterns from complex amino acid sequence data. The model performed strongly on all metrics, with a sensitivity of 0.97, a precision of 0.98, and an F1-score of 0.98. VirDetect-AI improves our comprehension of viral ecology and can accurately classify metagenomic sequences into 980 viral protein classes, thereby enabling the identification of new viruses. These classes encompass an extensive array of viral genera and families, as well as protein functions and hosts.
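To make the combination of convolutional and residual layers on amino acid input more concrete, the following is a minimal pure-Python sketch, not the VirDetect-AI implementation. It assumes a one-hot encoding over the 20 standard amino acids, a single zero-padded 1D convolution with an odd kernel size, and a skip connection of the form relu(conv(x) + x). The names `one_hot`, `conv1d_same`, and `residual_block` are hypothetical, and the abstract does not state the encoding, layer sizes, or depth used by the actual model.

```python
# Illustrative sketch only: the encoding and block layout are assumptions,
# not details taken from the VirDetect-AI paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a protein sequence as a list of 20-dimensional one-hot vectors.

    Symbols outside the 20 standard residues (e.g. X) become all-zero vectors.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for aa in seq.upper():
        vec = [0.0] * len(AMINO_ACIDS)
        if aa in index:
            vec[index[aa]] = 1.0
        encoded.append(vec)
    return encoded


def conv1d_same(x, kernels):
    """1D convolution with zero padding so the output length equals the input.

    x:       L x C_in   (positions x channels)
    kernels: C_out x K x C_in, with K assumed odd
    """
    k = len(kernels[0])
    pad = k // 2
    length, c_in = len(x), len(x[0])
    out = []
    for pos in range(length):
        row = []
        for kern in kernels:
            total = 0.0
            for j in range(k):
                src = pos + j - pad
                if 0 <= src < length:  # positions outside the sequence act as zeros
                    for c in range(c_in):
                        total += kern[j][c] * x[src][c]
            row.append(total)
        out.append(row)
    return out


def relu(x):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x, kernels):
    """Residual block y = relu(conv(x) + x); requires C_out == C_in."""
    conv = conv1d_same(x, kernels)
    summed = [[c + s for c, s in zip(conv_row, x_row)]
              for conv_row, x_row in zip(conv, x)]
    return relu(summed)
```

The skip connection means that a block whose convolution weights are all zero passes its (non-negative) input through unchanged, which is the property that makes deep stacks of such blocks easier to train. In practice a model like this would be built with a deep learning framework and followed by pooling and a classification layer over the protein classes.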

Keywords: VirDetect-AI; convolutional neural networks; deep learning; eukaryotic virus identification; residual neural networks; viral metagenomic.

MeSH terms

  • Algorithms
  • Animals
  • Artificial Intelligence
  • Eukaryota / genetics
  • Genome, Viral
  • Humans
  • Metagenome
  • Metagenomics* / methods
  • Neural Networks, Computer*
  • Viral Proteins* / genetics
  • Viruses / classification
  • Viruses / genetics

Substances

  • Viral Proteins