Accurate predictions of SARS-CoV-2 infectivity from comprehensive analysis

Elife. 2024 Dec 24:13:RP99833. doi: 10.7554/eLife.99833.

Abstract

An unprecedented amount of SARS-CoV-2 data has been accumulated compared with previous infectious diseases, enabling more thorough analyses and insights into the virus's evolutionary process. This study investigates the features of SARS-CoV-2 as it evolved in order to evaluate its infectivity. We examined viral sequences and identified the polarity of amino acids in the receptor binding motif (RBM) region. We detected an increased frequency of amino acid substitutions to lysine (K) and arginine (R) in variants of concern (VOCs). As the virus evolved to Omicron, commonly occurring mutations became fixed components of the new viral sequence. Furthermore, at specific positions in VOCs, we detected only one type of amino acid substitution, along with a notable absence of mutations at D467. We found that the binding affinity of SARS-CoV-2 lineages to the ACE2 receptor was affected by amino acid substitutions. Based on these findings, we developed APESS, a model that evaluates infectivity from biochemical and mutational properties. In silico evaluation using real-world sequences and in vitro viral entry assays validated the accuracy of APESS and our findings. Using machine learning, we predicted mutations that had the potential to become more prominent. We created AIVE, a web-based system accessible at https://ai-ve.org, to provide infectivity measurements for mutations entered by users. Ultimately, we established a clear link between specific viral properties and increased infectivity, enhancing our understanding of SARS-CoV-2 and enabling more accurate predictions of its infectivity.
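
The polarity and K/R substitution analysis described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' APESS model or the AIVE pipeline. The function names, the polarity grouping (one common textbook classification), and the RBM boundaries (residues 437 to 508 in spike numbering) are all assumptions made for illustration, and the input list is an illustrative set of substitution labels rather than data from the study.

```python
import re
from collections import Counter

# One common textbook grouping of the 20 amino acids by side-chain polarity.
# Assumption: the paper may use a different classification.
POLARITY = {
    **dict.fromkeys("AVLIMFWPG", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar"),
    **dict.fromkeys("KRH", "positive"),
    **dict.fromkeys("DE", "negative"),
}

# Assumed RBM boundaries in spike residue numbering; adjust as needed.
RBM_START, RBM_END = 437, 508

# Substitution labels such as "N501Y": reference residue, position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label like 'N501Y' into ('N', 501, 'Y')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def summarize_rbm(mutations, start=RBM_START, end=RBM_END):
    """Count polarity changes and collect substitutions to K or R in a region.

    Hypothetical helper: returns a Counter keyed by
    (reference polarity, new polarity) and a list of labels whose
    new residue is lysine (K) or arginine (R).
    """
    polarity_changes = Counter()
    to_k_or_r = []
    for label in mutations:
        ref, pos, alt = parse_mutation(label)
        if not start <= pos <= end:
            continue  # outside the assumed RBM window
        polarity_changes[(POLARITY[ref], POLARITY[alt])] += 1
        if alt in ("K", "R"):
            to_k_or_r.append(label)
    return polarity_changes, to_k_or_r


if __name__ == "__main__":
    # Illustrative input only; not taken from the study's dataset.
    example = ["K417N", "N440K", "T478K", "E484A", "Q493R", "Q498R", "N501Y"]
    changes, basic = summarize_rbm(example)
    print("Polarity changes:", dict(changes))
    print("Substitutions to K or R:", basic)
```

Keeping the region boundaries as parameters makes it easy to rerun the same count with a different RBM definition, since published boundaries differ slightly between sources.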

Keywords: SARS-CoV-2; genetics; genomics; infectivity; protein prediction; viruses.

MeSH terms

  • Amino Acid Substitution* / genetics
  • Angiotensin-Converting Enzyme 2* / chemistry
  • Angiotensin-Converting Enzyme 2* / genetics
  • Angiotensin-Converting Enzyme 2* / metabolism
  • COVID-19* / virology
  • Humans
  • Machine Learning
  • Mutation
  • Protein Binding
  • SARS-CoV-2* / genetics
  • SARS-CoV-2* / pathogenicity
  • Spike Glycoprotein, Coronavirus / chemistry
  • Spike Glycoprotein, Coronavirus / genetics
  • Spike Glycoprotein, Coronavirus / metabolism
  • Virus Internalization

Substances

  • Angiotensin-Converting Enzyme 2
  • Spike Glycoprotein, Coronavirus
  • ACE2 protein, human
  • spike protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants