Protein-protein and protein-nucleic acid binding site prediction via interpretable hierarchical geometric deep learning

Gigascience. 2024 Jan 2;13:giae080. doi: 10.1093/gigascience/giae080.

Abstract

Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease diagnosis and drug design. However, accurate computational prediction remains highly challenging because residue binding patterns are poorly understood. The binding pattern of a residue should be characterized by the spatial distribution of its neighboring residues combined with the interactions among their physicochemical information, which previous methods have not achieved. Here, we design GraphRBF, a hierarchical geometric deep learning model that learns residue binding patterns from big data. To this end, GraphRBF describes physicochemical information interactions with an enhanced graph neural network and characterizes residue spatial distributions with a prioritized radial basis function neural network. After training and testing, GraphRBF shows great improvements over existing state-of-the-art methods and strong interpretability of its learned representations. Applied to the SARS-CoV-2 Omicron spike protein, GraphRBF successfully identifies known epitopes of the protein. Moreover, it predicts multiple potential binding regions for new nanobodies or even new drugs with strong evidence. A user-friendly online server for GraphRBF is freely available at http://liulab.top/GraphRBF/server.
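
To make the idea of characterizing a residue's spatial neighborhood with radial basis functions more concrete, the following is a minimal sketch and not the GraphRBF implementation. It assumes plain Gaussian basis functions with evenly spaced centers, a fixed distance cutoff, and sum pooling over neighbors. The function name `rbf_neighbor_features` and the parameters `cutoff`, `n_centers`, and `gamma` are hypothetical. The prioritization scheme and the enhanced graph neural network described in the abstract are not reproduced here.

```python
import numpy as np

def rbf_neighbor_features(coords, center_idx, cutoff=10.0, n_centers=11, gamma=1.0):
    """Summarize the distances from one residue to its spatial neighbors
    as a fixed-length vector of Gaussian radial basis responses.

    Illustrative assumptions (not taken from the paper): evenly spaced
    RBF centers on [0, cutoff], a shared width `gamma`, and sum pooling.
    """
    coords = np.asarray(coords, dtype=float)

    # Euclidean distance from the target residue to every residue.
    dists = np.linalg.norm(coords - coords[center_idx], axis=1)

    # Keep residues within the cutoff, excluding the target residue itself.
    mask = dists <= cutoff
    mask[center_idx] = False
    neighbor_dists = dists[mask]

    # Evenly spaced RBF centers along the distance axis.
    centers = np.linspace(0.0, cutoff, n_centers)

    # Gaussian response of each neighbor distance to each center,
    # shape (n_neighbors, n_centers).
    responses = np.exp(-gamma * (neighbor_dists[:, None] - centers[None, :]) ** 2)

    # Pool over neighbors to get one descriptor per residue.
    return responses.sum(axis=0)


# Toy example: four residue positions (e.g. C-alpha coordinates, in angstroms).
toy_coords = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [30.0, 0.0, 0.0]]
toy_features = rbf_neighbor_features(toy_coords, center_idx=0)
```

In this toy example the residue 30 units away lies outside the cutoff and does not contribute, so the descriptor reflects only the two neighbors at distances 3 and 4. A learned model would typically feed such distance encodings, together with physicochemical residue features, into trainable layers.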

Keywords: enhanced graph neural network; prioritized radial basis function neural network; protein binding sites; residue binding patterns.

MeSH terms

  • Binding Sites
  • COVID-19 / metabolism
  • COVID-19 / virology
  • Computational Biology / methods
  • Deep Learning*
  • Humans
  • Neural Networks, Computer
  • Nucleic Acids / chemistry
  • Nucleic Acids / metabolism
  • Protein Binding*
  • SARS-CoV-2* / metabolism
  • Spike Glycoprotein, Coronavirus* / chemistry
  • Spike Glycoprotein, Coronavirus* / metabolism

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2
  • Nucleic Acids