Integration of molecular coarse-grained model into geometric representation learning framework for protein-protein complex property prediction

Nat Commun. 2024 Nov 7;15(1):9629. doi: 10.1038/s41467-024-53583-w.

Abstract

Structure-based machine learning algorithms have been used to predict the properties of protein-protein interaction (PPI) complexes, such as binding affinity, which is critical for understanding biological mechanisms and developing disease treatments. Most existing algorithms represent PPI complex graph structures at the atom scale or the residue scale, but these representations can be computationally expensive or may not integrate enough fine-grained, chemically plausible interaction detail to improve predictions. Here, we introduce MCGLPPI, a geometric representation learning framework that combines graph neural networks (GNNs) with MARTINI molecular coarse-grained (CG) models to predict overall PPI properties accurately and efficiently. Extensive experiments on three types of downstream PPI property prediction tasks demonstrate that, at the CG scale, MCGLPPI achieves performance competitive with its atom- and residue-scale counterparts while consuming only a third of the computational resources. Furthermore, CG-scale pre-training on protein domain-domain interaction structures enhances its predictive capabilities for PPI tasks. MCGLPPI offers an effective and efficient solution for predicting overall PPI properties and is a promising tool for the large-scale analysis of biomolecular interactions.
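To make the idea of a CG-scale graph concrete, the following is a minimal sketch, not the MCGLPPI implementation. It assumes each atom is assigned to a bead label, places each bead at the centroid of its atoms, and connects beads that lie within a distance cutoff. The function names, the toy coordinates, the atom-to-bead mapping, and the cutoff value are all hypothetical and chosen for illustration only; the actual MARTINI mapping and the graph construction used in the paper may differ.

```python
from itertools import combinations
from math import dist


def coarse_grain(atoms, bead_of):
    """Average atom coordinates into one centroid per bead label.

    atoms   : list of (atom_name, (x, y, z)) tuples
    bead_of : dict mapping atom_name -> bead label (hypothetical mapping)
    """
    groups = {}
    for name, xyz in atoms:
        groups.setdefault(bead_of[name], []).append(xyz)
    return {
        bead: tuple(sum(axis) / len(points) for axis in zip(*points))
        for bead, points in groups.items()
    }


def radius_edges(beads, cutoff):
    """Connect every pair of beads closer than `cutoff` (same units as the coordinates)."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(beads), 2)
        if dist(beads[a], beads[b]) < cutoff
    )


# Toy input (made up): five atoms grouped into a backbone bead "BB"
# and a side-chain bead "SC1".
atoms = [
    ("N", (0.0, 0.0, 0.0)),
    ("CA", (1.0, 0.0, 0.0)),
    ("C", (2.0, 0.0, 0.0)),
    ("CB", (1.0, 1.0, 0.0)),
    ("CG", (1.0, 3.0, 0.0)),
]
bead_of = {"N": "BB", "CA": "BB", "C": "BB", "CB": "SC1", "CG": "SC1"}

beads = coarse_grain(atoms, bead_of)     # graph nodes: one per bead
edges = radius_edges(beads, cutoff=2.5)  # graph edges: bead pairs within the cutoff
```

In this toy case five atoms collapse into two nodes, which illustrates why a CG-scale graph can be cheaper to process than an atom-scale one while still separating backbone from side-chain sites, unlike a one-node-per-residue graph.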

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Machine Learning*
  • Models, Molecular
  • Neural Networks, Computer*
  • Protein Binding
  • Protein Interaction Mapping / methods
  • Proteins* / chemistry
  • Proteins* / metabolism

Substances

  • Proteins