Geometric deep learning of protein-DNA binding specificity

Nat Methods. 2024 Sep;21(9):1674-1683. doi: 10.1038/s41592-024-02372-w. Epub 2024 Aug 5.

Abstract

Predicting protein-DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein-DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure. Here, to access this information, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity from protein-DNA structure. DeepPBS can be applied to experimental or predicted structures. Interpretable protein heavy atom importance scores for interface residues can be extracted. When aggregated at the protein residue level, these scores are validated through mutagenesis experiments. Applied to designed proteins targeting specific DNA sequences, DeepPBS was demonstrated to predict experimentally measured binding specificity. DeepPBS offers a foundation for machine-aided studies that advance our understanding of molecular interactions and guide experimental designs and synthetic biology.
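As an illustration of the step in which heavy-atom importance scores are aggregated at the protein residue level, the following is a minimal sketch only. The abstract does not state which aggregation rule is used, so the summation and maximum options here are assumptions. The function names, the input tuple layout and the toy scores are hypothetical and are not part of DeepPBS.

```python
from collections import defaultdict


def aggregate_atom_scores(atom_scores, how="sum"):
    """Collapse per-atom importance scores into per-residue scores.

    atom_scores: iterable of (chain_id, residue_number, residue_name,
                 atom_name, score) tuples. This layout is hypothetical.
    how: "sum" or "max". Both are assumed rules, not the published method.
    """
    per_residue = defaultdict(list)
    for chain_id, res_num, res_name, _atom_name, score in atom_scores:
        # Group atoms by the residue they belong to.
        per_residue[(chain_id, res_num, res_name)].append(float(score))

    if how == "sum":
        reduce_fn = sum
    elif how == "max":
        reduce_fn = max
    else:
        raise ValueError(f"unknown aggregation rule: {how!r}")

    return {residue: reduce_fn(values) for residue, values in per_residue.items()}


def rank_residues(residue_scores):
    """Return (residue, score) pairs sorted from most to least important."""
    return sorted(residue_scores.items(), key=lambda item: item[1], reverse=True)


# Toy input with made-up scores, for illustration only.
toy_scores = [
    ("A", 10, "ARG", "NH1", 0.5),
    ("A", 10, "ARG", "NH2", 0.25),
    ("A", 14, "ASN", "ND2", 0.125),
]
residue_scores = aggregate_atom_scores(toy_scores, how="sum")
ranking = rank_residues(residue_scores)
```

A ranking of this kind is what one would compare against mutagenesis data: residues with high aggregated scores would be expected to be the ones whose mutation most affects binding.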

MeSH terms

  • Binding Sites
  • Computational Biology / methods
  • DNA* / chemistry
  • DNA* / metabolism
  • DNA-Binding Proteins* / chemistry
  • DNA-Binding Proteins* / metabolism
  • Deep Learning*
  • Models, Molecular
  • Protein Binding*

Substances

  • DNA
  • DNA-Binding Proteins