Evolutionary couplings and sequence variation effect predict protein binding sites

Proteins. 2018 Oct;86(10):1064-1074. doi: 10.1002/prot.25585. Epub 2018 Oct 22.

Abstract

Binding to small ligands such as ions, or to macromolecules such as DNA, RNA, and other proteins, is one important aspect of the molecular function of proteins. Many binding sites remain without experimental annotations. Predicting binding sites on a per-residue level is challenging, but if 3D structures are known, information about coevolving residue pairs (evolutionary couplings) can predict catalytic residues through mutual information. Here, we predicted protein binding sites from evolutionary couplings derived from a global statistical model using maximum entropy. Additionally, we included information from sequence variation. A simple method using a weighted sum over eight scores substantially outperformed random (F1 = 19.3% ± 0.7% vs F1 = 2% for random). Training a neural network on these eight scores (along with predicted solvent accessibility and conservation in protein families) improved performance substantially (F1 = 26.2% ± 0.8%). Although the machine learning was limited by the small data set and possibly wrong annotations of binding sites, the predicted binding sites formed spatial clusters in the protein. The source code of the binding site predictions is available through GitHub: https://github.com/Rostlab/bindPredict.
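The "weighted sum over eight scores" and the F1 measure reported above can be illustrated with a minimal Python sketch. This is not the bindPredict implementation. The function names, the uniform weights, and the decision threshold are hypothetical choices made only for illustration, and the abstract does not say which eight scores are combined or how they are weighted.

```python
from typing import List, Sequence


def combined_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum over the per-residue scores (eight in the abstract)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def predict_binding(
    residue_scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float,
) -> List[bool]:
    """Label a residue as binding if its combined score reaches the threshold.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    return [combined_score(s, weights) >= threshold for s in residue_scores]


def f1(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Per-residue F1: harmonic mean of precision and recall."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: three residues, eight made-up scores each, uniform weights (assumed).
weights = [1 / 8] * 8
residue_scores = [[1.0] * 8, [0.0] * 8, [0.8] * 8]
predicted = predict_binding(residue_scores, weights, threshold=0.5)
observed = [True, False, False]  # made-up annotation
score = f1(predicted, observed)
```

Because binding residues are a small minority of all residues, a per-residue F1 penalizes both missed sites and over-prediction, which is consistent with the low random baseline quoted in the abstract.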

Keywords: binding site; coevolution; evolutionary couplings; machine learning; neural network; prediction; sequence variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Biological Evolution
  • DNA / metabolism
  • DNA-Binding Proteins / chemistry
  • DNA-Binding Proteins / genetics
  • DNA-Binding Proteins / metabolism
  • Databases, Protein
  • Entropy
  • Evolution, Molecular*
  • Genetic Variation
  • Humans
  • Machine Learning
  • Models, Biological
  • Models, Molecular
  • Neural Networks, Computer
  • Protein Binding
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism

Substances

  • DNA-Binding Proteins
  • Proteins
  • DNA