A D3R prospective evaluation of machine learning for protein-ligand scoring

J Comput Aided Mol Des. 2016 Sep;30(9):761-771. doi: 10.1007/s10822-016-9960-x. Epub 2016 Sep 3.

Abstract

We assess the performance of several machine learning-based scoring methods on protein-ligand pose prediction, virtual screening, and binding affinity prediction. The methods, and the manner in which they were trained, are sufficiently diverse to evaluate the utility of various strategies for training set curation and binding pose generation, but they share a novel approach to classification in the context of protein-ligand scoring. Rather than training explicitly on structural data such as affinity values or information extracted from crystal binding poses, we exploit the abundance of data available from high-throughput screening and approach the problem as one of discriminating binders from non-binders. We evaluate our scoring methods in the 2015 D3R Grand Challenge and find that, although the merits of some features of our approach remain inconclusive, they performed comparably to a state-of-the-art scoring function that was fit to binding affinity data.
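
As a rough illustration of treating scoring as binder versus non-binder classification, the sketch below fits a small logistic-regression classifier to labeled examples and measures how well its scores rank binders above non-binders (ROC AUC). It is not the authors' method: the abstract does not specify their models or features, so the two-number descriptors, the toy data, and the names train_logistic, predict_proba, and roc_auc are all hypothetical.

```python
import math


def predict_proba(weights, bias, x):
    """Return the predicted probability that feature vector x is a binder."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def roc_auc(scores, labels):
    """Fraction of (binder, non-binder) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each row is a made-up descriptor vector for one
# protein-ligand pair; label 1 = binder, 0 = non-binder (as from a screen).
features = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9],
            [0.2, 0.3], [0.1, 0.4], [0.3, 0.1]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(features, labels)
scores = [predict_proba(weights, bias, x) for x in features]
auc = roc_auc(scores, labels)
print("ROC AUC on the toy training set:", auc)
```

The classifier's predicted probability then serves as the score for ranking compounds in a virtual screen, which is why a ranking metric such as ROC AUC is a natural check. A real evaluation would score held-out compounds rather than the training set used here.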

Keywords: D3R; Machine learning; Protein-ligand scoring; Virtual screening.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Binding Sites
  • Computational Biology / methods*
  • HSP90 Heat-Shock Proteins / chemistry
  • Humans
  • Ligands
  • Machine Learning*
  • Molecular Docking Simulation*
  • Prospective Studies
  • Protein Binding
  • Proteins / chemistry*

Substances

  • HSP90 Heat-Shock Proteins
  • Ligands
  • Proteins