Machine learning estimates of natural product conformational energies

PLoS Comput Biol. 2014 Jan;10(1):e1003400. doi: 10.1371/journal.pcbi.1003400. Epub 2014 Jan 16.

Abstract

Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium Archangium gephyra as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures.
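
The variance-gated prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the molecular descriptor, kernel, or hyperparameters, so the squared-exponential kernel, the one-dimensional toy descriptor, the `toy_energy` function standing in for a DFT calculation, and the names `GPEnergyModel`, `energy_or_fallback`, and `var_threshold` are all assumptions made for illustration. The sketch fits a Gaussian process to previously computed energies, then accepts a prediction only when its predictive variance falls below a threshold and otherwise falls back to the expensive reference calculation.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed kernel choice)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


class GPEnergyModel:
    """Hypothetical Gaussian process regressor for conformational energies."""

    def __init__(self, length_scale=1.0, signal_var=1.0, noise_var=1e-6):
        self.length_scale = length_scale
        self.signal_var = signal_var
        self.noise_var = noise_var

    def _k(self, A, B):
        return rbf_kernel(A, B, self.length_scale, self.signal_var)

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.mean = y.mean()  # constant prior mean
        K = self._k(self.X, self.X) + self.noise_var * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} (y - mean), solved via the Cholesky factor
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y - self.mean))
        return self

    def predict(self, Xs):
        """Return predictive mean and variance for each row of Xs."""
        Xs = np.atleast_2d(np.asarray(Xs, dtype=float))
        Ks = self._k(Xs, self.X)
        mu = self.mean + Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = self.signal_var - np.sum(v**2, axis=0)
        return mu, np.maximum(var, 0.0)  # clamp small negative rounding errors


def energy_or_fallback(model, x, reference_energy, var_threshold):
    """Use the GP prediction inside the interpolation region, else recompute."""
    mu, var = model.predict(x)
    if var[0] <= var_threshold:
        return float(mu[0]), "predicted"
    return reference_energy(x), "computed"


def toy_energy(x):
    """Stand-in for an expensive reference calculation (NOT a real DFT energy)."""
    return float(np.sin(3.0 * x[0]) + 0.5 * x[0] ** 2)


# "Previous calculations": toy descriptors and their reference energies.
X_train = np.linspace(-2.0, 2.0, 25)[:, None]
y_train = np.array([toy_energy(x) for x in X_train])
model = GPEnergyModel(length_scale=0.5).fit(X_train, y_train)

# One query inside the sampled region, one far outside it.
inside = energy_or_fallback(model, np.array([0.1]), toy_energy, var_threshold=1e-2)
outside = energy_or_fallback(model, np.array([10.0]), toy_energy, var_threshold=1e-2)
print(inside, outside)
```

Lowering `var_threshold` sends more conformations to the reference calculation, which favors accuracy; raising it accepts more predictions, which favors speed. This is the trade-off the abstract refers to, shown here on toy data only.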

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenosine Triphosphatases / chemistry
  • Algorithms
  • Artificial Intelligence*
  • Chemistry, Pharmaceutical
  • Computational Biology / methods
  • Enzyme Inhibitors / chemistry*
  • Macrolides / chemistry*
  • Magnetic Resonance Spectroscopy
  • Models, Chemical
  • Molecular Dynamics Simulation
  • Molecular Structure
  • Myxococcales / metabolism
  • Normal Distribution
  • Principal Component Analysis
  • Protein Conformation
  • Software
  • Stochastic Processes
  • Thiazoles / chemistry*

Substances

  • Enzyme Inhibitors
  • Macrolides
  • Thiazoles
  • archazolid A
  • Adenosine Triphosphatases

Grants and funding

This research was supported by the Swiss National Science Foundation (grant no. 205321-134783), the Deutsche Forschungsgemeinschaft (DFG, FOR1406TP4), and the FP7 programme of the European Community (Marie Curie IEF 273039). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.