Predicting Off-Target Binding Profiles With Confidence Using Conformal Prediction

Samuel Lampa; Jonathan Alvarsson; Staffan Arvidsson Mc Shane; Arvid Berg; Ernst Ahlberg; Ola Spjuth

doi:10.3389/fphar.2018.01256

Front Pharmacol. 2018 Nov 6;9:1256. doi: 10.3389/fphar.2018.01256. eCollection 2018.

Authors

Samuel Lampa¹, Jonathan Alvarsson¹, Staffan Arvidsson Mc Shane¹, Arvid Berg¹, Ernst Ahlberg², Ola Spjuth¹

Affiliations

¹ Pharmaceutical Bioinformatics Group, Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
² Predictive Compound ADME and Safety, Drug Safety and Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.

Abstract

Ligand-based models can be used in drug discovery to obtain an early indication of potential off-target interactions that could be linked to adverse effects. Another application is to combine such models into a panel, making it possible to compare compounds and to search for compounds with similar profiles. Most contemporary methods and implementations, however, lack valid measures of confidence in their predictions and provide only point predictions. Here we describe a methodology that uses conformal prediction to predict off-target interactions, with models trained on data from 31 targets in the ExCAPE-DB dataset, selected for their utility in broad early hazard assessment. Chemicals were represented by the signature molecular descriptor, and support vector machines were used as the underlying machine learning method. With conformal prediction, predictions are returned as confidence p-values for each class. The full pre-processing and model training process is openly available as scientific workflows on GitHub, rendering it fully reproducible. We illustrate the usefulness of the developed methodology on a set of compounds extracted from DrugBank. The resulting models are published online and are available via a graphical web interface and an OpenAPI interface for programmatic access.

Keywords: adverse effects; conformal prediction; machine learning; off-target; predictive modeling; target profiles; workflow.
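
To illustrate what "confidence p-values for each class" means in the abstract, the following is a minimal sketch of class-conditional (Mondrian) conformal p-values. It is not the authors' implementation. The function names, the class labels "active" and "nonactive", and the toy nonconformity scores are all assumptions made for illustration. The scores are taken as already computed, for example from a classifier's decision values, with higher values meaning the compound looks less typical of that class.

```python
from typing import Dict, List, Set


def conformal_p_values(
    calibration_scores: Dict[str, List[float]],
    test_scores: Dict[str, float],
) -> Dict[str, float]:
    """Return one p-value per class for a single test compound.

    calibration_scores: nonconformity scores of held-out calibration
        compounds, grouped by their true class (hypothetical input).
    test_scores: nonconformity score of the test compound under each
        tentative class label.
    """
    p_values = {}
    for label, cal in calibration_scores.items():
        alpha = test_scores[label]
        # Count calibration compounds that are at least as nonconforming
        # as the test compound; the +1 terms account for the test compound.
        n_at_least = sum(1 for a in cal if a >= alpha)
        p_values[label] = (n_at_least + 1) / (len(cal) + 1)
    return p_values


def prediction_set(p_values: Dict[str, float], confidence: float) -> Set[str]:
    """Keep every class whose p-value exceeds the significance level."""
    significance = 1.0 - confidence
    return {label for label, p in p_values.items() if p > significance}


# Toy data (illustrative only, not taken from the paper).
calibration = {
    "active": [0.1, 0.2, 0.3, 0.4],
    "nonactive": [0.2, 0.5, 0.6, 0.9],
}
test = {"active": 0.15, "nonactive": 0.95}

p = conformal_p_values(calibration, test)
labels_75 = prediction_set(p, confidence=0.75)
labels_90 = prediction_set(p, confidence=0.90)
print(p, labels_75, labels_90)
```

In this toy case the compound receives a high p-value for "active" and a low one for "nonactive", so at 75% confidence only "active" remains in the prediction set, while at 90% confidence both labels are retained. Unlike a point prediction, the output makes explicit how the set of plausible labels grows as the requested confidence increases.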