Predicting drug-induced liver injury: The importance of data curation

Eleni Kotsampasakou; Floriane Montanari; Gerhard F Ecker

doi:10.1016/j.tox.2017.06.003

Toxicology. 2017 Aug 15:389:139-145. doi: 10.1016/j.tox.2017.06.003. Epub 2017 Jun 23.

Authors

Eleni Kotsampasakou¹, Floriane Montanari¹, Gerhard F Ecker²

Affiliations

¹ University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria.
² University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria.

Abstract

Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention and prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models were validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and AUC values of 59%, 71% and 66% on the three external test sets, respectively, two of which are comparable to the cross-validation result. In the study we also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, and OATP1B1 and 1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors in order to be provided to the community as a Python script.
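The evaluation protocol named in the abstract (10-fold cross-validation averaged over 50 iterations, reported as accuracy and AUC) can be illustrated with a minimal standard-library sketch. This is not the authors' script: the function names (`roc_auc`, `stratified_folds`, `repeated_cv`, `fit_centroid`), the stratified splitting, the pooling of out-of-fold scores per iteration, the 0.5 decision threshold, and the synthetic data are all assumptions made for illustration. The toy nearest-centroid scorer is only a stand-in; in the study the model is a Random Forest trained on 2D molecular descriptors.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """ROC AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Return k index lists, each with roughly the overall class ratio."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def repeated_cv(X, y, fit, k=10, repeats=50, seed=0):
    """Mean accuracy and mean AUC over `repeats` runs of stratified k-fold CV.

    `fit(X_train, y_train)` must return a function mapping one sample to a
    score where higher means "more likely class 1" (assumed interface).
    """
    rng = random.Random(seed)
    accs, aucs = [], []
    for _ in range(repeats):
        scores = [None] * len(y)
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            score = fit([X[i] for i in train], [y[i] for i in train])
            for i in fold:
                scores[i] = score(X[i])  # every sample is scored out-of-fold
        preds = [1 if s >= 0.5 else 0 for s in scores]
        accs.append(mean([int(p == t) for p, t in zip(preds, y)]))
        aucs.append(roc_auc(y, scores))  # AUC on pooled out-of-fold scores
    return mean(accs), mean(aucs)


def fit_centroid(X_train, y_train):
    """Toy stand-in model: score by relative distance to the class centroids."""
    def centroid(cls):
        rows = [x for x, t in zip(X_train, y_train) if t == cls]
        return [mean(col) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def score(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 / (d0 + d1) if d0 + d1 else 0.5

    return score


# Synthetic two-class data (hypothetical, not DILI data): one weakly
# informative feature and one pure-noise feature.
data_rng = random.Random(1)
y = [0] * 60 + [1] * 60
X = [[data_rng.gauss(t, 1.0), data_rng.gauss(0.0, 1.0)] for t in y]

acc, auc = repeated_cv(X, y, fit_centroid, k=10, repeats=5)
print(f"accuracy={acc:.2f}  AUC={auc:.2f}")
```

To approximate the setup described above, `fit` would wrap a Random Forest trained on descriptor vectors, and `repeats` would be set to 50. Averaging over repeated random splits reduces the dependence of the reported accuracy and AUC on any single partition of the data.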

Keywords: 2-class classification; Data curation; Drug-induced liver injury; Liver transporters; Random Forest; Toxicity reports.

Publication types

Validation Study

MeSH terms

Algorithms
Animals
Area Under Curve
Chemical and Drug Induced Liver Injury / etiology*
Chemical and Drug Induced Liver Injury / metabolism
Chemical and Drug Induced Liver Injury / pathology
Computer Simulation*
Data Curation*
Data Mining
Databases, Factual
Humans
Liver / drug effects*
Liver / metabolism
Liver / pathology
Membrane Transport Proteins / drug effects*
Membrane Transport Proteins / metabolism
Models, Statistical*
Reproducibility of Results
Risk Assessment
Toxicity Tests / methods*

Substances

Membrane Transport Proteins

Grants and funding

F 3502/FWF_/Austrian Science Fund FWF/Austria