miDruglikeness: Subdivisional Drug-Likeness Prediction Models Using Active Ensemble Learning Strategies

Biomolecules. 2022 Dec 23;13(1):29. doi: 10.3390/biom13010029.

Abstract

The drug development pipeline involves several stages, including in vitro assays, in vivo assays, and clinical trials. For candidate selection, it is important to consider whether a compound will successfully pass through these stages. Using graph neural networks, we developed three subdivisional models that individually predict the capacity of a compound to enter in vivo testing, to enter clinical trials, and to reach market approval. Furthermore, we proposed a strategy combining active learning and ensemble learning to improve the quality of the models. The models achieved satisfactory performance on the internal test datasets and on four self-collected external test datasets. We also employed the models as a general index to evaluate the widely used benchmark dataset DEKOIS 2.0 and, surprisingly, found that they performed strongly on virtual screening tasks. Our model system (termed miDruglikeness) provides a comprehensive drug-likeness prediction tool for drug discovery and development.
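
The abstract does not spell out how active learning and ensemble learning are combined, so the following is only a minimal sketch of one common way to pair the two ideas, not the authors' implementation. It assumes a toy one-dimensional threshold classifier in place of a graph neural network, bootstrap resampling to build the ensemble, and a query rule that labels the pool items on which the ensemble vote is closest to an even split. All names (train_stump, ensemble_vote, active_ensemble, toy_oracle) and all parameter values are hypothetical.

```python
import random
from statistics import mean


def train_stump(points):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold."""
    # Candidates: every observed x, plus +inf (meaning "always predict 0").
    candidates = [x for x, _ in points] + [float("inf")]
    best_t, best_err = None, None
    for t in candidates:
        err = sum((x >= t) != bool(y) for x, y in points)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_ensemble(labeled, members, rng):
    """Ensemble step: train each member on a bootstrap resample."""
    return [
        train_stump([rng.choice(labeled) for _ in labeled])
        for _ in range(members)
    ]


def ensemble_vote(thresholds, x):
    """Fraction of ensemble members that predict class 1 for x."""
    return mean(1.0 if x >= t else 0.0 for t in thresholds)


def active_ensemble(labeled, pool, oracle, rounds=5, batch=2, members=7, seed=0):
    """Alternate between querying uncertain pool items and refitting."""
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(pool)
    ensemble = fit_ensemble(labeled, members, rng)
    for _ in range(rounds):
        if not pool:
            break
        # Active step: most uncertain items (vote nearest 0.5) come first.
        pool.sort(key=lambda x: abs(ensemble_vote(ensemble, x) - 0.5))
        queried, pool = pool[:batch], pool[batch:]
        labeled += [(x, oracle(x)) for x in queried]
        ensemble = fit_ensemble(labeled, members, rng)
    return ensemble, labeled


def toy_oracle(x):
    """Hypothetical ground truth: class 1 when x >= 0.5."""
    return int(x >= 0.5)


seed_labels = [(0.0, 0), (0.3, 0), (0.7, 1), (1.0, 1)]
seed_xs = {x for x, _ in seed_labels}
unlabeled_pool = [i / 20 for i in range(1, 20) if i / 20 not in seed_xs]

ensemble, labeled = active_ensemble(seed_labels, unlabeled_pool, toy_oracle)
print(len(labeled), "labeled items;", "vote at x=0.9:", ensemble_vote(ensemble, 0.9))
```

In a real setting the stump would be replaced by a trained model and the oracle by experimental or curated labels; the loop structure (fit ensemble, score pool by disagreement, label a batch, refit) is the part this sketch is meant to illustrate.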

Keywords: active learning; ensemble learning; graph neural network; subdivisional drug-likeness prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking
  • Drug Development
  • Drug Discovery*
  • Machine Learning
  • Neural Networks, Computer*