Augmenting bioactivity by docking-generated multiple ligand poses to enhance machine learning and pharmacophore modelling: discovery of new TTK inhibitors as case study

Mol Inform. 2023 Jun;42(6):e2300022. doi: 10.1002/minf.202300022. Epub 2023 Jun 7.

Abstract

Dual specificity protein kinase threonine/Tyrosine kinase (TTK) is one of the mitotic kinases. High levels of TTK are detected in several types of cancer. Hence, TTK inhibition is considered a promising therapeutic anti-cancer strategy. In this work, we used multiple docked poses of TTK inhibitors to augment training data for machine learning QSAR modeling. Ligand-Receptor Contacts Fingerprints and docking scoring values were used as descriptor variables. Escalating docking-scoring consensus levels were scanned against orthogonal machine learners, and the best learners (Random Forests and XGBoost) were coupled with genetic algorithm and Shapley additive explanations (SHAP) to determine critical descriptors for predicting anti-TTK bioactivity and for pharmacophore generation. Three successful pharmacophores were deduced and subsequently used for in silico screening against the NCI database. A total of 14 hits were evaluated in vitro for their anti-TTK bioactivities. One hit of novel chemotype showed reasonable dose-response curve with experimental IC50 of 1.0 μM. The presented work indicates the validity of data augmentation using multiple docked poses for building successful machine learning models and pharmacophore hypotheses.

Keywords: QSAR; Shapley values; TTK; docking; machine learning; scoring.

MeSH terms

  • Humans
  • Ligands
  • Machine Learning
  • Neoplasms*
  • Pharmacophore*

Substances

  • Ligands

Grants and funding