Dual-specificity protein kinase threonine/tyrosine kinase (TTK) is one of the mitotic kinases. High levels of TTK are detected in several types of cancer; hence, TTK inhibition is considered a promising anticancer therapeutic strategy. In this work, we used multiple docked poses of TTK inhibitors to augment the training data for machine learning QSAR modeling. Ligand-Receptor Contacts Fingerprints and docking scoring values were used as descriptor variables. Escalating docking-scoring consensus levels were scanned against orthogonal machine learners, and the best learners (Random Forests and XGBoost) were coupled with a genetic algorithm and Shapley additive explanations (SHAP) to determine the critical descriptors for predicting anti-TTK bioactivity and for pharmacophore generation. Three successful pharmacophores were deduced and subsequently used for in silico screening against the NCI database. A total of 14 hits were evaluated in vitro for their anti-TTK bioactivities. One hit of a novel chemotype showed a reasonable dose-response curve, with an experimental IC50 of 1.0 μM. The presented work indicates the validity of data augmentation using multiple docked poses for building successful machine learning models and pharmacophore hypotheses.
Keywords: QSAR; Shapley values; TTK; docking; machine learning; scoring.
© 2023 Wiley-VCH GmbH.