Global Analysis of Deep Learning Prediction Using Large-Scale In-House Kinome-Wide Profiling Data

ACS Omega. 2022 May 23;7(22):18374-18381. doi: 10.1021/acsomega.2c00664. eCollection 2022 Jun 7.

Abstract

In drug discovery, the prediction of activity and absorption, distribution, metabolism, excretion, and toxicity parameters is one of the most important approaches in determining which compound to synthesize next. In recent years, prediction methods based on deep learning as well as non-deep learning approaches have been established, and a number of applications to drug discovery have been reported by various companies and organizations. In this research, we performed activity prediction using deep learning and non-deep learning methods on in-house assay data for several hundred kinases and compared and discussed the prediction results. We found that the prediction accuracy of the single-task graph neural network (GNN) model was generally lower than that of the non-deep learning model (LightGBM), but the multitask GNN model, which combined data from other kinases, comprehensively outperformed LightGBM. In addition, the extrapolative validity of the multitask model was verified by using it for prediction on known kinase ligands. We observed an overlap between characteristic protein-ligand interaction sites and the atoms that are important for prediction. By building appropriate models based on the conditions of the data set and analyzing the feature importance of the prediction results, a ligand-based prediction method may be used not only for activity prediction but also for drug design.