The interaction signatures of drug candidates reflect their off-target (neutral) and antitarget (negative) effects, which translate into reduced efficacy, side effects and high attrition rates. Retrospective, scaled-down virtual screening (VS) experiments based on benchmarking datasets are widely used to assess ligand enrichment for real-world problems. In recent years, unbiased benchmarking sets have become a pressing need for supporting virtual screening methodologies against emerging drug targets. To date, the available benchmarking datasets remain quite limited, and glycogen synthase kinase-3 (GSK-3) is not included in benchmarking directories such as DUD-E and MUV. Herein we introduce our in-house algorithm for building an unbiased benchmarking dataset that comprises highly selective, moderately selective and nonselective inhibitors of a significant therapeutic target, GSK-3, and that is suitable for both ligand-based and structure-based VS approaches. The datasets are unbiased in terms of physicochemical properties and topological descriptors, as shown by the mean ROC-AUC from leave-one-out cross-validation (LOO CV) and by an additional 2D similarity search. Moreover, we investigated the gradual selectivity dataset using multiple 2D similarity coefficients and distances, 3D similarity and docking. In addition to the links found between the enrichment of selective GSK-3 inhibitors and their chemical structures, we generated a database of compounds and their 3D similarity signatures, including cut-off thresholds for enhanced selectivity. Analysis of the 2D similarity space revealed that the selectivity problem cannot be evaluated appropriately with 2D similarity searching alone. The current analysis provides useful, comprehensive insights that may facilitate the knowledge-based identification of novel selective GSK-3 inhibitors.
Communicated by Ramaswamy H. Sarma.
Keywords: 3D similarity search; AUC; Docking; GSK-3; selective inhibitors.
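
The abstract refers to 2D similarity coefficients and to a mean ROC-AUC obtained by leave-one-out cross-validation. As a rough illustration of how these quantities are commonly computed, the following minimal sketch is offered; it is not the in-house algorithm described above. It assumes fingerprints are represented as sets of on-bit indices, uses the Tanimoto coefficient as the only similarity measure, and interprets LOO CV as "each active serves once as the query". The function names (`tanimoto`, `roc_auc`, `loo_mean_auc`) and the toy fingerprints are hypothetical.

```python
from typing import List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC as the probability that a random active outranks a random decoy.

    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def loo_mean_auc(actives: List[Set[int]], decoys: List[Set[int]]) -> float:
    """Mean ROC-AUC over leave-one-out queries (assumed protocol).

    Each active is held out in turn and used as the query; the remaining
    actives and all decoys are scored by Tanimoto similarity to that query.
    """
    if len(actives) < 2:
        raise ValueError("need at least two actives for leave-one-out")
    aucs = []
    for i, query in enumerate(actives):
        rest = [fp for j, fp in enumerate(actives) if j != i]
        scores = [tanimoto(query, fp) for fp in rest + decoys]
        labels = [1] * len(rest) + [0] * len(decoys)
        aucs.append(roc_auc(scores, labels))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy fingerprints, invented purely for illustration.
    toy_actives = [{1, 2, 3}, {2, 3, 4}, {1, 3, 4}]
    toy_decoys = [{7, 8, 9}, {3, 8, 9}]
    print(loo_mean_auc(toy_actives, toy_decoys))
```

In this kind of check, a mean ROC-AUC close to 0.5 means the similarity measure ranks actives no better than chance against the decoys, whereas values close to 1.0 mean the actives are trivially separable from the decoys.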