Exploring isofunctional molecules: Design of a benchmark and evaluation of prediction performance

Mol Inform. 2023 Apr;42(4):e2200216. doi: 10.1002/minf.202200216. Epub 2023 Feb 17.

Abstract

Identifying novel chemotypes whose biological activity is similar to that of a known active molecule is an important challenge in drug discovery, known as 'scaffold hopping'. Small-, medium-, and large-step scaffold hopping efforts may lead to increasing degrees of chemical structure novelty with respect to the parent compound. In the present paper, we focus on the problem of large-step scaffold hopping. We assembled a high-quality, well-characterized dataset of scaffold hopping examples comprising pairs of active molecules and covering a variety of protein targets. This dataset was used to build a benchmark that mirrors real-life applications: one active molecule is known, and the second active is searched for among a set of decoys chosen so as to avoid statistical bias. This allowed us to evaluate the performance of computational methods for solving large-step scaffold hopping problems. In particular, we assessed how difficult these problems are for classical 2D and 3D ligand-based methods. We also showed that a machine-learning chemogenomic algorithm outperforms classical methods, and we provide some hints for future improvements.
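The benchmark setting described in the abstract (one known active as the query, a second active hidden among decoys) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature sets are toy stand-ins for molecular fingerprints, the Tanimoto coefficient stands in for a classical 2D ligand-based method, and the function names `tanimoto` and `rank_of_active` are hypothetical. None of it is taken from the paper's dataset, decoy selection procedure, or evaluation code.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_of_active(query, active, decoys):
    """Return the 1-based rank of the hidden active among the decoys.

    Candidates are ordered by decreasing similarity to the query.
    Ties are counted pessimistically: a decoy that scores the same
    as the active is ranked ahead of it.
    """
    active_score = tanimoto(query, active)
    decoy_scores = [tanimoto(query, d) for d in decoys]
    return 1 + sum(score >= active_score for score in decoy_scores)


# Toy feature sets (hypothetical, not real fingerprints).
# The hidden active shares its interaction features with the query
# but has a different core, as in a large-step scaffold hop.
query = {"aromatic_ring", "amide", "hbond_donor", "halogen"}
active = {"heterocycle", "amide", "hbond_donor", "sulfonyl"}
decoys = [
    {"aromatic_ring", "halogen", "ester"},
    {"heterocycle", "nitrile"},
    {"aromatic_ring", "amide", "halogen", "ether"},
]

rank = rank_of_active(query, active, decoys)
print(f"Hidden active ranked {rank} of {len(decoys) + 1} candidates")
```

In this toy case two decoys that share the query's core features score higher than the hidden active, which is the kind of failure one would expect from a purely structure-based similarity when the two actives have different scaffolds. A rank-based summary over many such query/active pairs is one plausible way to compare methods in this setting.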

Keywords: benchmark; chemogenomics; ligand-based; molecular interactions; scaffold hopping.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Drug Discovery* / methods
  • Ligands
  • Machine Learning

Substances

  • Ligands