Application of machine learning models for property prediction to targeted protein degraders

Nat Commun. 2024 Jul 9;15(1):5764. doi: 10.1038/s41467-024-49979-3.

Abstract

Machine learning (ML) systems can model quantitative structure-property relationships (QSPR) using existing experimental data and make property predictions for new molecules. With the advent of modalities such as targeted protein degraders (TPD), the applicability of QSPR models is questioned and ML usage in TPD-centric projects remains limited. Herein, ML models are developed and evaluated to predict TPD properties, including passive permeability, metabolic clearance, cytochrome P450 inhibition, plasma protein binding, and lipophilicity. Interestingly, performance on TPDs is comparable to that on other modalities. Predictions often yield lower errors for glues and higher errors for heterobifunctionals. For permeability, CYP3A4 inhibition, and human and rat microsomal clearance, misclassification errors into high and low risk categories are below 4% for glues and below 15% for heterobifunctionals. For all modalities, misclassification errors range from 0.8% to 8.1%. Investigated transfer learning strategies improve predictions for heterobifunctionals. This is the first comprehensive evaluation of ML for the prediction of absorption, distribution, metabolism, and excretion (ADME) and physicochemical properties of TPD molecules, including heterobifunctional and molecular glue sub-modalities. Taken together, our investigations show that ML-based QSPR models are applicable to TPDs and support ML usage in TPD design, to potentially accelerate drug discovery.
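
The misclassification errors quoted above can be illustrated with a minimal sketch. The abstract does not state how the risk categories are defined, so everything below is an assumption made for illustration: the three-bin scheme, the cut-off values, the toy data, and the function names `risk_category` and `high_low_misclassification_rate` are hypothetical and are not taken from the paper. The sketch counts a compound as misclassified only when its measured and predicted values fall into opposite extreme bins (high versus low).

```python
from typing import Sequence


def risk_category(value: float, low_cut: float, high_cut: float) -> str:
    """Assign a value to a 'low', 'medium', or 'high' bin.

    The cut-offs are hypothetical; whether a high value means high or
    low risk depends on the endpoint being modeled.
    """
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "medium"


def high_low_misclassification_rate(
    measured: Sequence[float],
    predicted: Sequence[float],
    low_cut: float,
    high_cut: float,
) -> float:
    """Fraction of compounds whose measured and predicted values land in
    opposite extreme bins (one 'low', the other 'high')."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if len(measured) == 0:
        raise ValueError("at least one compound is required")

    flips = 0
    for m, p in zip(measured, predicted):
        bins = {risk_category(m, low_cut, high_cut),
                risk_category(p, low_cut, high_cut)}
        if bins == {"low", "high"}:
            flips += 1
    return flips / len(measured)


# Toy values, invented for illustration only (not data from the study).
measured = [0.5, 3.0, 1.5, 0.2, 2.8]
predicted = [0.7, 0.9, 1.4, 2.5, 2.6]
rate = high_low_misclassification_rate(measured, predicted,
                                       low_cut=1.0, high_cut=2.0)
print(f"high/low misclassification rate: {rate:.1%}")
```

Under this assumed definition, predictions that land in the neighboring middle bin are not counted as errors, which is one plausible reason a rate of this kind can stay low even when regression errors are sizable.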

MeSH terms

  • Animals
  • Cytochrome P-450 CYP3A / chemistry
  • Cytochrome P-450 CYP3A / metabolism
  • Humans
  • Machine Learning*
  • Permeability
  • Protein Binding
  • Proteolysis
  • Quantitative Structure-Activity Relationship
  • Rats

Substances

  • Cytochrome P-450 CYP3A