Identifying compound-protein interactions with knowledge graph embedding of perturbation transcriptomics

Cell Genom. 2024 Oct 9;4(10):100655. doi: 10.1016/j.xgen.2024.100655. Epub 2024 Sep 19.

Abstract

The emergence of perturbation transcriptomics provides a new perspective for drug discovery, but existing analysis methods suffer from inadequate performance and limited applicability. In this work, we present PertKGE, a method designed to deconvolute compound-protein interactions from perturbation transcriptomics with knowledge graph embedding. By considering multi-level regulatory events within biological systems that share the same semantic context, PertKGE significantly improves deconvolution accuracy in two critical "cold-start" settings: inferring targets for new compounds and conducting virtual screening for new targets. We further demonstrate the pivotal role of incorporating multi-level regulatory events in alleviating representational biases. Notably, PertKGE enables the identification of ectonucleotide pyrophosphatase/phosphodiesterase-1 as the target responsible for the unique anti-tumor immunotherapy effect of tankyrase inhibitor K-756 and the discovery of five novel hits targeting the emerging cancer therapeutic target aldehyde dehydrogenase 1B1 with a remarkable hit rate of 10.2%. These findings highlight the potential of PertKGE to accelerate drug discovery.
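
To make the idea of ranking compound-protein interactions with knowledge graph embedding concrete, the following is a minimal sketch of how an embedding model can score a (compound, relation, protein) triple and sort candidate targets. The abstract does not state which scoring function PertKGE uses, so the DistMult-style trilinear score here is an assumption chosen for illustration. The function names, the relation `interacts_with`, the protein labels, and every embedding value are hypothetical toy values, not learned from data.

```python
# Toy sketch: rank candidate protein targets for one compound using a
# DistMult-style knowledge graph embedding score.
# Assumption: this scoring function is a generic stand-in; it is not taken
# from the PertKGE paper. All vectors below are made-up numbers.

def distmult_score(head, relation, tail):
    """Trilinear score sum_i head[i] * relation[i] * tail[i] (higher = more plausible)."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def rank_targets(compound_vec, relation_vec, protein_vecs):
    """Return protein names sorted by descending triple score."""
    scores = {
        name: distmult_score(compound_vec, relation_vec, vec)
        for name, vec in protein_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 3-dimensional embeddings for one compound, one relation,
# and three candidate proteins.
compound = [1.0, 0.5, -0.5]
interacts_with = [1.0, 1.0, 1.0]
proteins = {
    "protein_A": [0.9, 0.4, -0.6],
    "protein_B": [-0.8, 0.1, 0.7],
    "protein_C": [0.1, 0.0, 0.2],
}

ranking = rank_targets(compound, interacts_with, proteins)
print(ranking)
```

In a trained model the embeddings would be learned from the triples in the knowledge graph, and the same ranking step could be run in either direction: over proteins for a fixed compound (target inference) or over compounds for a fixed protein (virtual screening).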

Keywords: compound-protein interaction; drug discovery; knowledge graph embedding; machine learning; perturbation transcriptomics; target inference; virtual screening.

MeSH terms

  • Antineoplastic Agents / pharmacology
  • Antineoplastic Agents / therapeutic use
  • Drug Discovery / methods
  • Gene Expression Profiling / methods
  • Humans
  • Phosphoric Diester Hydrolases / genetics
  • Phosphoric Diester Hydrolases / metabolism
  • Tankyrases / antagonists & inhibitors
  • Tankyrases / genetics
  • Tankyrases / metabolism
  • Transcriptome*

Substances

  • Tankyrases
  • Phosphoric Diester Hydrolases
  • Antineoplastic Agents