Semi-supervised meta-learning elucidates understudied molecular interactions

Commun Biol. 2024 Sep 9;7(1):1104. doi: 10.1038/s42003-024-06797-z.

Abstract

Many biological problems are understudied due to experimental limitations and human biases. Although deep learning is promising in accelerating scientific discovery, its power is compromised when it is applied to problems with scarce labeled data and data distribution shifts. We develop a deep learning framework, Meta Model Agnostic Pseudo Label Learning (MMAPLE), to address these challenges by effectively exploring out-of-distribution (OOD) unlabeled data when conventional transfer learning fails. The uniqueness of MMAPLE lies in integrating the concepts of meta-learning, transfer learning, and semi-supervised learning into a unified framework. The power of MMAPLE is demonstrated in three applications in an OOD setting where chemicals or proteins in unseen data are dramatically different from those in training data: predicting drug-target interactions, hidden human metabolite-enzyme interactions, and understudied interspecies microbiome metabolite-human receptor interactions. MMAPLE achieves 11% to 242% improvement in the prediction-recall on multiple OOD benchmarks over various base models. Using MMAPLE, we reveal novel interspecies metabolite-protein interactions that are validated by activity assays and fill in missing links in microbiome-human interactions. MMAPLE is a general framework for exploring previously unrecognized biological domains beyond the reach of present experimental and computational techniques.

MeSH terms

  • Computational Biology / methods
  • Deep Learning
  • Humans
  • Microbiota
  • Supervised Machine Learning*