Non-Negative Matrix Tri-Factorization for Representation Learning in Multi-Omics Datasets with Applications to Drug Repurposing and Selection

Letizia Messa; Carolina Testa; Stephana Carelli; Federica Rey; Emanuela Jacchetti; Cristina Cereda; Manuela Teresa Raimondi; Stefano Ceri; Pietro Pinoli

doi:10.3390/ijms25179576

Int J Mol Sci. 2024 Sep 4;25(17):9576. doi: 10.3390/ijms25179576.

Authors

Letizia Messa¹, Carolina Testa¹, Stephana Carelli²,³, Federica Rey³, Emanuela Jacchetti⁴, Cristina Cereda², Manuela Teresa Raimondi⁴, Stefano Ceri¹, Pietro Pinoli¹

Affiliations

¹ Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milan, Italy.
² Center of Functional Genomics and Rare Diseases, Buzzi Children's Hospital, 20154 Milan, Italy.
³ Pediatric Clinical Research Center "Fondazione Romeo ed Enrica Invernizzi", Department of Biomedical and Clinical Sciences, Università degli Studi di Milano, 20157 Milan, Italy.
⁴ Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, 20133 Milan, Italy.

Abstract

The vast corpus of heterogeneous biomedical data stored in databases, ontologies, and terminologies presents a unique opportunity for drug design. Integrating and fusing these sources is essential to develop data representations that can be analyzed using artificial intelligence methods to generate novel drug candidates or hypotheses. Here, we propose Non-Negative Matrix Tri-Factorization as a tool for integrating and fusing data, as well as for representation learning. We also demonstrate how the representations it learns can be used effectively by traditional artificial intelligence methods. Although the approach is domain-agnostic and applicable to any field with large amounts of structured and semi-structured data, we apply it specifically to computational pharmacology and drug repurposing, a field poised to benefit significantly from artificial intelligence, particularly in personalized medicine. We conducted extensive experiments to evaluate the proposed method, and its results compared favorably with those of traditional methods. Novel drug-target predictions were also corroborated by evidence in the literature. Additionally, we tested the method on predicting drug synergism, a task for which constructing a classical matrix dataset is challenging. The method proved flexible, suggesting its applicability to a wide range of tasks in drug design and discovery.

Keywords: data integration; drug repurposing; drug selection; machine learning; personalized medicine; representation learning.
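
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of Non-Negative Matrix Tri-Factorization on a single relation matrix. It is not the authors' implementation: the abstract gives no algorithmic details, so the sketch assumes the standard Frobenius-norm objective with multiplicative updates and no orthogonality constraints, approximating a non-negative matrix R as G1 · S · G2ᵀ. The function names (`nmtf`, `relative_error`), the ranks, and the toy drug-target matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def nmtf(R, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix R (n x m) as G1 @ S @ G2.T.

    Assumption: plain multiplicative updates for the squared Frobenius
    error, with no orthogonality constraints on the factors.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    # Random strictly positive initialization of the three factors.
    G1 = rng.random((n, k1)) + eps   # row-entity (e.g. drug) embeddings
    S = rng.random((k1, k2)) + eps   # compact interaction matrix
    G2 = rng.random((m, k2)) + eps   # column-entity (e.g. target) embeddings
    for _ in range(n_iter):
        # Each factor is rescaled by a non-negative ratio, so it stays >= 0.
        G1 *= (R @ G2 @ S.T) / (G1 @ S @ G2.T @ G2 @ S.T + eps)
        G2 *= (R.T @ G1 @ S) / (G2 @ S.T @ G1.T @ G1 @ S + eps)
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
    return G1, S, G2


def relative_error(R, G1, S, G2):
    """Frobenius reconstruction error relative to the norm of R."""
    return np.linalg.norm(R - G1 @ S @ G2.T) / np.linalg.norm(R)


# Hypothetical toy association matrix (rows: drugs, columns: targets).
# The values are made up for illustration and are not from the paper.
R = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
], dtype=float)

G1, S, G2 = nmtf(R, k1=2, k2=2)
scores = G1 @ S @ G2.T  # reconstructed association scores

# Unobserved (zero) entries ranked by reconstructed score: the highest
# ones would be the candidate new links in a repurposing-style setting.
candidates = [(i, j, scores[i, j]) for i, j in zip(*np.where(R == 0))]
candidates.sort(key=lambda t: t[2], reverse=True)
```

In this sketch the rows of G1 and G2 serve as learned low-dimensional representations of the row and column entities, which could in principle be passed to a downstream classifier. How the paper combines multiple data sources and uses the resulting representations is described in the full text, not in this record.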

MeSH terms

Algorithms
Artificial Intelligence
Computational Biology / methods
Drug Discovery / methods
Drug Repositioning* / methods
Humans
Machine Learning
Multiomics

Grants and funding

This paper was supported by the PNRR-PE-AI FAIR project funded by the NextGeneration EU program.