The vast corpus of heterogeneous biomedical data stored in databases, ontologies, and terminologies presents a unique opportunity for drug design. Integrating and fusing these sources is essential to develop data representations that can be analyzed using artificial intelligence methods to generate novel drug candidates or hypotheses. Here, we propose Non-Negative Matrix Tri-Factorization as an invaluable tool for integrating and fusing data, as well as for representation learning. Additionally, we demonstrate how representations learned by Non-Negative Matrix Tri-Factorization can effectively be utilized by traditional artificial intelligence methods. While this approach is domain-agnostic and applicable to any field with vast amounts of structured and semi-structured data, we apply it specifically to computational pharmacology and drug repurposing. This field is poised to benefit significantly from artificial intelligence, particularly in personalized medicine. We conducted extensive experiments to evaluate the performance of the proposed method, yielding exciting results, particularly compared to traditional methods. Novel drug-target predictions have also been validated in the literature, further confirming their validity. Additionally, we tested our method to predict drug synergism, where constructing a classical matrix dataset is challenging. The method demonstrated great flexibility, suggesting its applicability to a wide range of tasks in drug design and discovery.
Keywords: data integration; drug repurposing; drug selection; machine learning; personalized medicine; representation learning.