Proteins associated with multiple diseases often interact, forming disease modules that are critical for understanding disease mechanisms. This study integrates two biological sources of information, protein-protein interactions (PPIs) and Gene Ontology data, using non-negative matrix factorization (NMF) to identify gene modules associated with human diseases and to find connections between novel genes and diseases. Each data source is first converted into a network, and each network is then clustered to obtain modules. The two types of modules are integrated through an NMF-based technique to obtain a set of meta-modules that preserve the essential interaction patterns and functional similarity information among the proteins/genes. Each meta-module is labeled based on its statistical and biological properties, and a multi-label classification technique is employed to assign new disease labels to genes. We identified 3,131 gene-disease associations, validated through literature review, Gene Ontology analysis, and pathway analysis.
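As a rough illustration of the integration step, the sketch below factorizes a combined gene-by-module membership matrix with NMF and reads off one meta-module per gene. It is not the authors' implementation: the side-by-side concatenation of PPI-derived and Gene Ontology-derived module memberships, the Lee-Seung multiplicative updates, the rank of 2, the toy data, and all names (`nmf`, `ppi_modules`, `go_modules`, `meta_module`) are assumptions made only for this example.

```python
import numpy as np


def nmf(X, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix X as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    This is a generic NMF, assumed here for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: 6 genes, binary membership in 2 PPI-derived modules
# and 2 GO-derived modules (rows are genes, columns are modules).
ppi_modules = np.array(
    [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float
)
go_modules = np.array(
    [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1]], dtype=float
)

# Assumed integration scheme: place both membership matrices side by side
# so that each gene is described by all of its module memberships.
X = np.hstack([ppi_modules, go_modules])

# W holds gene-to-meta-module weights; H holds meta-module-to-module weights.
W, H = nmf(X, rank=2)

# Assign each gene to the meta-module with the largest weight.
meta_module = W.argmax(axis=1)
print(meta_module)
```

In this toy setup, each row of `W` can be read as a soft assignment of a gene to meta-modules, and each row of `H` shows which PPI and GO modules a meta-module draws on. A subsequent multi-label classifier would take features such as the rows of `W` as input; that step is omitted here.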
Copyright: © 2024 Alberuni, Ray. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.