A review of feature selection strategies utilizing graph data structures and Knowledge Graphs

Sisi Shao; Pedro Henrique Ribeiro; Christina M Ramirez; Jason H Moore

doi:10.1093/bib/bbae521

Brief Bioinform. 2024 Sep 23;25(6):bbae521. doi: 10.1093/bib/bbae521.

Authors

Sisi Shao¹, Pedro Henrique Ribeiro², Christina M Ramirez¹, Jason H Moore¹,²

Affiliations

¹ Department of Biostatistics, Fielding School of Public Health at University of California, Los Angeles, 650 Charles E Young Dr S, Los Angeles, CA 90095-1772, United States.
² Department of Computational Biomedicine, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Los Angeles, CA 90048, United States.

Abstract

Feature selection in Knowledge Graphs (KGs) is increasingly utilized in diverse domains, including biomedical research, Natural Language Processing (NLP), and personalized recommendation systems. This paper delves into the methodologies for feature selection (FS) within KGs, emphasizing their roles in enhancing machine learning (ML) model efficacy, hypothesis generation, and interpretability. Through this comprehensive review, we aim to catalyze further innovation in FS for KGs, paving the way for more insightful, efficient, and interpretable analytical models across various domains. Our exploration reveals the critical importance of scalability, accuracy, and interpretability in FS techniques, advocating for the integration of domain knowledge to refine the selection process. We highlight the burgeoning potential of multi-objective optimization and interdisciplinary collaboration in advancing KG FS, underscoring the transformative impact of such methodologies on precision medicine, among other fields. The paper concludes by charting future directions, including the development of scalable, dynamic FS algorithms and the integration of explainable AI principles to foster transparency and trust in KG-driven models.

Keywords: Knowledge Graphs; deep learning; explainable AI; feature selection; precision medicine.

Publication types

Review

MeSH terms

Algorithms*
Humans
Machine Learning*
Natural Language Processing*

Grants and funding

U01AG066833/NH/NIH HHS/United States