Multi‑label classification of biomedical data

Io Diakou; Eddie Iliopoulos; Eleni Papakonstantinou; Konstantina Dragoumani; Christos Yapijakis; Costas Iliopoulos; Demetrios A Spandidos; George P Chrousos; Elias Eliopoulos; Dimitrios Vlachakis

doi:10.3892/mi.2024.192

Med Int (Lond). 2024 Sep 9;4(6):68. doi: 10.3892/mi.2024.192. eCollection 2024 Nov-Dec.

Affiliations

¹ Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece.
² University Research Institute of Maternal and Child Health and Precision Medicine, National and Kapodistrian University of Athens, 'Aghia Sophia' Children's Hospital, 11527 Athens, Greece.
³ School of Informatics, Faculty of Natural and Mathematical Sciences, King's College London, London WC2R 2LS, UK.
⁴ Laboratory of Clinical Virology, School of Medicine, University of Crete, 71003 Heraklion, Greece.

Abstract

Biomedical datasets constitute a rich source of information, containing multivariate data collected during medical practice. Despite inherent challenges, such as missing or imbalanced data, these datasets are increasingly used as a basis for constructing predictive machine-learning models. Predicting disease outcomes and complications could inform decision-making in the hospital setting and help ensure the best possible patient management according to each patient's features. Multi-label classification algorithms, which are trained to assign a set of labels to each input sample, are well suited to outcome prediction tasks. Myocardial infarction (MI) represents a widespread health risk, accounting for a significant portion of heart disease-related mortality. Moreover, the danger of complications arising in patients with MI during hospitalization underlines the need for systems that efficiently assess the risks these patients face. To demonstrate the role of machine-learning methods in addressing medical challenges, the present study evaluated a set of multi-label classifiers on a public dataset of MI-related complications, predicting the outcomes of hospitalized patients with MI from a set of input patient features. Such methods can be scaled through the use of larger datasets of patient records and fine-tuned for specific patient subgroups or regional populations to improve performance. Overall, a prediction system based on classifiers trained on patient records may assist healthcare professionals in providing personalized care and efficient monitoring of high-risk patient subgroups.
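
To make the idea of assigning a set of labels to each input sample concrete, the following is a minimal sketch of one common multi-label strategy, binary relevance, in which an independent binary classifier is fitted per label. It is not the method evaluated in the study: the nearest-centroid base classifier, the two features (age, systolic blood pressure), the two complication labels and all data values are hypothetical and chosen only for illustration. The Hamming loss shown is one standard multi-label metric and is likewise an assumption, not a metric taken from the abstract.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class BinaryRelevance:
    """One independent nearest-centroid classifier per label.

    Assumes every label has at least one positive and one negative
    training sample; real data would also need scaling and imputation.
    """

    def fit(self, X, Y):
        self.centroids = []
        for j in range(len(Y[0])):
            pos = [x for x, y in zip(X, Y) if y[j] == 1]
            neg = [x for x, y in zip(X, Y) if y[j] == 0]
            self.centroids.append((centroid(neg), centroid(pos)))
        return self

    def predict(self, X):
        # A label is switched on when the sample lies closer to that
        # label's positive centroid than to its negative centroid.
        return [
            [int(sq_dist(x, pos) < sq_dist(x, neg)) for neg, pos in self.centroids]
            for x in X
        ]


def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label assignments that are wrong."""
    pairs = [(t, p) for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp)]
    return sum(t != p for t, p in pairs) / len(pairs)


# Hypothetical toy data: features are [age, systolic_bp];
# labels are [complication_a, complication_b].
X_train = [[45, 120], [50, 125], [70, 160], [75, 165], [72, 118], [48, 158]]
Y_train = [[0, 0], [0, 0], [1, 1], [1, 1], [1, 0], [0, 1]]

model = BinaryRelevance().fit(X_train, Y_train)
preds = model.predict([[74, 162], [46, 119]])
print(preds)
print(hamming_loss([[1, 1], [0, 0]], preds))
```

Binary relevance ignores dependencies between labels. Approaches that model such dependencies, for example classifier chains or the label graphs listed among the keywords, may be better suited when complications tend to co-occur.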

Keywords: biomedical datasets; complication prediction; label graph; multi-label classification; myocardial infarction; precision medicine.

Grants and funding

The authors would like to acknowledge funding from the following: i) AdjustEBOVGP-Dx (RIA2018EF-2081): Biochemical Adjustments of native EBOV Glycoprotein in Patient Sample to Unmask target Epitopes for Rapid Diagnostic Testing, a European and Developing Countries Clinical Trials Partnership (EDCTP2) project under the Horizon 2020 ‘Research and Innovation Actions’ DESCA; and ii) ‘MilkSafe: A novel pipeline to enrich formula milk using omics technologies’, a research project co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (project code: T2EDK-02222).