Fair and explainable Myocardial Infarction (MI) prediction: Novel strategies for feature selection and class imbalance correction

Comput Biol Med. 2025 Jan:184:109413. doi: 10.1016/j.compbiomed.2024.109413. Epub 2024 Nov 29.

Abstract

The rising incidence of myocardial infarction (MI), often affecting individuals without traditional risk factors, highlights the urgent need for improved early detection using personal health data. However, health surveys and electronic health records (EHRs) frequently suffer from class imbalance, which biases predictions and creates gaps between specificity and sensitivity; this hinders reliable model development despite the valuable insights these datasets contain. To address this, we introduce a novel approach to enhance MI risk prediction using self-reported attributes from the Behavioral Risk Factor Surveillance System (BRFSS) and National Health Interview Survey (NHIS) datasets. Our approach incorporates three innovative techniques: the Dual-Path Artificial Neural Network (DP-ANN) to mitigate biased decision making on imbalanced datasets, Triple Criteria Selection (TCS) for unbiased feature selection, and Minority Weighted Sampling (MWS) to address uncontrolled minority class sampling. Together, these methods enhance MI prediction and feature relevance. The DP-ANN model achieved balanced performance, with an average specificity of 80%, sensitivity of 82%, and AUC-ROC of 89.5%, improving imbalance variance by approximately 14.96% compared to prior studies. By outperforming other models across four heavily imbalanced datasets, our approach demonstrates robustness and generalizability. Additionally, SHapley Additive exPlanations (SHAP) analysis revealed key predictors and risk factors for MI, such as coronary heart disease and bronchitis in females, and stroke among individuals aged 35-54. In conclusion, our study provides a robust model that healthcare professionals can use to assess MI risk through targeted factors, promoting early detection and potentially improving patient outcomes.
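To illustrate why specificity and sensitivity are reported separately on imbalanced data, the following minimal sketch computes both from binary labels. It is not the DP-ANN, TCS, or MWS method. The toy labels, the two hypothetical classifiers, and the function names are assumptions invented for illustration and do not come from the BRFSS or NHIS data.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP for binary labels (1 = MI, 0 = no MI)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); NaN if a class is absent."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy data: 10 positive cases among 100 respondents.
y_true = [1] * 10 + [0] * 90

# Hypothetical classifier A: always predicts "no MI".
always_negative = [0] * 100

# Hypothetical classifier B: finds 8 of 10 positives, raises 18 false alarms.
more_balanced = [1] * 8 + [0] * 2 + [1] * 18 + [0] * 72

for name, y_pred in [("A", always_negative), ("B", more_balanced)]:
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"sensitivity={sens:.2f} specificity={spec:.2f} "
          f"gap={abs(sens - spec):.2f}")
```

In this toy setup, classifier A has the higher accuracy even though it detects no positive cases, while classifier B has lower accuracy but matching sensitivity and specificity. This is the kind of gap that a balanced model aims to close.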

Keywords: Behavioral Risk Factor Surveillance System (BRFSS); Explainable AI (XAI); Imbalance correction; Myocardial Infarction (MI); National Health Interview Survey (NHIS).

MeSH terms

  • Adult
  • Aged
  • Electronic Health Records
  • Female
  • Humans
  • Male
  • Middle Aged
  • Myocardial Infarction*
  • Neural Networks, Computer*
  • Risk Factors