Fairness gaps in machine learning models for hospitalization and emergency department visit risk prediction in home healthcare patients with heart failure

Int J Med Inform. 2024 Nov:191:105534. doi: 10.1016/j.ijmedinf.2024.105534. Epub 2024 Jun 30.

Abstract

Objectives: This study evaluates the fairness of machine learning (ML) models that predict hospitalization and emergency department (ED) visits in heart failure patients receiving home healthcare. We analyze biases, assess performance disparities, and propose solutions to improve model performance in diverse subpopulations.

Methods: The study used a dataset of 12,189 episodes of home healthcare collected between 2015 and 2017, including structured data (e.g., a standard assessment tool) and unstructured data (i.e., clinical notes). ML risk prediction models, including Light Gradient-Boosting Machine (LightGBM) and AutoGluon, were developed using demographic information, vital signs, comorbidities, service utilization data, and the area deprivation index (ADI) associated with the patient's home address. Fairness metrics, such as Equal Opportunity, Predictive Equality, Predictive Parity, and Statistical Parity, were calculated to evaluate model performance across subpopulations.
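
The abstract names the four fairness metrics without defining them. As a minimal sketch, assuming the definitions commonly used in the fairness literature (Equal Opportunity as the true positive rate, Predictive Equality as the false positive rate, Predictive Parity as the positive predictive value, and Statistical Parity as the rate of positive predictions), the per-subgroup values could be computed as below. The function name `fairness_by_group`, the toy labels, and the subgroup names "A" and "B" are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fairness_by_group(y_true, y_pred, groups):
    """Return per-subgroup rates for four common fairness metrics.

    Assumed definitions (not confirmed by the abstract):
      equal_opportunity   = TP / (TP + FN)   (true positive rate)
      predictive_equality = FP / (FP + TN)   (false positive rate)
      predictive_parity   = TP / (TP + FP)   (positive predictive value)
      statistical_parity  = (TP + FP) / N    (share predicted positive)
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f": prediction correct or not; "p"/"n": predicted class
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    result = {}
    for g, c in counts.items():
        n = sum(c.values())
        result[g] = {
            "equal_opportunity": ratio(c["tp"], c["tp"] + c["fn"]),
            "predictive_equality": ratio(c["fp"], c["fp"] + c["tn"]),
            "predictive_parity": ratio(c["tp"], c["tp"] + c["fp"]),
            "statistical_parity": ratio(c["tp"] + c["fp"], n),
        }
    return result


# Toy data for two hypothetical subgroups
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

metrics = fairness_by_group(y_true, y_pred, groups)
for name, values in metrics.items():
    print(name, values)
```

Comparing each metric across subgroups, rather than inspecting one pooled score, is what exposes disparities of the kind the study reports.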

Results: Our study revealed significant disparities in model performance across diverse demographic subgroups. For example, the Hispanic, Male, High-ADI subgroup had the highest Equal Opportunity value, 0.825, which was 28% higher than that of the lowest-performing subgroup (Other, Female, Low-ADI), which scored 0.644. The gap between the highest- and lowest-performing groups was 29% in Predictive Parity, 45% in Predictive Equality, and 69% in Statistical Parity.
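
The abstract does not state how the percentage gaps were computed. The Equal Opportunity figures are consistent with a relative difference measured against the lowest-performing subgroup, which the following sketch checks under that assumption. The helper name `relative_gap` is hypothetical.

```python
def relative_gap(highest, lowest):
    """Relative difference of the highest value over the lowest value."""
    return (highest - lowest) / lowest


# Equal Opportunity values reported in the abstract
gap = relative_gap(0.825, 0.644)
print(f"{gap:.1%}")  # prints 28.1%
```

The same calculation cannot be repeated for the other three metrics, because the abstract reports only their gaps and not the underlying subgroup values.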

Discussion and conclusion: The findings highlight substantial differences in fairness metrics across diverse patient subpopulations in ML risk prediction models for heart failure patients receiving home healthcare services. Ongoing monitoring and improvement of fairness metrics are essential to mitigate biases.

Keywords: Bias; Healthcare Disparities; Heart Failure; Home Care Services; Machine Learning; Socioeconomic Factors.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Emergency Room Visits
  • Emergency Service, Hospital* / statistics & numerical data
  • Female
  • Heart Failure* / therapy
  • Home Care Services* / statistics & numerical data
  • Hospitalization* / statistics & numerical data
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Risk Assessment