Accurate prediction of all-cause mortality in patients with metabolic dysfunction-associated steatotic liver disease using electronic health records

Ann Hepatol. 2024 Sep-Oct;29(5):101528. doi: 10.1016/j.aohep.2024.101528. Epub 2024 Jul 4.

Abstract

Introduction and objectives: Despite the huge clinical burden of metabolic dysfunction-associated steatotic liver disease (MASLD), validated tools for early risk stratification are lacking. Heterogeneous disease expression and a highly variable rate of progression to clinical outcomes result in prognostic uncertainty. We aimed to investigate longitudinal electronic health record-based outcome prediction in MASLD using a state-of-the-art machine learning model.

Patients and methods: Data from 940 patients with histologically defined MASLD were used to develop a deep-learning model for all-cause mortality prediction. Patient timelines, spanning 12 years, were fully annotated with demographic/clinical characteristics, ICD-9 and ICD-10 codes, blood test results, prescribing data, and secondary care activity. A Transformer neural network (TNN) was trained to output concomitant probabilities of 12-, 24-, and 36-month all-cause mortality. In-sample performance was assessed using 5-fold cross-validation. Out-of-sample performance was assessed in an independent set of 528 patients with MASLD.
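The multi-horizon targets and the 5-fold split described above can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names, the handling of patients whose follow-up ends before a horizon (marked as unknown), and the unstratified shuffle are all assumptions made for illustration, since the abstract does not describe how labels or folds were constructed.

```python
import random

# Prediction horizons named in the abstract, in months.
HORIZONS = (12, 24, 36)


def horizon_labels(months_to_death, months_followed):
    """Derive one binary mortality label per horizon for a single patient.

    months_to_death: months from the index date to death, or None if no
        death was recorded (hypothetical input format).
    months_followed: total months of observed follow-up.
    Returns {horizon: 1 (died by horizon), 0 (alive at horizon),
             None (follow-up too short to know)}.
    """
    labels = {}
    for h in HORIZONS:
        if months_to_death is not None and months_to_death <= h:
            labels[h] = 1
        elif months_followed >= h:
            labels[h] = 0
        else:
            labels[h] = None  # assumption: treat as unknown, not as survival
    return labels


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


# A patient who died at 18 months is negative at 12 months, positive after.
print(horizon_labels(18, 18))  # {12: 0, 24: 1, 36: 1}

# With 940 patients, each of the 5 folds holds out 188 for validation.
for train, val in k_fold_indices(940, k=5):
    print(len(train), len(val))
```

A real analysis would likely stratify folds by outcome and use a principled treatment of censoring; both are left out here to keep the sketch short.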

Results: In-sample, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.74-0.90 (95% CI: 0.72-0.94), sensitivity 64%-82%, specificity 75%-92%, and positive predictive value (PPV) 94%-98%. In out-of-sample validation, the model achieved AUROC 0.70-0.86 (95% CI: 0.67-0.90), sensitivity 69%-70%, specificity 96%-97%, and PPV 75%-77%. Key predictive factors, identified using coefficients of determination, were age, presence of type 2 diabetes, and history of hospital admissions with length of stay >14 days.
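For readers less familiar with the metrics reported above, the following Python sketch shows how sensitivity, specificity, PPV, and AUROC are computed from predicted probabilities and observed outcomes at a single horizon. It is a minimal illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy data are assumptions, and the abstract does not state how thresholds or confidence intervals were derived.

```python
def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity and PPV at a given probability threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # deaths correctly flagged
        "specificity": _ratio(tn, tn + fp),  # survivors correctly cleared
        "ppv": _ratio(tp, tp + fp),          # flagged patients who died
    }


def auroc(y_true, scores):
    """AUROC as the probability that a random positive case outranks a
    random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return _ratio(wins, len(pos) * len(neg))


# Toy data (invented for illustration): 1 = died within the horizon.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]

print(binary_metrics(y_true, scores))
print(auroc(y_true, scores))
```

Unlike AUROC, the sensitivity, specificity, and PPV values depend on the chosen threshold, and PPV also depends on how common the outcome is in the evaluated cohort, which is one reason it can differ between in-sample and out-of-sample sets.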

Conclusions: A TNN, applied to routinely collected longitudinal electronic health records, achieved good performance in prediction of 12-, 24-, and 36-month all-cause mortality in patients with MASLD. Extrapolation of our technique to population-level data will enable scalable and accurate risk stratification to identify people most likely to benefit from anticipatory health care and personalized interventions.

Keywords: Artificial intelligence; Deep learning; Electronic health records; Metabolic dysfunction-associated steatotic liver disease; Prognostic model.

MeSH terms

  • Adult
  • Aged
  • Cause of Death
  • Deep Learning
  • Electronic Health Records*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Non-alcoholic Fatty Liver Disease / diagnosis
  • Non-alcoholic Fatty Liver Disease / mortality
  • Predictive Value of Tests
  • Prognosis
  • Retrospective Studies
  • Risk Assessment
  • Risk Factors