Development and validation of a machine learning-based approach to identify high-risk diabetic cardiomyopathy phenotype

Eur J Heart Fail. 2024 Oct;26(10):2183-2192. doi: 10.1002/ejhf.3443. Epub 2024 Sep 6.

Abstract

Aims: Abnormalities in specific echocardiographic parameters and cardiac biomarkers have been reported among individuals with diabetes. However, a comprehensive characterization of diabetic cardiomyopathy (DbCM), a subclinical stage of myocardial abnormalities that precede the development of clinical heart failure (HF), is lacking. In this study, we developed and validated a machine learning-based clustering approach to identify the high-risk DbCM phenotype based on echocardiographic and cardiac biomarker parameters.

Methods and results: Among individuals with diabetes from the Atherosclerosis Risk in Communities (ARIC) cohort who were free of cardiovascular disease and other potential aetiologies of cardiomyopathy (training, n = 1199), unsupervised hierarchical clustering was performed using echocardiographic parameters and cardiac biomarkers of neurohormonal stress and chronic myocardial injury (25 variables in total). The high-risk DbCM phenotype was identified based on the incidence of HF on follow-up. A deep neural network (DeepNN) classifier was developed in the ARIC training cohort to predict DbCM and was validated in an external community-based cohort (Cardiovascular Health Study [CHS]; n = 802) and an electronic health record (EHR) cohort (n = 5071). Clustering identified three phenogroups in the derivation cohort. Phenogroup 3 (n = 324, 27% of the cohort) had significantly higher 5-year HF incidence than the other phenogroups (12.1% vs. 4.6% [phenogroup 2] vs. 3.1% [phenogroup 1]) and was identified as the high-risk DbCM phenotype. The key echocardiographic and biomarker predictors of the high-risk DbCM phenotype were higher NT-proBNP levels, increased left ventricular mass and left atrial size, and worse diastolic function. In the CHS and University of Texas (UT) Southwestern EHR validation cohorts, the DeepNN classifier classified 16% and 29% of participants, respectively, as having DbCM. Participants with (vs. without) the high-risk DbCM phenotype in the external validation cohorts had a significantly higher incidence of HF (hazard ratio [95% confidence interval] 1.61 [1.18-2.19] in CHS and 1.34 [1.08-1.65] in the UT Southwestern EHR cohort).
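Illustrative sketch (not the authors' code): the two-step logic described above, clustering participants on standardized variables and then labelling the cluster with the highest HF incidence as high-risk, might look roughly like the following. The abstract does not state the linkage method or preprocessing, so average linkage and z-scoring are assumptions here. The three variables, the nine toy rows, and the event flags are synthetic placeholders, not study data.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to mean 0 and SD 1 so no variable dominates distances."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def agglomerative(rows, k):
    """Naive average-linkage agglomerative clustering (assumed linkage).

    Starts with one cluster per row and merges the closest pair until
    k clusters remain. Returns k lists of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        return mean(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > k:
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def highest_risk_cluster(clusters, hf_events):
    """Return (cluster position, incidence) for the cluster with the highest HF proportion."""
    incidences = [mean(hf_events[i] for i in c) for c in clusters]
    best = max(range(len(clusters)), key=incidences.__getitem__)
    return best, incidences[best]

# Synthetic toy rows: [NT-proBNP, LV mass index, E/e'] (hypothetical values).
toy_rows = [
    [50, 70, 7.0], [60, 72, 8.0], [55, 68, 7.5],
    [120, 90, 10.0], [130, 92, 11.0], [125, 88, 10.5],
    [400, 120, 16.0], [420, 125, 17.0], [410, 118, 15.5],
]
# Synthetic HF event flags during follow-up (1 = event), one per row.
toy_hf = [0, 0, 0, 0, 1, 0, 1, 1, 0]

clusters = agglomerative(zscore_columns(toy_rows), k=3)
best, incidence = highest_risk_cluster(clusters, toy_hf)
print("high-risk cluster members:", sorted(clusters[best]), "incidence:", incidence)
```

In the study itself, the analogous step used 25 variables and 5-year HF incidence. A classifier (the DeepNN in the abstract) would then be trained to reproduce the high-risk label so that it can be applied to new cohorts.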

Conclusion: Machine learning-based techniques may identify 16% to 29% of individuals with diabetes as having a high-risk DbCM phenotype; these individuals may benefit from more aggressive implementation of HF preventive strategies.

Keywords: Diabetic cardiomyopathy; Heart failure; Type 2 diabetes mellitus.

Publication types

  • Validation Study

MeSH terms

  • Aged
  • Biomarkers / blood
  • Diabetic Cardiomyopathies* / diagnosis
  • Diabetic Cardiomyopathies* / epidemiology
  • Diabetic Cardiomyopathies* / etiology
  • Echocardiography* / methods
  • Female
  • Heart Failure / diagnosis
  • Heart Failure / epidemiology
  • Heart Failure / etiology
  • Humans
  • Incidence
  • Machine Learning*
  • Male
  • Middle Aged
  • Phenotype*
  • Risk Assessment / methods
  • Risk Factors
  • United States / epidemiology

Substances

  • Biomarkers