Development and validation of 10-year risk prediction models of cardiovascular disease in Chinese type 2 diabetes mellitus patients in primary care using interpretable machine learning-based methods

Diabetes Obes Metab. 2024 Sep;26(9):3969-3987. doi: 10.1111/dom.15745. Epub 2024 Jul 15.

Abstract

Aim: To develop 10-year cardiovascular disease (CVD) risk prediction models in Chinese patients with type 2 diabetes mellitus (T2DM) managed in primary care using machine learning (ML) methods.

Methods: This 10-year population-based retrospective cohort study included 141 516 Chinese T2DM patients aged 18 years or above who had no history of CVD or end-stage renal disease and were managed in public primary care clinics in 2008; they were followed up until December 2017. Two-thirds of the patients were randomly selected to develop sex-specific CVD risk prediction models. The remaining one-third of patients were used as the validation sample to evaluate the discrimination and calibration of the models. ML-based methods were applied to missing data imputation, predictor selection, risk prediction modelling, model interpretation, and model evaluation. Cox regression was used to develop the statistical models in parallel for comparison.
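As an illustration of the sample split described in the Methods, the following is a minimal sketch, not the authors' code. It assumes the random two-thirds/one-third allocation is performed separately within each sex, which the abstract does not state; the function names, the `(patient_id, sex)` input format, and the seed are hypothetical.

```python
import random


def split_derivation_validation(patient_ids, derivation_fraction=2 / 3, seed=0):
    """Randomly allocate patients to a derivation and a validation sample."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_derivation = round(len(shuffled) * derivation_fraction)
    return shuffled[:n_derivation], shuffled[n_derivation:]


def split_by_sex(patients, seed=0):
    """Split within each sex (an assumption, not stated in the abstract).

    patients: iterable of (patient_id, sex) tuples.
    Returns {sex: (derivation_ids, validation_ids)}.
    """
    ids_by_sex = {}
    for patient_id, sex in patients:
        ids_by_sex.setdefault(sex, []).append(patient_id)
    return {
        sex: split_derivation_validation(ids, seed=seed)
        for sex, ids in sorted(ids_by_sex.items())
    }


if __name__ == "__main__":
    # Toy cohort: 300 hypothetical patients, alternating sex labels.
    cohort = [(i, "F" if i % 2 else "M") for i in range(300)]
    for sex, (derivation, validation) in split_by_sex(cohort).items():
        print(sex, len(derivation), len(validation))
```

Each sex-specific model would then be fitted on its derivation sample only and evaluated on the corresponding validation sample.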

Results: During a median follow-up of 9.75 years, 32 445 patients (22.9%) developed CVD. Age, T2DM duration, urine albumin-to-creatinine ratio (ACR), estimated glomerular filtration rate (eGFR), systolic blood pressure variability and glycated haemoglobin (HbA1c) variability were the most important predictors. ML models also identified nonlinear effects of several predictors, particularly the U-shaped effects of eGFR and body mass index. The ML models showed a Harrell's C statistic of >0.80 and good calibration. The ML models performed significantly better than the Cox regression models in CVD risk prediction and achieved better risk stratification for individual patients.
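For readers unfamiliar with the discrimination measure reported above, the following is a minimal sketch of Harrell's C statistic for right-censored data. It is illustrative only and is not the authors' implementation: the function name `harrell_c` and the toy data are hypothetical, pairs with tied follow-up times are ignored for simplicity, and the quadratic loop would be too slow for a cohort of this size.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by predicted risk.

    times: follow-up time per patient.
    events: 1/True if the event (e.g. CVD) was observed, 0/False if censored.
    risk_scores: predicted risk, where higher means an earlier expected event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the shorter time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: four hypothetical patients, the third is censored at time 6.
    print(harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.5, 0.1]))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, so a C statistic above 0.80 means that in more than 80% of comparable patient pairs the patient with the earlier event was assigned the higher predicted risk.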

Conclusion: Using routinely available predictors and ML-based algorithms, this study established 10-year CVD risk prediction models for Chinese T2DM patients in primary care. The findings highlight the importance of renal function indicators and of variability in both blood pressure and HbA1c as CVD predictors, which deserve more clinical attention. The derived risk prediction tools have the potential to support clinical decision making and encourage patients towards self-care, subject to further research confirming the models' feasibility, acceptability and applicability at the point of care.

Keywords: cardiovascular disease; primary care; real-world evidence; type 2 diabetes.

Publication types

  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Cardiovascular Diseases* / epidemiology
  • Cardiovascular Diseases* / etiology
  • China / epidemiology
  • Diabetes Mellitus, Type 2* / complications
  • East Asian People
  • Female
  • Follow-Up Studies
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Primary Health Care*
  • Retrospective Studies
  • Risk Assessment / methods
  • Risk Factors