Aims: To identify clinically meaningful clusters of patients with type 2 diabetes who have similar glycated hemoglobin (HbA1c) trajectories.
Methods: In this retrospective cohort study, unsupervised machine learning clustering was used to identify groups of patients with similar longitudinal HbA1c trajectories. Cluster stability was assessed, and a supervised random forest analysis verified the clusters' reproducibility. Clinical relevance was assessed through multivariable analysis comparing, within each cluster, the risk of a composite outcome (macrovascular and microvascular outcomes, hypoglycemic events, and all-cause mortality) at different HbA1c thresholds.
Results: Among 60,423 patients, three clusters of HbA1c trajectories were identified: stable (n = 45,679), descending (n = 6,084), and ascending (n = 8,660). A random forest model reproduced the cluster assignments with 99.8% accuracy. In the clinical relevance assessment, HbA1c levels showed a J-shaped association with the risk of the composite outcome. The HbA1c thresholds that minimized this risk differed by cluster: 6.0-6.4% for the stable cluster, <8.0% for the descending cluster, and <9.0% for the ascending cluster.
Conclusions: By applying unsupervised machine learning to longitudinal HbA1c trajectories, we identified clusters of patients with distinct risks of diabetes-related complications. These clusters can serve as the basis for developing individualized models to personalize glycemic targets.
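Illustration: The abstract does not name the clustering algorithm or the trajectory features used, so the following is only a minimal sketch of the general idea, not the study's method. It assumes each patient's trajectory is summarized by a single least-squares slope and that the slopes are grouped with plain k-means (k = 3). The data are synthetic, and the helper names `slope` and `kmeans_1d` are hypothetical. The random forest reproducibility check and the multivariable outcome analysis are not shown.

```python
import random
from statistics import mean


def slope(values):
    """Least-squares slope of HbA1c values against visit index 0..n-1."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def kmeans_1d(points, k, iters=50):
    """Lloyd's k-means on scalars; centroids start at evenly spaced quantiles."""
    ordered = sorted(points)
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(p - centroids[j])) for p in points]
        # Recompute each centroid as the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Synthetic cohort (an assumption for illustration, not study data):
# 60 stable, 20 descending, and 20 ascending trajectories over 8 visits.
rng = random.Random(0)
patterns = [(7.0, 0.0)] * 60 + [(9.5, -0.3)] * 20 + [(7.0, 0.3)] * 20
trajectories = [
    [start + trend * visit + rng.gauss(0, 0.1) for visit in range(8)]
    for start, trend in patterns
]

slopes = [slope(t) for t in trajectories]
labels, centroids = kmeans_1d(slopes, k=3)

# Centroids come out ordered by slope, so index 0 is the most negative trend.
names = {0: "descending", 1: "stable", 2: "ascending"}
counts = {names[j]: labels.count(j) for j in range(3)}
print(counts)
```

Reducing each trajectory to one slope is a deliberate simplification that makes the three trend shapes easy to separate. Real trajectories with irregular visit spacing or nonlinear change would likely need richer features or a dedicated longitudinal clustering method.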