Objective: The purpose of this study was to apply clustering methods to identify and characterize prediabetes phenotypes and their relationships with treatment arm and type 2 diabetes (T2D) outcomes in the Diabetes Prevention Program (DPP), and to evaluate the utility of adding further measures to the clustering for phenotype characterization and T2D risk stratification.
Research design and methods: This was a secondary analysis of data from a subset of participants (n=994) from the previously completed Diabetes Prevention Program trial. Unsupervised k-means clustering analysis was applied to derive the optimal number of clusters of participants based on common clinical risk factors alone or common risk factors plus more comprehensive measures of glucose tolerance and body composition.
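The clustering step described above can be illustrated with a minimal sketch. It is not the study's analysis code: the abstract does not state how the optimal number of clusters was chosen, so the silhouette criterion, the standardization step, the candidate range of k, the helper name `choose_k`, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def choose_k(features, k_range=range(2, 9), seed=0):
    """Hypothetical helper: fit k-means for each k and keep the best silhouette."""
    # Assumption: standardize so variables on large scales do not dominate distances.
    X = StandardScaler().fit_transform(features)
    scores, labelings = {}, {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)  # assumed selection criterion
        labelings[k] = labels
    best_k = max(scores, key=scores.get)
    return best_k, labelings[best_k], scores


# Synthetic stand-in data (not DPP data): three well-separated groups, three features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 0.0], [0.0, 10.0, 10.0]])
demo = np.vstack([c + rng.normal(scale=0.5, size=(100, 3)) for c in centers])

best_k, labels, scores = choose_k(demo)
print("selected k:", best_k)
```

Under these assumptions, the two modeling approaches would correspond to two calls to the same helper: one on a matrix holding only the common clinical risk factors, and one on that matrix extended with the glucose tolerance and body composition measures.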
Results: Five clusters were derived both from common clinical characteristics alone and from those characteristics plus comprehensive measures of glucose tolerance and body composition. Within each modeling approach, participants showed significantly different levels of risk factors across clusters. The clinical-only model showed higher accuracy for time to T2D; however, the more comprehensive models further differentiated a metabolically healthy overweight phenotype. For both models, the greatest differentiation in time to T2D was observed in the metformin arm of the trial.
Conclusions: Data-driven clustering of patients with prediabetes allows identification of phenotypes that are at greater risk for disease progression and that respond differently to risk-reduction interventions. Further investigation into phenotypic differences in treatment response could enable better personalization of prediabetes and T2D prevention and treatment choices.