Machine Learning Clustering for Blood Pressure Variability Applied to Systolic Blood Pressure Intervention Trial (SPRINT) and the Hong Kong Community Cohort

Hypertension. 2020 Aug;76(2):569-576. doi: 10.1161/HYPERTENSIONAHA.119.14213. Epub 2020 Jun 29.

Abstract

Visit-to-visit blood pressure variability (BPV) has been shown to be a predictor of cardiovascular disease. We aimed to classify BPV levels using different machine learning algorithms. Visit-to-visit blood pressure readings were extracted from the SPRINT study in the United States and the eHealth cohort in Hong Kong (HK cohort). Patients were clustered into low, medium, and high BPV levels with traditional quantile clustering and 5 machine learning algorithms, including K-means. Clustering methods were assessed by the Stability Index. Similarities were assessed by the Davies-Bouldin Index and the Silhouette Index. Cox proportional hazards regression models were fitted to compare the risk of myocardial infarction, stroke, and heart failure. In SPRINT, 8133 participants had their blood pressure measured an average of 14.7 times over 3.28 years; in the HK cohort, 1094 participants had their blood pressure measured an average of 165.4 times over 1.37 years. Quantile clustering assigned one-third of participants to the high BPV level, whereas the machine learning methods assigned only 10% to 27%. Quantile clustering was the most stable method (stability index: 0.982 in SPRINT and 0.948 in the HK cohort) with some level of clustering similarity (Davies-Bouldin Index: 0.752 and 0.764, respectively). K-means clustering was the most stable of the machine learning algorithms (stability index: 0.975 and 0.911, respectively) and had the lowest clustering similarity (Davies-Bouldin Index: 0.653 and 0.680, respectively). One in 7 of the population was classified as having a high BPV level, and this group was shown to have a higher risk of stroke and heart failure. Machine learning methods can improve BPV classification for better prediction of cardiovascular diseases.
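The contrast between quantile clustering and K-means can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which BPV metric or software was used, so the standard deviation of visit readings, the one-dimensional K-means with its simple initialisation, the function names, and the example values are all assumptions made for illustration. The sketch splits a list of BPV values into tertiles, clusters the same values with K-means, and compares the two partitions with the Davies-Bouldin Index, where lower values indicate more compact, better separated clusters.

```python
import statistics

def bpv_sd(readings):
    # BPV for one participant as the sample standard deviation of their
    # visit readings (an assumed metric; the abstract does not name one).
    return statistics.stdev(readings)

def quantile_labels(values):
    # Tertile split by rank: 0 = low, 1 = medium, 2 = high, one-third each.
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 3 // n
    return labels

def kmeans_1d(values, k=3, max_iter=100):
    # Lloyd's algorithm on scalars; starting centroids are evenly spaced
    # order statistics from minimum to maximum (a simple, assumed choice).
    s = sorted(values)
    centroids = [s[j * (len(s) - 1) // (k - 1)] for j in range(k)]

    def assign(cents):
        return [min(range(k), key=lambda j: abs(v - cents[j])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(statistics.fmean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return assign(centroids), centroids

def davies_bouldin_1d(values, labels):
    # Mean over clusters of the worst (scatter_i + scatter_j) / distance ratio.
    ids = sorted(set(labels))
    cent, scatter = {}, {}
    for j in ids:
        members = [v for v, lab in zip(values, labels) if lab == j]
        cent[j] = statistics.fmean(members)
        scatter[j] = statistics.fmean([abs(v - cent[j]) for v in members])
    worst = [
        max((scatter[i] + scatter[j]) / abs(cent[i] - cent[j]) for j in ids if j != i)
        for i in ids
    ]
    return statistics.fmean(worst)

# Hypothetical BPV values (mm Hg) for nine participants, skewed to the right.
bpv = [4.0, 4.5, 5.0, 5.5, 6.0, 9.0, 9.5, 10.0, 18.0]
q_labels = quantile_labels(bpv)
k_labels, k_centroids = kmeans_1d(bpv)
print("quantile high-BPV share:", q_labels.count(2) / len(bpv))
print("k-means high-BPV share:", k_labels.count(2) / len(bpv))
print("Davies-Bouldin, quantile:", davies_bouldin_1d(bpv, q_labels))
print("Davies-Bouldin, k-means:", davies_bouldin_1d(bpv, k_labels))
```

Because tertiles fix each group at one-third of participants regardless of how the values are distributed, a right-skewed sample like this one puts moderately variable participants in the high group, whereas K-means places the boundary where the values actually separate. The made-up example only shows the mechanism; the proportions and index values reported in the abstract come from the study data.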

Keywords: heart failure; hypertension; machine learning; personalized risk; stroke.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aged, 80 and over
  • Algorithms
  • Blood Pressure / physiology*
  • Cardiovascular Diseases / diagnosis*
  • Cardiovascular Diseases / physiopathology
  • Cluster Analysis
  • Female
  • Hong Kong
  • Humans
  • Hypertension / diagnosis*
  • Hypertension / physiopathology
  • Machine Learning*
  • Male
  • Middle Aged
  • Risk Factors