Objectives: Machine learning (ML) models built on longitudinal data have received little attention as a means of identifying important predictors of new-onset hypertension. We investigated the ability of several ML models to predict the development of hypertension.
Methods: A total of 15,965 Japanese participants (men/women: 9,466/6,499, mean age: 45 years) who received annual health examinations were randomly divided into a training group (70%, n = 11,175) and a test group (30%, n = 4,790). The predictive abilities of 58 candidate predictors, including the fatty liver index (FLI), which is calculated from body mass index, waist circumference and levels of γ-glutamyl transferase and triglycerides, were assessed using statistics analogous to the area under the curve (AUC) in receiver operating characteristic curve analyses. The ML models included logistic regression, random forest, naïve Bayes, extreme gradient boosting and an artificial neural network.
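The abstract names the four inputs of the FLI but not the equation itself. The following is a minimal sketch, assuming the widely cited formula of Bedogni et al. (2006) with triglycerides in mg/dL, γ-glutamyl transferase in U/L and waist circumference in cm. The function name, argument names and units are illustrative assumptions and may differ from what the authors used.

```python
import math


def fatty_liver_index(triglycerides_mg_dl, bmi_kg_m2, ggt_u_l, waist_cm):
    """Return the fatty liver index on a 0-100 scale.

    Assumption: coefficients follow the commonly cited Bedogni et al. (2006)
    formula; the abstract itself does not state them.
    """
    # Linear predictor: log-transformed triglycerides and GGT,
    # untransformed BMI and waist circumference.
    linear = (
        0.953 * math.log(triglycerides_mg_dl)
        + 0.139 * bmi_kg_m2
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform, scaled to 0-100.
    return 100.0 / (1.0 + math.exp(-linear))


# Hypothetical participant, for illustration only (not data from the study).
example_fli = fatty_liver_index(
    triglycerides_mg_dl=150.0, bmi_kg_m2=25.0, ggt_u_l=30.0, waist_cm=90.0
)
```

Because the index is a logistic transform of a linear score, it rises monotonically with each of the four inputs and always lies strictly between 0 and 100.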
Results: During a 10-year period (mean follow-up: 6.1 years), 2,132 participants (19.1%) in the training group and 917 participants (19.1%) in the test group developed new-onset hypertension. Among the 58 parameters, systolic blood pressure, age and FLI were identified as important predictors by random forest feature selection with 10-fold cross-validation. The AUCs of the ML models ranged from 0.765 to 0.825, and discriminatory capacity was significantly better in the artificial neural network model than in the logistic regression model.
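For readers less familiar with the AUC values reported here, the sketch below shows one standard way to compute an AUC from predicted risk scores: the fraction of (case, non-case) pairs in which the case receives the higher score, with ties counted as one half. It is an illustration only. The function name and the toy data are hypothetical, and the study's own statistics and software are not described in the abstract.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random case outranks a random non-case.

    labels: iterable of 0/1 outcomes (1 = developed hypertension).
    scores: predicted risks in the same order.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("AUC needs at least one case and one non-case")

    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0   # case ranked above non-case
            elif case == control:
                concordant += 0.5   # tie counts as half
    return concordant / (len(case_scores) * len(control_scores))


# Toy data, not taken from the study.
toy_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of participants who did and did not develop hypertension. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for a cohort of this size.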
Conclusions: The development of hypertension can be simply and accurately predicted by each ML model using systolic blood pressure, age and FLI as selected features. Building multiple ML models might enable more practical prediction.
Keywords: Artificial intelligence; fatty liver index; hypertension; machine learning.