Objective: This study aimed to develop machine learning (ML) models for predicting preterm preeclampsia using information available before 23 weeks' gestation.
Study design: This was a secondary analysis of the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) cohort. We considered 131 features available before 23 weeks, including maternal demographics, obstetric and family history, social determinants of health, physical activity, nutrition, and early second-trimester ultrasound. Our primary outcome was preterm preeclampsia before 37 weeks. The dataset was randomly split into a training set (70%) and a validation set (30%). We developed ML models using glmnet, multilayer perceptron, random forest, XGBoost (extreme gradient boosting), and LightGBM. The final model was developed with the ML approach that achieved the highest area under the curve (AUC). Further feature selection was conducted among the 25 most important features, ranked by SHapley Additive exPlanations (SHAP) values. The performance of the final model was assessed using the validation dataset.
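The split-and-score step described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function names `train_validation_split` and `auc`, the seed, and the toy inputs are assumptions made for the example, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than with any particular ML library.

```python
import random


def train_validation_split(rows, train_fraction=0.7, seed=0):
    """Randomly split rows into a training list and a validation list.

    Hypothetical helper: the 70/30 ratio comes from the abstract,
    the seed and the shuffling strategy are assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage: 10 record identifiers split 7/3, then an AUC on made-up scores.
train, valid = train_validation_split(range(10))
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With an outcome prevalence of about 2%, a purely random split can leave few positive cases in the validation set; a stratified split is a common alternative, although the abstract does not say whether one was used.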
Results: Of 9,467 individuals, 219 (2.3%) had preterm preeclampsia. The XGBoost model had the highest AUC among the models (AUC = 0.749 [95% confidence interval (CI), 0.736-0.762]). Therefore, XGBoost was used to develop models using fewer variables. The XGBoost model with eight features (in order of importance: mean uterine artery pulsatility index in the early second trimester, chronic hypertension, pregestational diabetes, uterine artery notch, systolic and diastolic blood pressure in the first trimester, body mass index, and maternal age) was chosen as the final model because its AUC of 0.741 (95% CI, 0.730-0.752) was not inferior to that of the original model (p = 0.58). In the validation dataset, the final model had an AUC of 0.779 (95% CI, 0.722-0.831). An online application of the final model was developed ( https://kawakita.shinyapps.io/Preterm_preeclampsia/ ).
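The abstract does not state how the 95% CIs around the AUCs were obtained. As an illustration of one common approach, the sketch below computes a percentile bootstrap interval; the function name `bootstrap_auc_ci`, the number of resamples, and the seed are all assumptions, and the study may have used a different method (for example, an analytic interval).

```python
import random


def auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (hypothetical helper).

    Resamples cases with replacement; resamples that contain only one
    class are redrawn because the AUC is undefined for them.
    """
    if len(set(labels)) < 2:
        raise ValueError("both outcome classes are required")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_index = max(0, int(round(alpha / 2 * n_boot)))
    hi_index = min(n_boot - 1, int(round((1 - alpha / 2) * n_boot)) - 1)
    return stats[lo_index], stats[hi_index]


# Toy usage on made-up labels and scores (not study data).
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.15, 0.9]
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
```

An interval built this way widens as the number of positive cases shrinks, which is consistent with the validation-set interval reported above being wider than the others.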
Conclusion: ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.
Key points: · Prediction models using uterine artery Doppler have not been adopted in the US. · We developed a model using an ML algorithm. · An online application of the final model was developed. · ML algorithms using information available before 23 weeks can accurately predict preterm preeclampsia before 37 weeks.
Thieme. All rights reserved.