Background: Current machine learning techniques for predicting spontaneous preterm birth rely heavily on a history of previous preterm birth and/or costly techniques such as fetal fibronectin testing and ultrasound measurement of cervical length, to the disadvantage of those considered at low risk and/or those who have no access to more expensive screening tools.
Aims and objectives: We aimed to develop a predictive model for spontaneous preterm delivery (< 37 weeks) using socio-demographic and clinical data readily available at booking, an approach that could be suitable for all women regardless of their previous obstetric history.
Methods: We developed a logistic regression model using seven feature variables derived from maternal socio-demographic data and obstetric history, drawn from a preterm birth cohort (n = 917) and a matched full-term cohort (n = 100) in 2018 and 2020 at a tertiary obstetric unit in the UK. Three-fold cross-validation, with separate subsets for training and testing, was applied in Python® (version 3.8) using the most predictive factors. Model performance was then compared with that of previously published predictive algorithms.
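The following is a minimal sketch of how a logistic regression model can be fitted and evaluated with three-fold cross-validation. It is not the authors' code. The data are synthetic, the seven features are anonymous random numbers rather than the study variables, and the function names (`fit_logistic`, `k_fold_indices`, `cross_validate`), the learning rate and the epoch count are all illustrative assumptions. Only the Python standard library is used so that the example stays self-contained.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, xi):
    # w[0] is the intercept; w[1:] are the feature coefficients.
    return sigmoid(w[0] + sum(wj * xij for wj, xij in zip(w[1:], xi)))


def fit_logistic(X, y, lr=0.1, epochs=300):
    # Batch gradient descent on the mean log-loss (hypothetical settings).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def k_fold_indices(n, k, seed=0):
    # Shuffle the row indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=3):
    # Each fold is held out once; the model is refitted on the other folds.
    accuracies = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(
            (predict_proba(w, X[i]) >= 0.5) == bool(y[i]) for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


# Synthetic data: 300 rows, 7 features, outcome driven by the first two.
rng = random.Random(42)
X, y = [], []
for _ in range(300):
    x = [rng.gauss(0, 1) for _ in range(7)]
    y.append(1 if rng.random() < sigmoid(1.5 * x[0] - 1.0 * x[1]) else 0)
    X.append(x)

fold_accuracy = cross_validate(X, y, k=3)
print(fold_accuracy)
```

Refitting the model inside each fold keeps the held-out rows unseen during training, which is the point of cross-validation. A real analysis would more likely use an established library and would need to account for the imbalance between cohort sizes.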
Results: The retrospective model showed good predictive accuracy for spontaneous preterm birth, with an AUC of 0.76 (95% CI: 0.71-0.83) and a sensitivity and specificity of 0.71 (95% CI: 0.66-0.76) and 0.78 (95% CI: 0.63-0.88), respectively, based on seven variables: maternal age, BMI, ethnicity, smoking, gestational type, substance misuse and parity/obstetric history.
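For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity and AUC can be computed from true labels and predicted scores. The labels, the scores and the 0.5 decision threshold are made-up illustrative values, not study data, and the function names are assumptions. AUC is computed here as the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    # Pairwise (Mann-Whitney) form of the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative values only: 1 = preterm, 0 = full-term.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)  # 0.75 0.75 0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises discrimination across all thresholds.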
Conclusion: Pending further validation, our observations suggest that key maternal demographic features, incorporated into a traditional mathematical model, have promising predictive utility for spontaneous preterm birth in pregnant women in our region without the need for cervical length and/or fetal fibronectin.
Keywords: Logistic regression model; Machine learning; Prediction; Pregnancy; Preterm birth.
© 2024. Crown.