Efficacy of three predictive models for deep vein thrombosis in patients with lumbar disc herniation

Am J Transl Res. 2024 Dec 15;16(12):7438-7447. doi: 10.62347/TWTG6803. eCollection 2024.

Abstract

Objective: To develop predictive models for assessing deep vein thrombosis (DVT) risk in patients with lumbar disc herniation (LDH) and to evaluate their performance.

Methods: A retrospective study was conducted on 798 LDH patients treated at the First Hospital of Hebei Medical University from January 2017 to December 2023. The patients were divided into a training set (n = 558) and a test set (n = 240) in a 7:3 ratio using computer-generated random numbers. Within the training set, patients without DVT formed the non-DVT group (n = 463) and those diagnosed with DVT formed the DVT group (n = 95). Univariate analysis was performed to compare clinical data between the two groups. Variables that differed significantly between the groups were used to develop a Logistic regression model, a Gradient boosting model, and a Random Forest model. Model performance was evaluated through receiver operating characteristic (ROC) curve analysis and calibration curve assessment.
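The 7:3 random split described in the Methods can be sketched as follows. This is a minimal illustration only: the function name `split_train_test`, the seed, and the use of integer placeholders for patient records are assumptions, not the authors' procedure. Truncating 798 × 0.7 to a whole number reproduces the reported set sizes.

```python
import random


def split_train_test(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle IDs with a seeded generator and cut at floor(n * train_fraction).

    Hypothetical helper: the abstract only states that computer-generated
    random numbers were used, not how the split was implemented.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 798 placeholder IDs standing in for the 798 LDH patients
train_ids, test_ids = split_train_test(range(798))
print(len(train_ids), len(test_ids))  # 558 240
```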

Results: In the training set, univariate analysis revealed significant differences in age, platelets (PLT), total cholesterol (TC), triglycerides (TG), glycated hemoglobin (HbA1c), D-dimer (D-D), fibrinogen (FIB), activated partial thromboplastin time (APTT), prothrombin time (PT), and thrombin time (TT) between the non-DVT group and the DVT group (all P < 0.05). Predictive models were constructed based on these indicators. The areas under the ROC curves (AUCs) in the training set were, in descending order: Random Forest model (0.978) > Gradient boosting model (0.943) > Logistic regression model (0.919). In the test set, the AUCs were: Random Forest model (0.952) > Gradient boosting model (0.941) > Logistic regression model (0.908). The DeLong test indicated that the AUC of the Random Forest model in the training set was significantly higher than that of the Logistic regression model (P < 0.05); however, no significant difference was observed between the other two models. Calibration curves demonstrated that the predicted probabilities from all three models closely aligned with the actual DVT incidence in both sets.
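As a reading aid for the AUC values, the AUC equals the probability that a randomly chosen DVT patient receives a higher predicted risk than a randomly chosen non-DVT patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for DVT, 0 for non-DVT; scores: predicted risk.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 3 DVT and 3 non-DVT patients
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

The pairwise form is quadratic in the number of patients, which is acceptable at the scale of this study; a rank-based formulation gives the same value more efficiently on larger data.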

Conclusion: The Logistic regression model, Gradient boosting model, and Random Forest model constructed in this study exhibit good predictive value for the occurrence of DVT in LDH patients, aiding in the optimization of clinical management. Among them, the Random Forest model performed best.

Keywords: Gradient boosting model; Logistic regression model; Lumbar disc herniation; Random Forest model; deep vein thrombosis; risk factor.