Study objectives: This study aimed to identify the risk factors associated with falls in hospitalized patients, develop a predictive risk model using machine learning algorithms, and evaluate the validity of the model's predictions.
Study design: A cross-sectional design was employed using data from the DRYAD public database.
Research methods: Data came from the Fukushima Medical University Hospital Cohort Study, obtained from the DRYAD public database. Twenty percent of the dataset was held out as an independent test set, and the remaining 80% was used for training and validation. To address class imbalance in binary variables, the Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors (SMOTE-ENN) was applied. Univariate analysis and least absolute shrinkage and selection operator (LASSO) regression were used to screen candidate variables. Predictive models were built from the retained clinical features, and eight machine learning algorithms were compared to identify the best-performing model. SHAP (Shapley Additive Explanations) was then used to interpret the predictive models and rank the importance of risk factors.
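The workflow described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code. Several details are assumptions made only for the example: the `make_classification` stand-in data, the 75/25 train/validation split inside the 80% portion, the use of L1-penalised logistic regression as the LASSO screening step, the hyperparameter values, and the choice to resample only the training portion. The sketch uses scikit-learn and, if installed, imbalanced-learn. The univariate analysis, the other seven algorithms, and the SHAP step are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

try:
    from imblearn.combine import SMOTEENN  # SMOTE followed by Edited Nearest Neighbours
except ImportError:  # imbalanced-learn not installed: the sketch skips resampling
    SMOTEENN = None

# Synthetic, imbalanced stand-in for the hospital data (assumption, not the real cohort).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# Hold out 20% as the independent test set, stratified on the outcome.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)

# Split the remaining 80% into training and validation (ratio is an assumption).
X_train, X_val, y_train, y_val = train_test_split(
    X_dev, y_dev, test_size=0.25, stratify=y_dev, random_state=0)

# Resample only the training portion so validation and test keep the original class ratio.
if SMOTEENN is not None:
    X_res, y_res = SMOTEENN(random_state=0).fit_resample(X_train, y_train)
else:
    X_res, y_res = X_train, y_train

# LASSO-style screening: keep features with a non-zero L1-penalised coefficient.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
lasso.fit(X_res, y_res)
selected = np.flatnonzero(lasso.coef_.ravel() != 0)
if selected.size == 0:  # guard: fall back to all features if everything was shrunk to zero
    selected = np.arange(X.shape[1])

# Fit a Random Forest on the screened features and report discrimination.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_res[:, selected], y_res)

auc_val = roc_auc_score(y_val, rf.predict_proba(X_val[:, selected])[:, 1])
auc_test = roc_auc_score(y_test, rf.predict_proba(X_test[:, selected])[:, 1])
print(f"features kept: {selected.size}, validation AUC: {auc_val:.3f}, test AUC: {auc_test:.3f}")
```

Resampling after the split, and only on the training rows, is one common way to keep synthetic samples out of the evaluation data; the abstract does not state at which stage SMOTE-ENN was applied.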
Results: The final model included the following variables: Adl_standing, Adl_evacuation, Age_group, Planned_surgery, Wheelchair, History_of_falls, Hypnotic_drugs, Psychotropic_drugs, and Remote_caring_system. Among the evaluated models, the Random Forest algorithm demonstrated superior performance, achieving an AUC of 0.814 (95% CI: 0.802-0.827) in the training set, 0.781 (95% CI: 0.740-0.821) in the validation set, and 0.795 (95% CI: 0.770-0.820) in the test set.
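For readers unfamiliar with how an AUC and its 95% confidence interval can be obtained, the sketch below computes the AUC as the proportion of positive/negative pairs ranked correctly and derives a percentile bootstrap interval. The abstract does not state how its intervals were calculated, so the bootstrap method, the function names `auc` and `bootstrap_auc_ci`, and the simulated scores are all assumptions made for illustration.

```python
import random

def auc(y_true, y_score):
    """AUC as the probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Point estimate plus a percentile bootstrap (1 - alpha) confidence interval."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(y_true, y_score), lower, upper

# Simulated outcomes and risk scores (hypothetical, not study data).
rng = random.Random(1)
y_true = [1 if rng.random() < 0.3 else 0 for _ in range(200)]
y_score = [rng.gauss(1.0 if t else 0.0, 1.0) for t in y_true]

point, lower, upper = bootstrap_auc_ci(y_true, y_score)
print(f"AUC {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Other interval methods, such as DeLong's, are also widely used and may give slightly different bounds.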
Conclusion: Machine learning algorithms, particularly Random Forest, can effectively predict fall risk among hospitalized patients. These findings may help improve fall prevention strategies in healthcare settings.
Keywords: Accidental falls; Hospitalized patients; Machine learning; Model interpretation; Predictive modeling; Risk factors.
© 2025. The Author(s).