Machine Learning Models Decoding the Association Between Urinary Stone Diseases and Metabolic Urinary Profiles

Metabolites. 2024 Dec 3;14(12):674. doi: 10.3390/metabo14120674.

Abstract

Background: Using advanced machine learning models, we aimed to identify biomarkers for urolithiasis from 24-h urinary metabolic abnormalities and to study their associations with urinary stone diseases. Methods: We retrospectively recruited 468 patients at Peking Union Medical College Hospital who were diagnosed with urinary stone disease, including renal, ureteral, and multiple location stones, and had undergone a 24-h urine metabolic evaluation. We applied machine learning methods to identify biomarkers of urolithiasis from the urinary metabolite profiles. In total, 148 (34.02%) patients had kidney stones, 34 (7.82%) had ureter stones, and 163 (34.83%) had multiple location stones, all of whom had detailed urinary metabolite data. Our analyses revealed that the Random Forest algorithm exhibited the highest predictive accuracy, with AUC values of 0.809 for kidney stones, 0.99 for ureter stones, and 0.775 for multiple location stones. The Super Learner Ensemble Method also demonstrated high predictive performance, with slightly lower AUC values than Random Forest. Further analysis using multivariate logistic regression identified significant features for each stone type based on the Random Forest method. Results: We found that 24-h urinary magnesium was positively associated with both kidney stones and multiple location stones (OR = 1.195 [1.06-1.3525] and 1.3258 [1.1814-1.4949], respectively) due to its high correlation with urinary phosphorus, while 24-h urinary creatinine was a protective factor for kidney stones and ureter stones, with ORs of 0.9533 [0.9117-0.996] and 0.8572 [0.8182-0.8959], respectively. eGFR was a risk factor for ureter stones and multiple location stones, with ORs of 1.0145 [1.0084-1.0209] and 1.0148 [1.0077-1.0223], respectively. Conclusion: Machine learning techniques show promise in revealing the links between urological stone disease and 24-h urinary metabolic data. Enhancing the prediction accuracy of these models could lead to improved dietary or pharmacological prevention strategies.
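
For readers less familiar with the two statistics reported above, the following is a minimal Python sketch of how an AUC and an odds ratio with an interval are commonly computed. It is not the authors' pipeline. The function names, the toy labels and scores, and the coefficient and standard error are all hypothetical. The sketch also assumes the bracketed ranges are 95% Wald intervals, which the abstract does not state.

```python
import math
from typing import Sequence, Tuple


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald interval from a logistic regression coefficient
    (beta) and its standard error (se). z = 1.96 gives a 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
auc = auc_from_scores(labels, scores)

# Hypothetical coefficient and standard error for a single predictor.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta=-0.05, se=0.02)

print(f"AUC = {auc:.3f}")
print(f"OR = {odds_ratio:.4f} [{ci_low:.4f}-{ci_high:.4f}]")
```

Under these assumptions, an odds ratio below 1 whose interval excludes 1 would be read as a protective association per unit increase in the predictor, and one above 1 as a risk association.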

Keywords: biomarkers; machine learning; metabolites; random forest; urinary stone diseases.