Hepatorenal syndrome (HRS) is a key contributor to poor prognosis in liver cirrhosis. This study aimed to use public clinical databases to build a predictive model for early identification of cirrhosis patients at high risk of HRS. From two large public databases, MIMIC-IV and eICU, we retrieved the demographics, comorbidities, laboratory results, and treatments of patients with cirrhosis. Patients from the MIMIC-IV database were divided into training and validation sets, while patients from the eICU database served as a test set for external validation. Variables were screened using LASSO regression, Extreme Gradient Boosting (XGBoost), and Random Forest (RF), and core risk factors were taken from the intersection of the three methods. A predictive model was constructed using multivariable logistic regression and visualized as a nomogram. Model performance was assessed using receiver operating characteristic (ROC) curves, decision curve analysis (DCA), clinical impact curves (CIC), and calibration curves. The machine learning methods identified eight critical variables associated with HRS. The final predictive model, based on five key variables (spontaneous bacterial peritonitis, red blood cell count, creatinine, activated partial thromboplastin time, and total bilirubin), showed excellent discrimination, with AUCs of 0.832 (95% CI 0.8069-0.8563) in the training set and 0.8415 (95% CI 0.8042-0.8789) in the validation set. The AUC in the external test set was 0.8212 (95% CI 0.7784-0.864). By combining the MIMIC-IV database with machine learning algorithms, we developed an effective predictive model for HRS in patients with liver cirrhosis, providing a tool that may support early clinical intervention.
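As a rough illustration of two steps described above, the sketch below shows how variables retained by several screening methods might be intersected, and how an AUC can be computed from predicted risks. It is a minimal sketch under stated assumptions, not the study's code: the function names `consensus_features` and `auc`, the placeholder variable names, and the toy labels and scores are all hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which may differ from the software used in the study.

```python
def consensus_features(*selections):
    """Return the variables retained by every screening method.

    Each argument is an iterable of variable names kept by one method
    (for example LASSO, XGBoost, or Random Forest).
    """
    if not selections:
        raise ValueError("at least one selection is required")
    common = set(selections[0])
    for selected in selections[1:]:
        common &= set(selected)
    return sorted(common)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case; ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical screening outputs; the names are placeholders, not study results.
lasso_kept = ["var_a", "var_b", "var_c"]
xgboost_kept = ["var_b", "var_c", "var_d"]
forest_kept = ["var_c", "var_b", "var_e"]
core = consensus_features(lasso_kept, xgboost_kept, forest_kept)

# Toy outcome labels (1 = HRS) and predicted risks, purely illustrative.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

Taking the intersection keeps only variables that all methods agree on, which trades sensitivity for a shorter and more stable candidate list before the multivariable logistic regression is fitted.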
Keywords: Cirrhosis; Extreme gradient boosting; Hepatorenal syndrome; LASSO regression; MIMIC-IV database; Random forest.
© 2025. The Author(s).