Prediction of Serum Creatinine in Hemodialysis Patients Using a Kernel Approach for Longitudinal Data

Healthc Inform Res. 2020 Apr;26(2):112-118. doi: 10.4258/hir.2020.26.2.112. Epub 2020 Apr 30.

Abstract

Objectives: Longitudinal data are prevalent in clinical research, and because repeated measurements on the same patient are correlated, they require analysis methods that account for this correlation. Creatinine is an important marker in predicting end-stage renal disease, and it is recorded longitudinally. This study compared the performance of linear regression (LR), the linear mixed-effects model (LMM), least-squares support vector regression (LS-SVR), and mixed-effects least-squares support vector regression (MLS-SVR) in predicting serum creatinine as a longitudinal outcome.

Methods: We used a longitudinal dataset of hemodialysis patients in Hamadan city between 2013 and 2016. To evaluate the performance of the methods in serum creatinine prediction, the data were divided into training and testing sets. LR, LMM, LS-SVR, and MLS-SVR were then fitted. Prediction performance was assessed and compared in terms of mean squared error (MSE), mean absolute error (MAE), mean absolute prediction error (MAPE), and the coefficient of determination (R²). Variable importance was calculated using the best model to select the most important predictors.
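
As an illustration of the four evaluation metrics named in the Methods, the following is a minimal sketch rather than the authors' code. It assumes that MAPE is the mean of the absolute errors divided by the observed values, which is one common definition and is not confirmed by the abstract. The function name `regression_metrics` and the sample values are hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, MAPE, and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error and mean absolute error
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # ASSUMPTION: MAPE as mean relative absolute error (observed values nonzero)
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "MAE": mae, "MAPE": mape, "R2": r2}


# Hypothetical creatinine values (mg/dL), for illustration only
observed = [8.0, 10.0, 12.0, 10.0]
predicted = [9.0, 10.0, 11.0, 10.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Computing the same metrics separately on the training and testing sets, as the study does, shows how much performance degrades on unseen data.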

Results: The MLS-SVR achieved the lowest prediction error of the four methods: MSE = 1.280, MAE = 0.833, and MAPE = 0.129 for the training set, and MSE = 3.275, MAE = 1.319, and MAPE = 0.159 for the testing set. The MLS-SVR also had the highest R², 0.805 for the training set and 0.654 for the testing set. Blood urea nitrogen was the most important factor in the prediction of creatinine.

Conclusions: The MLS-SVR achieved the best serum creatinine prediction performance in comparison to LR, LMM, and LS-SVR.

Keywords: Creatinine; Longitudinal Studies; Machine Learning; Renal Dialysis; Support Vector Machine.