Machine learning-based 2-year risk prediction tool in immunoglobulin A nephropathy

Yujeong Kim; Jong Hyun Jhee; Chan Min Park; Donghwan Oh; Beom Jin Lim; Hoon Young Choi; Dukyong Yoon; Hyeong Cheon Park

doi:10.23876/j.krcp.23.076

Kidney Res Clin Pract. 2023 Oct 27. doi: 10.23876/j.krcp.23.076. Online ahead of print.

Authors

Yujeong Kim¹, Jong Hyun Jhee², Chan Min Park¹, Donghwan Oh², Beom Jin Lim³, Hoon Young Choi²,⁴, Dukyong Yoon¹,⁵, Hyeong Cheon Park²,⁴

Affiliations

¹ Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea.
² Division of Nephrology, Department of Internal Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea.
³ Department of Pathology, Yonsei University College of Medicine, Seoul, Republic of Korea.
⁴ Severance Institute for Vascular and Metabolic Research, Yonsei University College of Medicine, Seoul, Republic of Korea.
⁵ Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea.

PMID: 37919889
DOI: 10.23876/j.krcp.23.076

Abstract

Background: This study aimed to develop a machine learning-based 2-year risk prediction model for early identification of patients with rapidly progressive immunoglobulin A nephropathy (IgAN). We also assessed the model's ability to predict patients' long-term kidney-related outcomes.

Methods: A retrospective cohort of 1,301 patients with biopsy-proven IgAN from two tertiary hospitals was used to derive and externally validate a random forest-based model predicting the primary outcome (a 30% decline in estimated glomerular filtration rate from baseline or end-stage kidney disease requiring renal replacement therapy) and the secondary outcome (improvement of proteinuria) within 2 years after kidney biopsy.
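As an illustration of the composite primary outcome defined above, a minimal sketch in Python follows. The function name, argument names, and the use of a greater-than-or-equal comparison at the 30% threshold are assumptions made for this example; the abstract states only the 30% decline criterion and the end-stage kidney disease criterion, and this is not the study's code.

```python
def meets_primary_outcome(baseline_egfr, followup_egfr, eskd_on_rrt):
    """Return True if the composite primary outcome is met.

    baseline_egfr: eGFR at kidney biopsy (hypothetical argument name).
    followup_egfr: eGFR at a follow-up visit within 2 years.
    eskd_on_rrt:   True if the patient reached end-stage kidney disease
                   requiring renal replacement therapy.
    """
    if baseline_egfr <= 0:
        raise ValueError("baseline eGFR must be positive")
    # Relative decline from baseline, e.g. 80 -> 50 is a 37.5% decline.
    relative_decline = (baseline_egfr - followup_egfr) / baseline_egfr
    # Assumption: a decline of exactly 30% counts as meeting the outcome.
    return bool(eskd_on_rrt) or relative_decline >= 0.30
```

Either component alone is sufficient, so a patient who starts renal replacement therapy is counted as an event regardless of the last recorded eGFR.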

Results: For the 2-year prediction of the primary outcome, precision, recall, area under the curve, area under the precision-recall curve, F1, and Brier score were 0.259, 0.875, 0.771, 0.242, 0.400, and 0.309, respectively. The corresponding values for the secondary outcome were 0.904, 0.971, 0.694, 0.903, 0.955, and 0.113, respectively. In Shapley Additive exPlanations analysis, the most informative feature for both outcomes was baseline proteinuria. For Kaplan-Meier analysis of 10-year kidney outcome risk, patients were divided into three groups (low, moderate, and high) by the predicted probabilities from the 2-year primary outcome prediction model; the high (hazard ratio [HR], 13.00; 95% confidence interval [CI], 9.52-17.77) and moderate (HR, 12.90; 95% CI, 9.92-16.76) groups showed higher risks than the low group. When patients were grouped by the 2-year secondary outcome prediction model, the low (HR, 1.66; 95% CI, 1.42-1.95) and moderate (HR, 1.42; 95% CI, 0.99-2.03) groups were at greater 10-year risk than the high group.
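The classification metrics reported above have standard definitions, sketched here in plain Python. The helper names are hypothetical and this is not the study's evaluation code; it only shows how precision, recall, F1, and the Brier score are conventionally computed for a binary outcome. The last lines recompute F1 from the reported primary-outcome precision and recall.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall) if precision + recall else 0.0
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Reported primary-outcome precision and recall from the abstract.
primary_f1 = f1_from(0.259, 0.875)
print(round(primary_f1, 3))  # prints 0.4, in line with the reported 0.400
```

A low precision with a high recall, as reported for the primary outcome, means the model flags most patients who progress at the cost of many false positives, which is why F1 stays modest.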

Conclusion: Our machine learning-based 2-year risk prediction models for the progression of IgAN showed reliable performance and effectively predicted long-term kidney outcome.

Keywords: Chronic kidney failure; Immunoglobulin A nephropathy; Machine learning; Proteinuria.