Interobserver agreement analysis among renal pathologists in classification of lupus nephritis using a digital pathology image dataset: after a third evaluation

Ju Yeon Pyo; Nara Jeon; Su-Jin Shin; Minsun Jung; Beom Jin Lim; Minseob Eom; Sung-Eun Choi; Renal Pathology Study Group of Korean Society of Pathologists

doi:10.23876/j.krcp.24.185

Kidney Res Clin Pract. 2024 Dec 11. doi: 10.23876/j.krcp.24.185. Online ahead of print.

Authors

Ju Yeon Pyo¹, Nara Jeon², Su-Jin Shin³, Minsun Jung², Beom Jin Lim², Minseob Eom⁴, Sung-Eun Choi⁵; Renal Pathology Study Group of Korean Society of Pathologists

Affiliations

¹ Department of Pathology, Catholic Kwandong University International St. Mary's Hospital, Incheon, Republic of Korea.
² Department of Pathology, Yonsei University College of Medicine, Seoul, Republic of Korea.
³ Department of Pathology, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea.
⁴ Department of Pathology, Wonju Severance Christian Hospital, Yonsei University Wonju College of Medicine, Wonju, Republic of Korea.
⁵ Department of Pathology, CHA Bundang Medical Center, CHA University, Seongnam, Republic of Korea.

PMID: 39676523
DOI: 10.23876/j.krcp.24.185

Abstract

Background: Lupus nephritis is well known for low concordance in classification. Furthermore, there has been no agreement analysis among Korean renal pathologists regarding lupus nephritis. Inconsistent diagnosis leads to confusion, increases medical costs, and can result in failure to deliver appropriate therapeutic interventions. This study aimed to assess the level of agreement among Korean renal pathologists regarding classification.

Methods: Representative glomerular images from patients diagnosed with lupus nephritis were obtained from five hospitals. Twenty-five multiple-choice questions were formulated, each with 14 options consisting of characteristic histopathological findings of lupus nephritis. Three rounds of surveys were conducted, with an educational session held before the second survey and another before the third.

Results: Agreement was calculated using Fleiss' κ. The mean κ for each survey round was as follows: Survey 1, 0.42 (range, 0.18-0.61); Survey 2, 0.42 (range, 0.19-0.64); and Survey 3, 0.47 (range, 0.23-0.65). Although κ after the first education session showed no significant difference compared to the initial κ (p = 0.95), κ after the second education session increased significantly compared to the initial κ (p < 0.001). The κ for each item generally increased with each education session, but the increases were not statistically significant (p = 0.46, p = 0.17). Additionally, the rankings of agreement for each item were relatively consistent.
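
For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' κ is computed for a single question. It is not the authors' analysis code. It assumes every rater selects exactly one category per image, which may not match the survey's 14-option format, and the function name `fleiss_kappa` and the toy counts are hypothetical.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters
    (assumption: one category per rater per subject).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must have the same number of ratings")

    n_categories = len(counts[0])
    total = n_subjects * n_raters

    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per subject: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance

    if p_e == 1.0:
        # All ratings in one category: kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy data: 4 images, 5 raters, 3 categories.
    toy = [
        [5, 0, 0],
        [3, 2, 0],
        [0, 4, 1],
        [1, 1, 3],
    ]
    print(f"kappa = {fleiss_kappa(toy):.2f}")
```

A value of 1 indicates complete agreement, and values at or below 0 indicate agreement no better than chance, so the reported means of 0.42 to 0.47 sit well short of complete agreement.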

Conclusion: This study conducted an interobserver agreement analysis of Korean pathologists for lupus nephritis, with the goal of increasing agreement through education. Although the education increased overall agreement, items such as "mesangial hypercellularity," "endocapillary hypercellularity," and "neutrophils and/or karyorrhexis" remained inconsistent, which is attributable to innate subjectivity and ineffective education.

Keywords: Classification; Digital technology; Lupus nephritis; Observer variation; Pathology.