Interobserver reliability and reproducibility of S.T.O.N.E. nephrolithometry for renal calculi

Zhamshid Okhunov; Mohammad Helmy; Alberto Perez-Lansac; Ashleigh Menhadji; Philip Bucur; Surendra B Kolla; Jane S Cho; Kathy Osann; Achim Lusch; Jaime Landman

doi:10.1089/end.2013.0289

J Endourol. 2013 Oct;27(10):1303-6. doi: 10.1089/end.2013.0289. Epub 2013 Aug 21.

Authors

Zhamshid Okhunov¹, Mohammad Helmy, Alberto Perez-Lansac, Ashleigh Menhadji, Philip Bucur, Surendra B Kolla, Jane S Cho, Kathy Osann, Achim Lusch, Jaime Landman

Affiliation

¹ Department of Urology, University of California, Irvine, Orange, California.

PMID: 23815088
DOI: 10.1089/end.2013.0289

Abstract

Purpose: To assess the reliability of the S.T.O.N.E. (stone size [S], tract length [T], obstruction [O], number of involved calices [N], and essence or stone density [E]) nephrolithometry scoring system by testing its reproducibility between different observers.

Patients and methods: Preoperative images of 58 patients who underwent percutaneous nephrolithotomy (PCNL) were reviewed. Medical students, urology residents, one fellow, and a urology attending independently reviewed all images and scored the renal stones. Interobserver reliability of each component and of the total score was evaluated using the intraclass correlation coefficient (ICC) and the κ coefficient.

Results: Interobserver reliability was high for all components and for the total score (ICC for S, T, O, N, E, and total: 0.80, 0.97, 0.89, 0.84, 0.91, and 0.87, respectively). κ values between the two medical students were 0.36, 1, 0.31, 0.45, 0.33, and 0.30 for the S, T, O, N, E components and total score, respectively. κ values between the two urology residents were 0.71, 1, 0.92, 0.79, 0.93, and 0.67 for the S, T, O, N, E components and total score, respectively. κ values between the urology fellow and the attending physician were 0.95, 1, 0.88, 0.94, 0.89, and 0.87 for the S, T, O, N, E components and total score, respectively. The P value for every scoring component was <0.05, indicating that the estimated κ values were unlikely to be a result of chance.
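
For readers unfamiliar with the κ statistic reported above, the following is a minimal sketch of how an unweighted Cohen's κ between two raters can be computed. It is an illustration only: the abstract does not state which κ variant (weighted or unweighted) or which software the authors used, the function name cohen_kappa is hypothetical, and the scores in the example are invented rather than taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases.

    Hypothetical helper for illustration; the abstract does not say
    which kappa variant the study used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")

    n = len(rater_a)

    # Observed agreement: fraction of cases given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category for every case.
        return 1.0

    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented component scores for ten cases (NOT data from the study).
    reader_1 = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
    reader_2 = [1, 2, 3, 1, 2, 3, 1, 2, 1, 1]
    print(round(cohen_kappa(reader_1, reader_2), 2))
```

κ corrects raw percent agreement for the agreement expected by chance, which is why two raters who agree on most cases can still have a modest κ when one category dominates the scores.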

Conclusions: The S.T.O.N.E. nephrolithometry has excellent interobserver reliability. Quantifying the S and N metrics was the most challenging and least reliable. Standardized protocols to measure these components should be considered to improve accuracy and reproducibility of the scoring system.

MeSH terms

Humans
Kidney Calculi / classification*
Kidney Calculi / diagnostic imaging
Kidney Calculi / pathology*
Kidney Calculi / surgery
Nephrostomy, Percutaneous
Observer Variation
Reproducibility of Results
Tomography, X-Ray Computed