Measuring up: Ensuring Intra- and Interobserver Reliability in Stretched Penile Length with the SPLINT Technique

J Indian Assoc Pediatr Surg. 2024 Nov-Dec;29(6):579-588. doi: 10.4103/jiaps.jiaps_107_24. Epub 2024 Nov 5.

Abstract

Background: A discrepancy between the true and measured value of stretched penile length (SPL) may result from errors that are either systematic or random. It is therefore important to focus on the quality of measurements to prevent iatrogenic harm to patients.

Objective: The objective of this study was to assess the magnitude of intra- and interobserver variations in the measurement of SPL with the SPLINT technique.

Materials and methods: SPL was measured prospectively in a cohort of 449 boys aged 0-14 years including 68 infants (substratified into Group I: <4 years, Group II: 4-8 years, and Group III: >8 years) with the SPLINT technique by expert (E: E1 and E2) and trainee (T: T1 and T2) surgeons after completing a three-tiered training module. Intra- and interobserver variability was assessed through descriptive statistics, intraclass correlation (ICC), relative technical error of measurement (rTEM), and reliability or R (%).
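The abstract does not state the formulas used for rTEM and R. As an illustration only, the sketch below assumes the conventional anthropometric definitions for two paired readings: TEM = sqrt(sum(d^2) / 2n), rTEM = 100 x TEM / grand mean, and R (%) = 100 x (1 - TEM^2 / SD^2), where d is the difference between paired readings and SD^2 is the sample variance of all readings pooled. The function names and the sample readings are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def tem(first, second):
    """Absolute technical error of measurement for paired readings (same units as input)."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty series of equal length")
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(sum_sq_diff / (2 * len(first)))


def relative_tem(first, second):
    """rTEM: TEM expressed as a percentage of the grand mean of all readings."""
    grand_mean = mean(list(first) + list(second))
    return 100 * tem(first, second) / grand_mean


def reliability_percent(first, second):
    """R (%): share of total variance not attributable to measurement error.

    Assumption: total variance is the sample variance of the pooled readings.
    """
    total_variance = variance(list(first) + list(second))
    return 100 * (1 - tem(first, second) ** 2 / total_variance)


# Hypothetical SPL readings in cm (illustrative, not study data)
reading_1 = [3.0, 4.0, 5.0, 6.0]
reading_2 = [3.1, 3.9, 5.1, 6.0]

print(f"TEM  = {tem(reading_1, reading_2):.3f} cm")
print(f"rTEM = {relative_tem(reading_1, reading_2):.2f} %")
print(f"R    = {reliability_percent(reading_1, reading_2):.2f} %")
```

The same functions apply to intraobserver data (two readings by one observer) and to interobserver data (one reading each from two observers). Because R depends on the spread of values in the sample, it should be compared within an age group rather than across groups.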

Results: Intraobserver variability: the mean difference between the two readings (E1 and E2) was 0.08 cm (95% confidence interval [CI]: 0.073-0.087), the ICC was 0.998 (95% CI: 0.997-0.998), and intraobserver variability was ≤0.1 cm in 85% of the participants (n = 370 of 433). The rTEM and reliability (%) were 1.82% and 98.1% (Group I), 1.65% and 98.9% (Group II), and 1.09% and 99.7% (Group III), respectively. Intraobserver variability was inversely correlated with the age of the participants (correlation coefficient = -0.56). Interobserver variability was calculated separately for expert versus expert (E-vs-E) and trainee versus trainee (T-vs-T) measurements. For E-vs-E, ICC, rTEM, and reliability (%) were 0.984, 2.4%, and 96.8% (Group I), 0.992, 2.07%, and 98.3% (Group II), and 0.997, 1.38%, and 99.05% (Group III), respectively. A similar pattern of variability was observed for T-vs-T measurements. The reliability (%) of SPL measurements by experts was consistently higher than that of trainees across all age groups; however, the difference narrowed with increasing participant age.

Conclusions: The study has validated the SPLINT technique by demonstrating a high level of intra- and interobserver reliability. It has also established the adequacy of the training modules for SPL measurement described in this study. The findings provide evidence that SPL can be used as an objective marker of penile dimensions.

Keywords: Intraclass correlation; SPLINT technique; interobserver variability; intraobserver variability; relative technical error of measurement; reliability; stretched penile length.