Evaluating the dosimetric impact of deep-learning-based auto-segmentation in prostate cancer radiotherapy: Insights into real-world clinical implementation and inter-observer variability

J Appl Clin Med Phys. 2024 Dec 1:e14569. doi: 10.1002/acm2.14569. Online ahead of print.

Abstract

Purpose: This study aimed to investigate the dosimetric impact of deep-learning-based auto-contouring for clinical target volume (CTV) and organs at risk (OARs) delineation in prostate cancer radiotherapy planning. Additionally, we compared the geometric accuracy of the auto-contouring system to the variability observed between human experts.

Methods: We evaluated 28 planning CT volumes, each with three contour sets: reference original contours (OC), auto-segmented contours (AC), and expert-defined manual contours (EC). We generated three-dimensional conformal radiation therapy (3D-CRT) and intensity-modulated radiation therapy (IMRT) plans for each contour set and compared their dosimetric characteristics using dose-volume histograms (DVHs), homogeneity index (HI), conformity index (CI), and gamma pass rate (3%/3 mm).
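As an illustration of the DVH-based metrics named above, the following is a minimal sketch of how a D95 value and a homogeneity index could be computed from a dose grid and a binary structure mask. The abstract does not state which HI definition was used; the ICRU Report 83 form, HI = (D2 - D98) / D50, is assumed here. The function names, the synthetic dose grid, and the equal-voxel-volume assumption are all hypothetical and are not taken from the study.

```python
import numpy as np

def dose_at_volume(dose, mask, volume_percent):
    """Return D_x in the units of `dose`: the minimum dose received by the
    hottest x% of the structure. Assumes all voxels have equal volume."""
    voxels = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    if voxels.size == 0:
        raise ValueError("structure mask is empty")
    # D95 is the dose exceeded by 95% of voxels, i.e. the 5th percentile.
    return float(np.percentile(voxels, 100.0 - volume_percent))

def homogeneity_index(dose, mask):
    """Assumed ICRU 83 definition: (D2 - D98) / D50. Lower is more uniform."""
    d2 = dose_at_volume(dose, mask, 2)
    d98 = dose_at_volume(dose, mask, 98)
    d50 = dose_at_volume(dose, mask, 50)
    return (d2 - d98) / d50

# Synthetic example (not study data): a noisy dose grid and a cubic "PTV".
rng = np.random.default_rng(0)
example_dose = rng.normal(loc=78.0, scale=0.5, size=(20, 20, 20))
example_ptv = np.zeros((20, 20, 20), dtype=bool)
example_ptv[5:15, 5:15, 5:15] = True

print("D95:", dose_at_volume(example_dose, example_ptv, 95))
print("HI :", homogeneity_index(example_dose, example_ptv))
```

In practice the dose grid and masks would come from the treatment planning system export, and unequal voxel sizes would require volume weighting rather than a plain percentile.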

Results: The geometric differences between automated contours and both their original manual reference contours and a second set of manually generated contours were smaller than the differences between the two manually contoured sets for the bladder, right femoral head (RFH), and left femoral head (LFH) structures. Furthermore, dose distributions for plans using planning target volumes (PTVs) derived from automatically contoured CTVs and auto-contoured OARs were consistent with those of plans based on reference contours across all evaluated cases for both 3D-CRT and IMRT plans. For example, in IMRT plans, the average D95 for PTVs was 77.71 ± 0.53 Gy for EC plans, 77.58 ± 0.69 Gy for OC plans, and 77.62 ± 0.38 Gy for AC plans. Automated contouring significantly reduced contouring time, averaging 0.53 ± 0.08 min compared to 24.9 ± 4.5 min for manual delineation.
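The geometric comparison described above (automated versus manual, and manual versus manual) can be sketched with an overlap metric. The abstract does not name the metric used; the Dice similarity coefficient is assumed here only because it is a common choice for contour agreement. The function name and the toy masks are hypothetical and do not represent study data.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (1.0 = identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = int(a.sum()) + int(b.sum())
    if total == 0:
        return 1.0  # Convention chosen here: two empty masks agree perfectly.
    return 2.0 * int(np.logical_and(a, b).sum()) / total

# Toy 2D masks standing in for one structure in the three contour sets.
oc = np.zeros((10, 10), dtype=bool)
oc[2:8, 2:8] = True          # original contour
ac = np.zeros((10, 10), dtype=bool)
ac[2:8, 3:8] = True          # auto-segmented contour
ec = np.zeros((10, 10), dtype=bool)
ec[3:9, 2:8] = True          # second expert contour

print("AC vs OC:", dice_coefficient(ac, oc))
print("AC vs EC:", dice_coefficient(ac, ec))
print("OC vs EC:", dice_coefficient(oc, ec))
```

Comparing the first two values against the third mirrors the question posed in the study: whether automated contours differ from each manual set by less than the two manual sets differ from each other.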

Conclusion: Our automated contouring system can reduce inter-expert variability and achieve dosimetric accuracy comparable to gold-standard reference contours, highlighting its potential for streamlining clinical workflows. The quantitative analysis revealed no consistent increase or decrease in doses to PTVs derived from automatically contoured CTVs, or in OAR doses, attributable to automated contours, indicating minimal impact on treatment outcomes. These findings support the clinical feasibility of utilizing our deep-learning-based auto-contouring model for prostate cancer radiotherapy planning.

Keywords: auto‐contouring; deep learning; inter‐observer variability; prostate segmentation; radiotherapy treatment planning.