Benchmarking of Deformable Image Registration for Multiple Anatomic Sites Using Digital Data Sets With Ground-Truth Deformation Vector Fields

Pract Radiat Oncol. 2021 Sep-Oct;11(5):404-414. doi: 10.1016/j.prro.2021.02.012. Epub 2021 Mar 17.

Abstract

Purpose: This study aimed to evaluate the accuracy of deformable image registration (DIR) algorithms using data sets with different levels of ground-truth deformation vector fields (DVFs) and to investigate the correlation between DVF errors and contour-based metrics.

Methods and materials: Nine pairs of digital data sets were generated with the ImSimQA software by applying contour-controlled deformations to the CT scans of 3 anonymized patients (head and neck, thorax/abdomen, and pelvis), with low, medium, and high deformation intensity for each site. Image pairs and their associated contours were imported into the MIM-Maestro, Raystation, and Velocity systems, followed by DIR and contour propagation. The system-generated DVFs and propagated contours were compared with the ground-truth data. The correlation between DVF errors and contour-based metrics was evaluated using the Pearson correlation coefficient (r), while the correlation of each with structure volume was calculated using the Spearman correlation coefficient (rho).
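The comparison against ground truth described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the flat per-voxel list layout, and the choice of the Euclidean magnitude of the vector difference as the per-voxel DVF error are all assumptions made for illustration. The Dice similarity coefficient follows its standard definition, 2|A∩B| / (|A| + |B|).

```python
import math


def dvf_error_stats(dvf_test, dvf_truth, mask):
    """Return (min, mean, max) DVF error in mm over the voxels inside a contour.

    dvf_test, dvf_truth: sequences of (dx, dy, dz) displacement vectors in mm,
        one per voxel (hypothetical flat layout, assumed for this sketch).
    mask: sequence of booleans, True where the voxel lies inside the contour.
    """
    errors = [
        math.dist(v_test, v_true)  # Euclidean length of the vector difference
        for v_test, v_true, inside in zip(dvf_test, dvf_truth, mask)
        if inside
    ]
    if not errors:
        raise ValueError("mask selects no voxels")
    return min(errors), sum(errors) / len(errors), max(errors)


def dice(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks of equal length."""
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)
```

In this sketch, `dvf_error_stats` would be called once per structure and per system, with `mask` taken from the ground-truth contour, and `dice` would compare the propagated contour mask with the ground-truth mask.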

Results: The DVF errors increased with increasing deformation intensity. All DIR algorithms performed well for the esophagus, trachea, left femoral, right femoral, and urethral structures (mean and maximum DVF errors <2.50 mm and <4.27 mm, respectively; Dice similarity coefficient: 0.93-0.99). The brain, liver, left lung, and bladder showed large DVF errors for all 3 systems (dmax: 2.8-91.90 mm). The minimum and maximum DVF errors, conformity index, and Dice similarity coefficient were correlated with structure volume (|rho|: 0.41-0.64), especially for very large or very small structures (|rho|: 0.64-0.80). Among the contour-based metrics, only the mean distance to agreement for Raystation and Velocity correlated with some indices of DVF error (r: 0.70-0.78).
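For readers less familiar with the two coefficients reported here, the following Python sketch shows how r and rho are conventionally computed. It is illustrative only and does not reproduce the study's analysis: the function names are hypothetical, and ties are handled with average ranks, which is the usual convention but an assumption with respect to this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: the Pearson coefficient of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because rho depends only on rank order, it captures any monotonic relationship, which suits a variable such as structure volume that spans several orders of magnitude; r measures linear association only.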

Conclusions: Most contour-based metrics had no correlation with DVF errors. For adaptive radiation therapy, accurate contour propagation does not by itself indicate accurate dose deformation and summation/accumulation within each contour, which are determined by DVF accuracy. Tolerance values for DVF errors should vary, because the acceptable accuracy for overall adaptive radiation therapy depends on factors such as anatomic site, deformation intensity, and organ size. This study provides benchmark tables for evaluating DIR accuracy in various clinical scenarios.

MeSH terms

  • Algorithms
  • Benchmarking*
  • Head
  • Humans
  • Image Processing, Computer-Assisted*
  • Male
  • Software