Background & objective: Automatic lesion segmentation techniques on MRI scans of people with multiple sclerosis (pwMS) could support lesion detection and segmentation in trials and clinical practice. However, knowledge of their reliability across scanners is limited, hampering clinical implementation. The aim of this study was to investigate the within-scanner repeatability and between-scanner reproducibility of lesion segmentation tools in pwMS across three different scanners, and to examine their accuracy compared to manual segmentations with and without optimization.
Methods: Thirty pwMS underwent a scan and rescan on three MRI scanners: GE Discovery MR750 (3.0 T), Siemens Sola (1.5 T) and Siemens Vida (3.0 T). 3D fluid-attenuated inversion recovery (3D-FLAIR) and 3D T1-weighted scans were acquired on each scanner. Lesion segmentation involved preprocessing and automatic segmentation using the Lesion Segmentation Toolbox (LST) and nicMSlesions (nicMS), as well as manual segmentation. Both automated segmentation techniques were used with default settings and with settings optimized to match manual segmentations, both for each scanner specifically and for the three scanners combined. LST settings were optimized by adjusting the threshold to improve the Dice similarity coefficient (DSC), once for each scanner separately and once as a combined threshold for all scanners. For nicMS, the last layers were retrained, once with the multi-scanner data to represent a combined optimization and once separately for each scanner for scanner-specific optimization. Lesion volumes and counts were extracted. DSC was calculated to assess accuracy, and reliability was assessed using intra-class correlation coefficients (ICC). Differences in DSC between software were tested with a repeated measures ANOVA and, when appropriate, post-hoc paired t-tests using Bonferroni correction.
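As an illustration of the threshold optimization described for LST, the following is a minimal sketch and not the study's actual pipeline. It assumes each scan yields a voxel-wise lesion probability map and a binary manual mask as NumPy arrays; the function names `dice` and `best_threshold` and the candidate threshold grid are hypothetical. Passing the scans of one scanner would correspond to a scanner-specific threshold, and passing the scans of all scanners would correspond to a combined threshold.

```python
import numpy as np


def dice(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Both masks are empty: treated here as perfect agreement (an assumption).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denom


def best_threshold(prob_maps, manual_masks, thresholds):
    """Return the threshold with the highest mean DSC over the given scans.

    prob_maps    : list of arrays with lesion probabilities in [0, 1] (assumed input)
    manual_masks : list of binary arrays with the manual segmentations
    thresholds   : candidate thresholds to evaluate (hypothetical grid)
    """
    mean_dsc = [
        np.mean([dice(np.asarray(p) >= t, m)
                 for p, m in zip(prob_maps, manual_masks)])
        for t in thresholds
    ]
    i = int(np.argmax(mean_dsc))  # first maximum wins in case of ties
    return thresholds[i], float(mean_dsc[i])


if __name__ == "__main__":
    # Toy 1D example standing in for one scan.
    prob = np.array([0.9, 0.6, 0.2, 0.1])
    manual = np.array([1, 1, 0, 0])
    print(best_threshold([prob], [manual], [0.05, 0.5, 0.95]))
```

The mean DSC over scans is used as the selection criterion here for simplicity; the source states only that the threshold was adjusted to improve DSC, so the exact criterion is an assumption.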
Results: Scanner-specific optimization significantly improved DSC for LST compared to default and combined settings, except for the GE scanner. NicMS showed significantly higher DSC for both the scanner-specific and combined optimization than for default settings. Within-scanner repeatability was excellent (ICC>0.9) for volume and counts. Between-scanner ICC for volume was higher between Vida and Sola (0.94-0.99) than between GE MR750 and Vida or Sola (0.18-0.93). For the comparisons involving the GE MR750, ICCs were higher for scanner-specific nicMS (0.87-0.93) than for the other approaches (0.18-0.79). This difference was not present for Sola vs. Vida, where all ICCs were excellent (>0.94).
Conclusion: Scanner-specific optimization strategies proved effective in mitigating inter-scanner variability, addressing the issue of insufficient reproducibility and accuracy found with default settings.
Keywords: Accuracy; Lesion segmentation; Multiple sclerosis; Reliability.
Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.