Background and purpose: To date, only a few small studies have attempted deep learning-based automatic segmentation of white matter hyperintensity (WMH) lesions in patients with cerebral infarction; the task is complicated because stroke-related lesions can obscure WMH borders. We developed and validated deep learning algorithms to segment WMH lesions accurately in patients with cerebral infarction using multisite data sets comprising 8421 patients with acute ischemic stroke.
Materials and methods: We included 8421 patients with stroke from 9 centers in Korea. 2D UNet and squeeze-and-excitation (SE)-UNet models were trained using 2408 FLAIR MRIs from 3 hospitals and validated using 6013 FLAIR MRIs from 6 hospitals. WMH segmentation performance was assessed by calculating the Dice similarity coefficient (DSC), the correlation coefficient, and the concordance correlation coefficient compared with a human-segmented criterion standard. In addition, we obtained an uncertainty index that represents overall ambiguity in the voxel classification for WMH segmentation in each patient based on the Kullback-Leibler divergence.
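The agreement metrics named above can be illustrated with a minimal sketch. It assumes that masks are flattened binary sequences and that WMH volumes are plain lists of numbers; the function names `dice`, `pearson_r`, and `concordance_cc` are hypothetical and are not taken from the study's code. The concordance correlation coefficient is written in Lin's form with population (1/n) moments, and the value returned for two empty masks is a convention chosen here, not one stated in the abstract.

```python
from math import sqrt
from statistics import fmean


def dice(pred, truth):
    """Dice similarity coefficient between two flat binary masks (0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def _moments(x, y):
    """Means, population variances, and population covariance of x and y."""
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return mx, my, vx, vy, cov


def pearson_r(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    _, _, vx, vy, cov = _moments(x, y)
    return cov / sqrt(vx * vy)


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient.

    Unlike Pearson's r, it also penalizes a systematic offset between
    the two sets of measurements.
    """
    mx, my, vx, vy, cov = _moments(x, y)
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)
```

For example, volumes `[1, 2, 3]` and `[2, 3, 4]` are perfectly correlated (r = 1) but have a concordance of only 4/7, because every automatic value is offset by one unit. This is why the concordance coefficient is a stricter measure of agreement in volume than the correlation coefficient alone.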
Results: In the training data set, the mean age was 67.4 (SD, 13.0) years, and 60.4% were men. The mean (95% CI) DSCs for UNet in internal testing and external validation were, respectively, 0.659 (0.649-0.669) and 0.710 (0.707-0.714), which were slightly lower than the interrater reliability between human readers (DSC = 0.744; 95% CI, 0.738-0.751; P = .031). Compared with the UNet, the SE-UNet demonstrated better performance, achieving a mean DSC of 0.675 (95% CI, 0.666-0.685; P < .001) in the internal testing and 0.722 (95% CI, 0.719-0.726; P < .001) in the external validation; moreover, it achieved high DSC values (ranging from 0.672 to 0.744) across multiple validation data sets. We observed a significant correlation between WMH volumes that were segmented automatically and manually for the UNet (r = 0.917, P < .001), and the correlation was even stronger for the SE-UNet (r = 0.933, P < .001). The SE-UNet also attained a high concordance correlation coefficient (ranging from 0.841 to 0.956) in the external test data sets. In addition, the uncertainty indices in most patients (86%) in the external data sets were <0.35, with an average DSC of 0.744 in these patients.
Conclusions: We developed and validated deep learning algorithms to segment WMH in patients with acute cerebral infarction using the largest-ever MRI data sets. In addition, we showed that the uncertainty index can be used to identify cases in which automatic WMH segmentation is less accurate and requires human review.
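As a rough illustration of how such an index might be used for triage, the sketch below splits cases at the 0.35 value reported in the Results. It assumes that a per-patient uncertainty index has already been computed and that higher values indicate greater ambiguity; the abstract does not give the formula for the index, so none is attempted here. The function name `triage`, the patient identifiers, and the treatment of values exactly at the cutoff are hypothetical.

```python
# Cutoff taken from the Results (indices < 0.35 in 86% of external patients).
# Assumption: a value at or above the cutoff is routed to human review.
REVIEW_THRESHOLD = 0.35


def triage(uncertainty_by_patient, threshold=REVIEW_THRESHOLD):
    """Split patients into auto-accepted and human-review groups.

    uncertainty_by_patient: dict mapping a patient ID to its uncertainty index.
    Returns two sorted lists of patient IDs: (accepted, needs_review).
    """
    accepted = sorted(
        pid for pid, u in uncertainty_by_patient.items() if u < threshold
    )
    needs_review = sorted(
        pid for pid, u in uncertainty_by_patient.items() if u >= threshold
    )
    return accepted, needs_review


# Hypothetical indices, for illustration only.
accepted, needs_review = triage({"p01": 0.12, "p02": 0.35, "p03": 0.58})
```

Here `p01` is accepted automatically, whereas `p02` and `p03` are flagged for review. In practice, the cutoff would need to be chosen against the accuracy that is acceptable for a given application.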
© 2024 by American Journal of Neuroradiology.