Background and Purpose- We evaluated how well deep learning algorithms segment acute ischemic lesions on heterogeneous multi-center clinical diffusion-weighted magnetic resonance imaging (MRI) data sets and explored the potential role of this tool for phenotyping acute ischemic stroke.
Methods- Ischemic stroke data sets from the MRI-GENIE (MRI-Genetics Interface Exploration) repository, comprising 12 international genetic research centers, were retrospectively analyzed using an automated deep learning segmentation algorithm consisting of an ensemble of 3-dimensional convolutional neural networks. Three ensembles were trained using data from the following: (1) 267 patients from an independent single-center cohort, (2) 267 patients from MRI-GENIE, and (3) a mixture of (1) and (2). The algorithms' performances were compared against manual outlines from a separate 383-patient subset from MRI-GENIE. Univariable and multivariable logistic regression with respect to demographics, stroke subtypes, and vascular risk factors were performed to identify phenotypes associated with large acute diffusion-weighted MRI volumes and greater stroke severity in 2770 MRI-GENIE patients. Stroke topography was also investigated.
Results- The ensemble consisting of a mixture of MRI-GENIE and single-center convolutional neural networks performed best. Subset analysis comparing automated and manual lesion volumes in 383 patients found excellent correlation (ρ=0.92; P<0.0001). Median (interquartile range) diffusion-weighted MRI lesion volume in the 2770 patients was 3.7 cm³ (0.9-16.6 cm³). Patients with the small artery occlusion stroke subtype had smaller lesion volumes (P<0.0001) and different topography compared with other stroke subtypes.
Conclusions- Accurate automated segmentation of clinical diffusion-weighted MRI lesions using deep learning algorithms trained with diverse multi-center data is feasible. Given a sufficient sample size, both lesion volume and topography from large heterogeneous multi-center clinical imaging phenotype data sets can provide insight into stroke subtypes.
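As an illustration of how an ensemble's outputs could be turned into the lesion volumes reported above, the following is a minimal sketch only. The abstract does not state how the networks' outputs are combined, so the averaging of per-model probability maps, the 0.5 threshold, the function name `ensemble_lesion_volume_cm3`, and the voxel dimensions are all assumptions made for this example.

```python
import numpy as np


def ensemble_lesion_volume_cm3(prob_maps, voxel_size_mm, threshold=0.5):
    """Combine per-model probability maps and measure lesion volume.

    prob_maps: list of 3D arrays of identical shape, one per network,
        holding per-voxel lesion probabilities in [0, 1] (hypothetical input).
    voxel_size_mm: (dx, dy, dz) voxel dimensions in millimeters.
    threshold: assumed cutoff applied to the averaged probability.
    Returns the binary lesion mask and its volume in cm^3.
    """
    # Assumption: the ensemble prediction is the voxel-wise mean of the models.
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    mask = mean_prob >= threshold

    # Volume = number of lesion voxels x voxel volume; 1 cm^3 = 1000 mm^3.
    voxel_volume_mm3 = float(np.prod(voxel_size_mm))
    volume_cm3 = float(mask.sum()) * voxel_volume_mm3 / 1000.0
    return mask, volume_cm3


if __name__ == "__main__":
    # Two toy "model outputs" on a 2x2x2 grid with 1 x 1 x 6 mm voxels.
    model_a = np.zeros((2, 2, 2))
    model_b = np.zeros((2, 2, 2))
    model_a[0] = 0.9
    model_b[0] = 0.3
    mask, volume = ensemble_lesion_volume_cm3([model_a, model_b], (1.0, 1.0, 6.0))
    print(int(mask.sum()), volume)
```

Volumes computed this way for automated and manual masks could then be compared with a rank correlation such as the ρ reported in the Results.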
Keywords: diffusion magnetic resonance imaging; machine learning; phenotype; risk factors; stroke.