Acute stroke management involves rapid and accurate interpretation of CTA imaging data. However, generalizable models that can learn from unlabeled data across multiple acute stroke tasks do not exist. We propose linear-probed self-supervised contrastive learning that uses 3D CTA images and the findings section of radiologists' reports for pretraining. The pretrained model was then applied to four disparate tasks: large vessel occlusion (LVO) detection, acute ischemic stroke detection, acute ischemic stroke versus intracerebral hemorrhage classification, and ischemic core volume prediction. The chosen tasks are particularly challenging because their labels cannot be extracted directly from the findings section of the radiology reports using keywords. The difficulty is compounded by the 3D feature representations required for tasks such as LVO detection. All imaging models were trained from scratch. In the pretraining phase, our dataset comprised 1,542 pairs of 3D CTA brain volumes and corresponding radiologists' reports from 3 sites, without any additional labels. To test generalizability, we performed the fine-tuning and testing phases with labeled data from another site, using CTA brain volumes from 592 subjects. In our experiments, we evaluated the influence of linear probing during the pretraining phase and found that, on average, it enhanced the model's generalizability, as shown by improved classification performance with the appropriate text encoder. Our findings indicate that the best-performing models generalize robustly to out-of-distribution data across multiple tasks. In all scenarios, linear probing during pretraining yielded better predictive performance than a standard pretraining strategy. Furthermore, pretraining with report findings conferred significant performance advantages over training the imaging encoder solely on labeled data.
Keywords: Acute stroke; Multi-task; Self-supervised learning.
Copyright © 2024 Elsevier Ltd. All rights reserved.
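
For readers unfamiliar with the setup, the sketch below illustrates one way CLIP-style image-report contrastive pretraining with a linear-probed text encoder could be arranged, under one plausible reading of the abstract (a frozen report encoder with only a linear projection trained, alongside a 3D image encoder trained from scratch). It assumes PyTorch; the encoder architectures, dimensions, and names (Image3DEncoder, LinearProbedTextEncoder, clip_contrastive_loss) are hypothetical placeholders and not the authors' implementation.

    # Minimal sketch of image-report contrastive pretraining with a
    # linear-probed (frozen) text encoder. Assumptions: PyTorch; toy
    # architectures; this is illustrative only, not the authors' method.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Image3DEncoder(nn.Module):
        """Toy 3D CNN standing in for the CTA image encoder (trained from scratch)."""
        def __init__(self, embed_dim=256):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),
            )
            self.proj = nn.Linear(32, embed_dim)

        def forward(self, x):                      # x: (B, 1, D, H, W)
            h = self.features(x).flatten(1)        # (B, 32)
            return self.proj(h)                    # (B, embed_dim)

    class LinearProbedTextEncoder(nn.Module):
        """Frozen report encoder; only the linear projection ("probe") is trained."""
        def __init__(self, backbone, backbone_dim, embed_dim=256):
            super().__init__()
            self.backbone = backbone
            for p in self.backbone.parameters():   # freeze the pretrained weights
                p.requires_grad = False
            self.probe = nn.Linear(backbone_dim, embed_dim)

        def forward(self, report_features):
            with torch.no_grad():
                h = self.backbone(report_features) # (B, backbone_dim) report features
            return self.probe(h)                   # (B, embed_dim)

    def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
        """Symmetric InfoNCE over matched image-report pairs within a batch."""
        img_emb = F.normalize(img_emb, dim=-1)
        txt_emb = F.normalize(txt_emb, dim=-1)
        logits = img_emb @ txt_emb.t() / temperature   # (B, B) similarity matrix
        targets = torch.arange(img_emb.size(0), device=img_emb.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    if __name__ == "__main__":
        # Random tensors stand in for a batch of CTA volumes and pre-computed
        # report features; a real pipeline would tokenize the findings text and
        # run a pretrained clinical language model as the frozen backbone.
        image_encoder = Image3DEncoder()
        text_encoder = LinearProbedTextEncoder(backbone=nn.Identity(), backbone_dim=768)
        volumes = torch.randn(4, 1, 32, 64, 64)
        report_feats = torch.randn(4, 768)
        loss = clip_contrastive_loss(image_encoder(volumes), text_encoder(report_feats))
        loss.backward()

After such pretraining, the image encoder could be fine-tuned or probed on the labeled downstream tasks (LVO detection, stroke detection and classification, core volume prediction); the exact fine-tuning protocol is described in the paper, not in this sketch.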