Generalizable self-supervised learning for brain CTA in acute stroke

Comput Biol Med. 2024 Nov 12:184:109337. doi: 10.1016/j.compbiomed.2024.109337. Online ahead of print.

Abstract

Acute stroke management involves rapid and accurate interpretation of CTA imaging data. However, generalizable models for multiple acute stroke tasks that are able to learn from unlabeled data do not exist. We propose a linear-probed self-supervised contrastive learning approach that uses 3D CTA images and the findings section of radiologists' reports for pretraining. The pretrained model was subsequently applied to four disparate tasks: large vessel occlusion (LVO) detection, acute ischemic stroke detection, intracerebral hemorrhage classification, and ischemic core volume prediction. These tasks are particularly challenging because they cannot be extracted directly from the radiology report findings with keywords, and the difficulty is compounded by the 3D feature representation required by tasks such as LVO detection. All imaging models were trained from scratch. In the pretraining phase, our dataset comprised 1,542 pairs of 3D brain CTA volumes and corresponding radiologists' reports from 3 sites, without any additional labels. To test generalizability, we performed the fine-tuning and testing phases with labeled data from another site, comprising brain CTA volumes from 592 subjects. In our experiments, we evaluated the influence of linear probing during the pretraining phase and found that, on average, it enhanced our model's generalizability, as shown by improved classification performance with the appropriate text encoder. Our findings indicate that the best-performing models generalize robustly to out-of-distribution data across multiple tasks. In all scenarios, linear probing during pretraining yielded superior predictive performance compared to a standard strategy. Furthermore, pretraining with report findings conferred significant performance advantages over training the imaging encoder solely on labeled data.
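
The abstract describes image-report contrastive pretraining combined with a linear probe. The Python sketch below illustrates one plausible way such an objective could be assembled; the image_encoder, text_encoder, probe head, and all hyperparameters are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastivePretrainer(nn.Module):
    """Minimal sketch of CLIP-style image-report contrastive pretraining
    with an auxiliary linear probe. Encoders are placeholders: e.g. a 3D CNN
    over CTA volumes and a transformer over report findings text."""

    def __init__(self, image_encoder, text_encoder, embed_dim=512, num_probe_classes=2):
        super().__init__()
        self.image_encoder = image_encoder
        self.text_encoder = text_encoder
        # Learnable temperature, initialized near log(1/0.07) as in CLIP.
        self.logit_scale = nn.Parameter(torch.tensor(2.659))
        # Hypothetical linear probe on image features used during pretraining.
        self.probe = nn.Linear(embed_dim, num_probe_classes)

    def forward(self, volumes, report_tokens, probe_labels=None):
        # L2-normalized embeddings for the paired 3D volumes and report findings.
        img = F.normalize(self.image_encoder(volumes), dim=-1)      # (B, D)
        txt = F.normalize(self.text_encoder(report_tokens), dim=-1)  # (B, D)

        # Symmetric InfoNCE loss over in-batch image-report pairs.
        logits = self.logit_scale.exp() * img @ txt.t()              # (B, B)
        targets = torch.arange(logits.size(0), device=logits.device)
        loss = 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

        if probe_labels is not None:
            # Probe trained on detached features so gradients do not reach the encoder.
            loss = loss + F.cross_entropy(self.probe(img.detach()), probe_labels)
        return loss

In this sketch the contrastive term aligns each CTA volume with its own report findings against the other pairs in the batch, while the probe term only trains the linear head, leaving the encoder updates driven by the contrastive objective.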

Keywords: Acute stroke; Multi-task; Self-supervised learning.