Purpose: To evaluate the impact of domain adaptation on the performance of language models in predicting five-point Deauville scores on the basis of clinical fluorine 18 fluorodeoxyglucose PET/CT reports.
Materials and methods: The authors retrospectively retrieved 4542 text reports and images for fluorodeoxyglucose PET/CT lymphoma examinations from 2008 to 2018 in the University of Wisconsin-Madison institutional clinical imaging database. Of these reports, 1664 contained Deauville scores, which were extracted and served as training labels. The bidirectional encoder representations from transformers (BERT) model and initialized BERT models BioClinicalBERT, RadBERT, and RoBERTa were adapted to the nuclear medicine domain by pretraining using masked language modeling. These domain-adapted models were then compared with the non-domain-adapted versions on the task of five-point Deauville score prediction. The language models were also compared against vision models, multimodal vision-language models, and a nuclear medicine physician, using sevenfold Monte Carlo cross-validation. Means and SDs for accuracy are reported, with P values from paired t tests.
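The evaluation scheme described above (repeated random splits, then a paired comparison of per-split accuracies) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 20% test fraction, and the fixed seed are assumptions made for illustration, and only the seven repeats and the 1664 labeled reports come from the abstract. The sketch computes the paired t statistic only; a P value would come from the t distribution with the returned degrees of freedom.

```python
import math
import random
import statistics


def monte_carlo_splits(n_items, n_repeats=7, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) for repeated random splits.

    test_fraction and seed are illustrative assumptions; the abstract
    only states that seven Monte Carlo repeats were used.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_items * test_fraction))
    for _ in range(n_repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)  # a fresh random permutation for every repeat
        yield idx[n_test:], idx[:n_test]


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    a[i] and b[i] must be accuracies measured on the same split i.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Seven random splits of the 1664 labeled reports.
splits = list(monte_carlo_splits(1664))
```

In use, each model would be trained and scored on the same seven splits, and the two lists of per-split accuracies would then be passed to `paired_t_statistic`, so that the comparison is made pairwise on identical test sets.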
Results: Domain adaptation improved the performance of all language models (P = .01). For example, BERT improved from 61.3% ± 2.9 (SD) five-class accuracy to 65.7% ± 2.2 (P = .01) following domain adaptation. Domain-adapted RoBERTa (named DA RoBERTa) performed best, achieving 77.4% ± 3.4 five-class accuracy; this model performed similarly to its multimodal counterpart (named Multimodal DA RoBERTa) (77.2% ± 3.2) and outperformed the best vision-only model (48.1% ± 3.5, P ≤ .001). A physician given the task on a subset of the data had a five-class accuracy of 66%.
Conclusion: Domain adaptation improved the performance of large language models in predicting Deauville scores in PET/CT reports.
Supplemental material is available for this article. See also the commentary by Abajian in this issue.
Keywords: Artificial Intelligence; Convolutional Neural Network (CNN); Deauville; Language Modeling; Lymphoma; Machine Learning; Multimodal Learning; Natural Language Processing; Nuclear Medicine; PET; PET/CT; Transfer Learning; Unsupervised Learning.
© 2023 by the Radiological Society of North America, Inc.