Large multimodality model fine-tuned for detecting breast and esophageal carcinomas on CT: a preliminary study

Jpn J Radiol. 2024 Dec 13. doi: 10.1007/s11604-024-01718-w. Online ahead of print.

Abstract

Purpose: This study aimed to develop a large multimodality model (LMM) that can detect breast and esophageal carcinomas on chest contrast-enhanced CT.

Materials and methods: In this retrospective study, CT images of 401 (age, 62.9 ± 12.9 years; 169 males), 51 (age, 65.5 ± 11.6 years; 23 males), and 120 (age, 64.6 ± 14.2 years; 60 males) patients were used in the training, validation, and test phases, respectively. The numbers of CT images with breast carcinoma, esophageal carcinoma, and no lesion were 927, 2180, and 2087; 80, 233, and 270; and 184, 246, and 6919 for the training, validation, and test datasets, respectively. The LMM was fine-tuned using CT images as input and text data ("suspicious of breast carcinoma"/"suspicious of esophageal carcinoma"/"no lesion") as reference data on a desktop computer equipped with a single graphics processing unit. Because of the random nature of the training process, supervised learning was performed 10 times. The model that performed best on the validation dataset was then evaluated on the time-independent test dataset. Detection performance was evaluated by calculating the area under the receiver operating characteristic curve (AUC).
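
The selection and evaluation steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not state how a continuous score was derived from the LMM's text output, so the per-image scores here are hypothetical, as are the function names `auc` and `best_run`. The AUC is computed in its rank-based form, as the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve for one class versus the rest.

    labels: 1 for images with the target lesion, 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_run(val_scores_per_run: List[Sequence[float]],
             val_labels: Sequence[int]) -> Tuple[int, float]:
    """Return (index, validation AUC) of the run with the highest validation AUC.

    Assumption: runs are ranked by validation AUC; the abstract only says
    the best-performing model on the validation dataset was chosen.
    """
    aucs = [auc(scores, val_labels) for scores in val_scores_per_run]
    best = max(range(len(aucs)), key=aucs.__getitem__)
    return best, aucs[best]


# Toy validation data: two hypothetical training runs scored on four images.
toy_labels = [1, 1, 0, 0]
toy_runs = [[0.9, 0.4, 0.6, 0.1], [0.9, 0.8, 0.3, 0.1]]
chosen_index, chosen_auc = best_run(toy_runs, toy_labels)
```

In the study, the same idea would apply to 10 runs and to each carcinoma type separately; only the selected run would then be scored on the test dataset.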

Results: The sensitivities of the fine-tuned LMM for detecting breast and esophageal carcinomas in the test dataset were 0.929 and 0.951, respectively. The diagnostic performance of the fine-tuned LMM for detecting breast and esophageal carcinomas was high, with AUCs of 0.890 (95% CI 0.871-0.909) and 0.880 (95% CI 0.865-0.894), respectively.
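
For reference, sensitivity is the fraction of images that truly contain a given lesion and that the model labels as such, TP / (TP + FN). The sketch below computes it from predicted and reference text labels. The function name `sensitivity` and the toy label lists are hypothetical, and the abstract does not state whether sensitivity was computed per image or per patient; a per-image calculation is assumed here.

```python
from typing import Sequence

BREAST = "suspicious of breast carcinoma"
ESOPHAGEAL = "suspicious of esophageal carcinoma"
NO_LESION = "no lesion"


def sensitivity(predicted: Sequence[str], reference: Sequence[str],
                target: str) -> float:
    """TP / (TP + FN) for one target label."""
    tp = sum(1 for p, r in zip(predicted, reference)
             if r == target and p == target)
    fn = sum(1 for p, r in zip(predicted, reference)
             if r == target and p != target)
    if tp + fn == 0:
        raise ValueError("no reference images carry the target label")
    return tp / (tp + fn)


# Toy data (not study data): 4 breast, 2 esophageal, 4 no-lesion images.
reference = [BREAST] * 4 + [ESOPHAGEAL] * 2 + [NO_LESION] * 4
predicted = ([BREAST, BREAST, BREAST, NO_LESION]   # one breast lesion missed
             + [ESOPHAGEAL, ESOPHAGEAL]            # both esophageal lesions found
             + [NO_LESION, NO_LESION, BREAST, NO_LESION])  # one false positive

breast_sens = sensitivity(predicted, reference, BREAST)
esophageal_sens = sensitivity(predicted, reference, ESOPHAGEAL)

# Rough consistency check under the per-image assumption: the reported
# sensitivities applied to the test-set image counts from the abstract.
approx_breast_hits = round(0.929 * 184)
approx_esophageal_hits = round(0.951 * 246)
```

Note that the false positive in the toy data does not affect sensitivity; it would lower specificity, which the abstract does not report.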

Conclusions: The usefulness of large multimodality models in chest cancer imaging had not previously been assessed. In this preliminary study, the fine-tuned LMM detected both breast and esophageal carcinomas on chest contrast-enhanced CT with high diagnostic performance (AUCs of 0.890 and 0.880, respectively).

Keywords: Breast carcinoma; Computed tomography; Deep learning; Esophageal carcinoma; Large multimodality model.