Multi-institutional Evaluation and Training of Breast Density Classification AI Algorithm Using ACR Connect and AI-LAB

J Am Coll Radiol. 2024 Nov 15:S1546-1440(24)00912-8. doi: 10.1016/j.jacr.2024.11.003. Online ahead of print.

Abstract

Objective: To demonstrate and test the capabilities of the American College of Radiology (ACR) Connect and AI-LAB software platforms by implementing multi-institutional artificial intelligence (AI) training and validation for breast density classification.

Methods: In this proof-of-concept study, six U.S.-based hospitals installed Connect and AI-LAB. A breast density algorithm was trained and tested on retrospective mammograms. We recorded the time to receive institutional review board (IRB) approval, to install the software locally, and to complete testing and training. We calculated the performance of the breast density algorithm at each participating hospital and compared it to its performance on a hold-out multi-institutional clinical trial testing dataset and a retrospective multi-institutional dataset. We also calculated the performance of the locally fine-tuned models on the hold-out test datasets.
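
The per-site evaluation described above can be illustrated with a minimal sketch. The abstract does not name the performance metric, so the linear-weighted Cohen's kappa used here is an assumption, chosen only because it is a common way to score agreement on the four ordinal BI-RADS density categories. The function names and all toy values are hypothetical and are not data from the study.

```python
from collections import Counter
from statistics import median

# BI-RADS breast density categories, ordered from least to most dense.
CATEGORIES = ["A", "B", "C", "D"]


def linear_weighted_kappa(reference, predicted, categories=CATEGORIES):
    """Linear-weighted Cohen's kappa between two label sequences.

    ASSUMPTION: the study's actual metric is not stated in the abstract;
    this metric is used purely for illustration.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and equal length")
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(reference)
    observed = Counter((index[r], index[p]) for r, p in zip(reference, predicted))
    ref_totals = Counter(index[r] for r in reference)
    pred_totals = Counter(index[p] for p in predicted)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 for A vs D
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * ref_totals[i] * pred_totals[j] / (n * n)

    if expected_disagreement == 0:
        # Both sequences use one identical category; kappa is undefined.
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement


def evaluate_sites(site_labels):
    """Map each site name to its kappa, given (reference, predicted) pairs."""
    return {
        site: linear_weighted_kappa(ref, pred)
        for site, (ref, pred) in site_labels.items()
    }


# Hypothetical toy inputs, for illustration only.
toy_sites = {
    "site_1": (["A", "B", "C", "D"], ["A", "B", "C", "D"]),
    "site_2": (["A", "A", "D", "D"], ["D", "D", "A", "A"]),
}
toy_kappas = evaluate_sites(toy_sites)

# Summarizing per-site durations (in days) with a median, as the study does
# for its timing measurements; these numbers are made up.
toy_durations_days = [10, 30, 20]
toy_median_days = median(toy_durations_days)
```

Running the same function on a local test set and on a shared hold-out set gives directly comparable scores, which is the kind of comparison the Methods paragraph describes.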

Results: The median time to receive IRB approval was 66 days, and the median time to successfully install Connect and AI-LAB locally was 157 days. The median time to complete breast density algorithm testing and training was 216 days. The breast density algorithm performed worse at each hospital than on the hold-out test dataset, suggesting poor generalizability of the base model. The fine-tuned models had mixed performance locally and performed poorly on the test dataset.

Discussion: In this study, we demonstrate the successful installation and implementation of the Connect and AI-LAB software platforms at six facilities using a breast density algorithm. Our results suggest poor generalizability both of an algorithm trained on a single dataset and of algorithms fine-tuned at individual institutions, emphasizing the potential importance of multi-institutional testing and training.