Objective: To investigate whether integrating an artificial intelligence (AI) system into diagnostic breast ultrasound (US) improves diagnostic performance.
Materials and methods: Seventy suspicious breast masses (53 malignant and 17 benign) from 70 women were included. All women underwent diagnostic breast US complemented with shear wave elastography and US-guided core needle biopsy with histopathological verification. Two radiologists, one with 15 years of experience and the other with one year of experience, evaluated the images and assigned breast imaging-reporting and data system (BI-RADS) scores. The less-experienced radiologist then re-evaluated the images with the guidance of a commercial AI system, alone and in combination with the maximum elasticity from shear wave elastography. The BI-RADS scores were analyzed to determine diagnostic performance and malignancy detection.
Results: The experienced reader demonstrated the best performance, with an area under the curve (AUC) of 0.888 [95% confidence interval (CI): 0.793-0.983], indicating high diagnostic accuracy. In contrast, the Koios decision support (DS) system on its own achieved an AUC of 0.693 (95% CI: 0.562-0.824). The less-experienced reader achieved an AUC of 0.679 (95% CI: 0.534-0.823) when guided by both Koios and elasticity, and an AUC of 0.655 (95% CI: 0.512-0.799) when guided by Koios alone. Without any guidance, the less-experienced reader exhibited the lowest performance, with an AUC of 0.512 (95% CI: 0.352-0.672). The experienced reader had a sensitivity of 98.1%, specificity of 58.8%, positive predictive value of 88.1%, negative predictive value of 90.9%, and overall accuracy of 88.6%. The Koios DS showed a sensitivity of 92.5%, specificity of 35.3%, and accuracy of 78.6%. The less-experienced reader, when guided by both Koios and elasticity, achieved a sensitivity of 92.5%, specificity of 23.5%, and accuracy of 75.7%. When guided by Koios alone, the less-experienced reader had a sensitivity of 90.6%, specificity of 17.6%, and accuracy of 72.9%. Lastly, the less-experienced reader without any guidance showed a sensitivity of 84.9%, specificity of 17.6%, and accuracy of 68.6%.
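The reported percentages can be cross-checked against the cohort size with a short calculation. The sketch below is illustrative only: the abstract does not report confusion-matrix counts, so the counts are inferred by rounding the stated sensitivity and specificity against the 53 malignant and 17 benign lesions, and the function names (`counts_from_rates`, `metrics_from_counts`) are hypothetical helpers, not part of the study's analysis.

```python
# Assumption: counts are reconstructed from the reported percentages;
# they are not given in the source abstract.
N_MALIGNANT = 53  # histopathology-confirmed malignant lesions
N_BENIGN = 17     # histopathology-confirmed benign lesions


def counts_from_rates(sens_pct, spec_pct, n_pos=N_MALIGNANT, n_neg=N_BENIGN):
    """Infer (TP, FN, TN, FP) from sensitivity and specificity in percent."""
    tp = round(sens_pct / 100 * n_pos)
    tn = round(spec_pct / 100 * n_neg)
    return tp, n_pos - tp, tn, n_neg - tn


def metrics_from_counts(tp, fn, tn, fp):
    """Standard diagnostic metrics, returned as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
    }


# Reported (sensitivity %, specificity %) for each reading condition.
reported = {
    "experienced reader": (98.1, 58.8),
    "Koios DS": (92.5, 35.3),
    "less-experienced + Koios + elasticity": (92.5, 23.5),
    "less-experienced + Koios": (90.6, 17.6),
    "less-experienced, unaided": (84.9, 17.6),
}

if __name__ == "__main__":
    for name, (sens, spec) in reported.items():
        tp, fn, tn, fp = counts_from_rates(sens, spec)
        m = metrics_from_counts(tp, fn, tn, fp)
        print(f"{name}: TP={tp} FN={fn} TN={tn} FP={fp} "
              f"accuracy={m['accuracy']:.1f}%")
```

Under these assumptions, the inferred counts for the experienced reader (52 true positives, 1 false negative, 10 true negatives, 7 false positives) give a positive predictive value, negative predictive value, and accuracy that agree with the reported 88.1%, 90.9%, and 88.6%, and the accuracies for the other four conditions likewise match the reported figures after rounding.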
Conclusion: Diagnostic evaluation of suspicious masses on breast US images depends largely on reader experience, with the experienced reader performing well. AI-based guidance can help improve the performance of less-experienced readers, and adding the elasticity metric may improve it further. This type of guidance may reduce unnecessary biopsies by increasing the detection rate for malignant lesions and may deliver significant benefits for routine clinical practice in underserved areas where experienced readers may not be available.
Keywords: Breast cancer; artificial intelligence; breast ultrasound; elastography.
©Copyright 2025 by the Turkish Federation of Breast Diseases Societies / European Journal of Breast Health published by Galenos Publishing House.