Background: The classification of Breast Imaging Reporting and Data System (BI-RADS) category 4A lesions in mammography is complicated by subjective interpretation and unclear criteria, which can lead to misclassification and unnecessary biopsies. More accurate assessment methods are therefore needed. This study aimed to improve the prediction of positive lesions among BI-RADS 4A lesions in mammography by combining deep learning (DL) technology with relevant clinical factors.
Methods: A retrospective analysis was conducted of 590 patients diagnosed with BI-RADS 4A lesions at Shenzhen People's Hospital and Shenzhen Luohu People's Hospital. The patients were divided into training, internal validation, and external validation sets. The classification results from a DL system applied to mammography were recorded, and data on relevant clinical factors were collected. Univariate and multivariate logistic regression analyses were performed to identify the independent predictive factors, which were then integrated into a predictive model and nomogram. The area under the receiver operating characteristic curve (AUC), calibration curves, and a decision curve analysis (DCA) were used to evaluate the diagnostic performance, calibration, and clinical net benefit of the model, respectively. External validation was conducted to assess the generalization ability of the model.
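As an illustration of the AUC metric named above, the following is a minimal sketch, not the study's implementation. It uses the standard pairwise interpretation of the AUC: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy data are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: iterable of 0/1 outcomes (1 = positive lesion)
    scores: iterable of predicted probabilities or risk scores
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

This quadratic-time version is chosen for readability; rank-based implementations give the same value more efficiently on larger datasets.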
Results: Four independent predictive factors (i.e., age, nipple discharge, ultrasound BI-RADS assessment, and DL system classification results) were identified and included in the predictive model. The model showed good diagnostic performance, with AUCs of 0.85, 0.82, and 0.84 in the training, internal validation, and external validation sets, respectively. The AUC in the training set did not differ significantly from that in the internal validation set or the external validation set (P=0.543 and 0.842, respectively). The calibration curves showed good calibration in the training, internal validation, and external validation sets, with minimal deviation between the predicted and actual positive risk probabilities (P=0.906, 0.890, and 0.769, respectively). The DCA showed a clinical net benefit of the model at risk thresholds between 0.15 and 0.70 in both the internal and external validation sets.
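For readers unfamiliar with DCA, the following minimal sketch shows how net benefit is conventionally computed at a single risk threshold and compared with a biopsy-all strategy. It assumes the standard net benefit definition (true positives per patient minus false positives per patient weighted by the threshold odds); the abstract does not specify the software or exact procedure used. The function names and toy data are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of acting on predictions at a given risk threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds at the threshold
    return tp / n - weight * fp / n


def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening on every patient (e.g., biopsy all)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data for illustration only (not from the study).
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2, 0.1]
nb_model = net_benefit(toy_labels, toy_probs, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
```

A decision curve repeats this calculation across a range of thresholds; the model is considered useful wherever its net benefit exceeds both the treat-all value and zero (the treat-none strategy).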
Conclusions: Our predictive model, which incorporates age, nipple discharge, ultrasound BI-RADS assessment, and DL system classification results, accurately predicted positive lesions among BI-RADS 4A lesions. Its application may help radiologists improve diagnostic precision and reduce unnecessary biopsies in BI-RADS 4A cases.
Keywords: Breast cancer; artificial intelligence; deep learning (DL); mammography; ultrasound.
2024 AME Publishing Company. All rights reserved.