We developed a probabilistic model to support the classification decisions that radiologists make in mammography practice. Using radiologists' feature observations and Breast Imaging Reporting and Data System (BI-RADS) classifications from diagnostic and screening mammograms, we modeled their decisions to understand their judgments. The model could help radiologists improve their decisions using their own feature observations and classifications while maintaining their observed sensitivities. Based on 112,433 mammographic cases from 36,111 patients and 13 radiologists at 2 separate institutions, with a 1.1% prevalence of malignancy, we trained a Bayesian network (BN) to estimate the probability that a lesion is malignant. For each radiologist, we learned an observed probability threshold within the model. We compared each radiologist's sensitivity and specificity with those of the BN model using either that radiologist's observed threshold or the standard 2% threshold recommended by BI-RADS. We found significant variability among the radiologists' observed thresholds. With the observed thresholds, the BN model showed a 0.01% (1 case) increase in false negatives and a 28.9% (3612 cases) reduction in false positives. With the standard 2% BI-RADS-recommended threshold, it showed a 26.7% (47 cases) increase in false negatives and a 47.3% (5911 cases) reduction in false positives. These results show that screening mammography false positives can be significantly reduced with a minimal increase in false negatives. Learning radiologists' observed thresholds provides valuable information about the conservativeness of clinical practice and allows us to quantify the variability in sensitivity across and within institutions. Our model could support radiologists in improving their performance and consistency within mammography practice.
Keywords: classification decision; decision support; mammography; observed threshold.
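
The threshold comparison described in the abstract can be illustrated with a minimal sketch. It assumes that a case is called positive when its estimated malignancy probability is at or above the threshold, and that a radiologist's observed threshold is the largest threshold at which the model still reaches that radiologist's observed sensitivity. Both rules, the function names, and the toy data are assumptions made for illustration; they are not taken from the study and do not reproduce its results.

```python
from typing import Dict, Sequence


def confusion_at_threshold(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Dict[str, int]:
    """Count outcomes when cases with probability >= threshold are called positive."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, y in zip(probs, labels):
        called_positive = p >= threshold
        if called_positive:
            counts["tp" if y else "fp"] += 1
        else:
            counts["fn" if y else "tn"] += 1
    return counts


def sensitivity(counts: Dict[str, int]) -> float:
    """True positive rate; 0.0 when there are no malignant cases."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


def specificity(counts: Dict[str, int]) -> float:
    """True negative rate; 0.0 when there are no benign cases."""
    negatives = counts["tn"] + counts["fp"]
    return counts["tn"] / negatives if negatives else 0.0


def observed_threshold(
    probs: Sequence[float], labels: Sequence[int], target_sensitivity: float
) -> float:
    """Largest threshold whose sensitivity meets the target (assumed definition).

    Sensitivity cannot increase as the threshold rises, so scanning candidate
    thresholds from high to low and stopping at the first match is sufficient.
    """
    for t in sorted(set(probs), reverse=True):
        if sensitivity(confusion_at_threshold(probs, labels, t)) >= target_sensitivity:
            return t
    return 0.0


# Toy data (hypothetical): model probabilities and ground truth (1 = malignant).
probs = [0.001, 0.005, 0.015, 0.03, 0.08, 0.40, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1]

# Outcomes at the standard 2% BI-RADS-recommended threshold.
standard = confusion_at_threshold(probs, labels, 0.02)

# Outcomes at a threshold matched to a hypothetical observed sensitivity of 100%.
matched_t = observed_threshold(probs, labels, 1.0)
matched = confusion_at_threshold(probs, labels, matched_t)
```

On this toy data the sensitivity-matched threshold falls below 2%, so it misses no malignant case that the 2% threshold misses, which mirrors the trade-off between false negatives and false positives that the abstract reports at full scale.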