Purpose: To determine whether a Bayesian network trained on a large database of patient demographic risk factors and radiologist-observed findings from consecutive clinical mammography examinations can exceed radiologist performance in the classification of mammographic findings as benign or malignant.
Materials and methods: The institutional review board exempted this HIPAA-compliant retrospective study from the requirement for informed consent. Structured reports from 48 744 consecutive pooled screening and diagnostic mammography examinations in 18 269 patients from April 5, 1999, to February 9, 2004, were collected. Mammographic findings were matched with a state cancer registry, which served as the reference standard. By using 10-fold cross validation, the Bayesian network was trained and tested to estimate breast cancer risk by using demographic risk factors (age, family and personal history of breast cancer, and use of hormone replacement therapy) and mammographic findings recorded in the Breast Imaging Reporting and Data System lexicon. The performance of the radiologists was compared with that of the Bayesian network by using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity.
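The evaluation scheme described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the caller-supplied `fit` routine, and the decision threshold are hypothetical, and the Bayesian network itself is not modeled. The sketch only shows how each case can be scored by a model trained on the other nine folds, and how AUC, sensitivity, and specificity can then be computed from the pooled held-out scores.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle case indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_scores(X, y, fit, k=10, seed=0):
    """Score every case with a model fitted on the other k-1 folds.

    `fit` is a hypothetical caller-supplied function: it takes training
    features and labels and returns a function mapping one case to a score.
    """
    scores = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            scores[i] = predict(X[i])
    return scores


def auc(scores, labels):
    """AUC as the probability that a random malignant case (label 1)
    outscores a random benign case (label 0); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Call a finding malignant when its score is >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(s >= threshold and y == 1 for s, y in pairs)
    fn = sum(s < threshold and y == 1 for s, y in pairs)
    tn = sum(s < threshold and y == 0 for s, y in pairs)
    fp = sum(s >= threshold and y == 0 for s, y in pairs)
    return tp / (tp + fn), tn / (tn + fp)
```

In this sketch, sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes performance across all thresholds, which is why the three measures are reported together.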
Results: The Bayesian network significantly exceeded the performance of interpreting radiologists in terms of AUC (0.960 vs 0.939, P = .002), sensitivity (90.0% vs 85.3%, P < .001), and specificity (93.0% vs 88.1%, P < .001).
Conclusion: On the basis of prospectively collected variables, the evaluated Bayesian network can predict the probability of breast cancer and exceed the performance of interpreting radiologists. Bayesian networks may help radiologists improve mammographic interpretation.