A Probabilistic Model to Support Radiologists' Classification Decisions in Mammography Practice

Jiaming Zeng; Francisco Gimenez; Elizabeth S Burnside; Daniel L Rubin; Ross Shachter

doi:10.1177/0272989X19832914

Med Decis Making. 2019 Apr;39(3):208-216. doi: 10.1177/0272989X19832914. Epub 2019 Feb 28.

Authors

Jiaming Zeng¹, Francisco Gimenez², Elizabeth S Burnside³, Daniel L Rubin², Ross Shachter¹

Affiliations

¹ Stanford University School of Engineering, Stanford, CA, USA.
² Stanford University School of Medicine (Department of Biomedical Data Science, Radiology, and Medicine), CA, USA.
³ University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA.

Abstract

We developed a probabilistic model to support the classification decisions made by radiologists in mammography practice. Using the feature observations and Breast Imaging Reporting and Data System (BI-RADS) classifications from radiologists examining diagnostic and screening mammograms, we modeled their decisions to understand their judgments. Our model could help improve the decisions made by radiologists using their own feature observations and classifications while maintaining their observed sensitivities. Based on 112,433 mammographic cases from 36,111 patients and 13 radiologists at 2 separate institutions with a 1.1% prevalence of malignancy, we trained a probabilistic Bayesian network (BN) to estimate the malignancy probabilities of lesions. For each radiologist, we learned an observed probabilistic threshold within the model. We compared the sensitivity and specificity of each radiologist against the BN model using either their observed threshold or the standard 2% threshold recommended by BI-RADS. We found significant variability among the radiologists' observed thresholds. By applying the observed thresholds, the BN model showed a 0.01% (1 case) increase in false negatives and a 28.9% (3612 cases) reduction in false positives. When using the standard 2% BI-RADS-recommended threshold, there was a 26.7% (47 cases) increase in false negatives and a 47.3% (5911 cases) reduction in false positives. Our results show that we can significantly reduce screening mammography false positives with a minimal increase in false negatives. We find that learning radiologists' observed thresholds provides valuable information regarding the conservativeness of clinical practice and allows us to quantify the variability in sensitivity across and within institutions. Our model could provide support to radiologists to improve their performance and consistency within mammography practice.

Keywords: classification decision; decision support; mammography; observed threshold.
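
The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' Bayesian network: it assumes malignancy probabilities are already available from some model, and it assumes an "observed threshold" can be approximated as the highest probability cutoff that still reaches a target sensitivity. The function names and the toy probabilities and labels below are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(probs: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when lesions with prob >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        flagged = p >= threshold
        if flagged and y:
            tp += 1
        elif flagged and not y:
            fp += 1
        elif not flagged and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(probs: Sequence[float], labels: Sequence[int],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(probs, labels, threshold)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def highest_threshold_matching_sensitivity(probs: Sequence[float],
                                           labels: Sequence[int],
                                           target_sensitivity: float) -> float:
    """Assumed stand-in for an 'observed threshold': the largest cutoff
    (taken from the predicted probabilities) whose sensitivity is at
    least the target. Sensitivity cannot increase as the cutoff rises,
    so the first match when scanning downward is the largest one."""
    for t in sorted(set(probs), reverse=True):
        sens, _ = sensitivity_specificity(probs, labels, t)
        if sens >= target_sensitivity:
            return t
    return 0.0


# Toy data (made up): predicted malignancy probabilities and true labels.
toy_probs = [0.001, 0.004, 0.012, 0.018, 0.030, 0.150, 0.600]
toy_labels = [0, 0, 0, 1, 0, 1, 1]

BIRADS_THRESHOLD = 0.02  # the standard 2% threshold named in the abstract

sens_2pct, spec_2pct = sensitivity_specificity(toy_probs, toy_labels,
                                               BIRADS_THRESHOLD)
matched_threshold = highest_threshold_matching_sensitivity(
    toy_probs, toy_labels, target_sensitivity=1.0)
```

On this toy data, the 2% cutoff misses the malignant lesion scored at 0.018, whereas a cutoff chosen to preserve full sensitivity must drop to 0.018. This mirrors, in miniature, the trade-off the abstract reports between the standard threshold and the radiologists' observed thresholds.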

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Bayes Theorem
Breast Neoplasms / diagnosis
Breast Neoplasms / pathology
Clinical Competence / standards
Decision Making*
Early Detection of Cancer / standards
Humans
Mammography / classification*
Mammography / standards
Models, Statistical
Radiologists / psychology
Radiologists / standards*
Radiologists / statistics & numerical data
Sensitivity and Specificity
