Purpose: To assess whether subjective breast density categorization remains the most useful way to categorize mammographic breast density and whether variations exist across geographic regions with differing national legislation.
Methods: Breast radiologists from two countries (UK, USA) were voluntarily recruited to review sets of anonymized mammographic images (n = 180) and additional repeated images (n = 70), totaling 250 images, and to subjectively rate breast density according to the Breast Imaging Reporting and Data System (BI-RADS) categorization. Images were reviewed under standardized viewing conditions using Ziltron software. Inter-rater reliability was analyzed using the kappa statistic.
Results: The US radiologists (n = 25) judged fewer images as being "mostly fatty" than the UK radiologists (n = 24), leading to a greater number of images being classified in the higher BI-RADS categories, particularly BI-RADS 3. Overall agreement across all data sets was κ = 0.654, indicating substantial agreement between the two cohorts. When the data were split by BI-RADS category, the level of agreement varied from fair to substantial.
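As an illustration of the agreement statistic reported above, the following is a minimal sketch of Cohen's kappa for two sets of categorical ratings, together with the commonly used Landis and Koch descriptive bands (under which a value of 0.654 is "substantial"). It assumes a simple two-rater comparison; the abstract does not state which kappa variant was used for the multi-reader data, and the function names and ratings below are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of images given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Descriptive band for a kappa value (Landis and Koch convention)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical BI-RADS density ratings (1-4) for eight images; not study data.
reader_a = [1, 2, 3, 4, 1, 2, 3, 4]
reader_b = [1, 2, 3, 4, 2, 2, 3, 3]

kappa = cohen_kappa(reader_a, reader_b)
print(round(kappa, 3), landis_koch_label(kappa))
print(landis_koch_label(0.654))
```

For more than two readers, a multi-rater statistic such as Fleiss' kappa would typically be a better fit than the pairwise form sketched here.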
Conclusion: Variation in how radiologists from the USA and UK classify breast density was established, especially when the data were divided into breast density categories. This variation supports the need for a reliable breast density assessment method to enhance individualized supplemental screening pathways for dense breasts. The use of a two-scale categorization method demonstrated improved agreement.
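To show why a two-scale scheme can raise agreement, the sketch below collapses four-category BI-RADS density ratings into low and high and compares raw percent agreement before and after. The mapping (categories 1 and 2 to "low", 3 and 4 to "high") is an assumption based on common practice and is not specified in the abstract; the function names and ratings are hypothetical.

```python
def to_two_scale(birads_category):
    """Collapse a BI-RADS density category (1-4) to 'low' or 'high'.

    Assumed mapping: 1-2 -> 'low', 3-4 -> 'high'.
    """
    if birads_category not in (1, 2, 3, 4):
        raise ValueError("BI-RADS density category must be 1, 2, 3 or 4")
    return "low" if birads_category <= 2 else "high"


def percent_agreement(ratings_a, ratings_b):
    """Fraction of images on which two readers chose the same category."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(ratings_a, ratings_b)) / len(ratings_a)


# Hypothetical ratings for eight images; not study data.
reader_a = [1, 2, 3, 4, 1, 2, 3, 4]
reader_b = [1, 2, 3, 4, 2, 2, 3, 3]

four_scale = percent_agreement(reader_a, reader_b)
two_scale = percent_agreement(
    [to_two_scale(r) for r in reader_a],
    [to_two_scale(r) for r in reader_b],
)
print(four_scale, two_scale)
```

In this toy data the disagreements fall between adjacent categories on the same side of the low/high boundary, so they vanish after collapsing. Disagreements that straddle the 2/3 boundary would persist.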
Advances in knowledge: A larger sample of radiologists from different breast density jurisdictions confirms international subjective variability in density categorization and improved agreement with two-scale (low, high) categorization. Given this variability, a standardized and automated breast density assessment appears timely.
Keywords: BI-RADS; Breast density; intrarater variability; mammography.
Copyright © 2019. Published by Elsevier Inc.