Digital mammography dataset for breast cancer diagnosis research (DMID) with breast mass segmentation analysis

Parita Oza; Urvi Oza; Rajiv Oza; Paawan Sharma; Samir Patel; Pankaj Kumar; Bakul Gohel

doi:10.1007/s13534-023-00339-y

Biomed Eng Lett. 2023 Dec 21;14(2):317-330. doi: 10.1007/s13534-023-00339-y. eCollection 2024 Mar.

Authors

Parita Oza¹, Urvi Oza², Rajiv Oza³, Paawan Sharma⁴, Samir Patel⁴, Pankaj Kumar¹, Bakul Gohel²

Affiliations

¹ Nirma University, Ahmedabad, India.
² Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India.
³ Rad Imaging, X-Ray and Sonography Clinic, Ahmedabad, India.
⁴ Pandit Deendayal Energy University, Gandhinagar, India.

Abstract

Purpose: Over the last two decades, computer-aided detection and diagnosis (CAD) systems have been developed to help radiologists detect and diagnose lesions seen on breast imaging, serving as a second-opinion tool. Developing algorithms that identify and diagnose breast lesions relies heavily on mammographic datasets, yet many existing databases lack resources that research requires, such as mammographic masks, radiology reports, and breast composition information. This paper introduces and describes a new mammographic database.

Methods: The proposed dataset comprises mammograms with several lesion types, such as masses, calcifications, architectural distortions, and asymmetries. Each mammogram is accompanied by a radiologist report describing details of the breast, such as breast density, a description of any abnormality present, and the condition of the skin, nipple, and pectoral muscles.

Results: We present results of a commonly used segmentation framework trained on the proposed dataset. Using the abnormality class (benign or malignant) and the breast tissue density provided with each mammogram, we analyze how the segmentation model's performance varies with these parameters.

Conclusion: The presented dataset provides diverse mammogram images for developing and training models for breast cancer diagnosis applications.
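
The stratified analysis described under Results can be illustrated with a minimal sketch. It assumes the Dice coefficient as the segmentation metric, which the abstract does not name, and it uses hypothetical record fields (`pathology`, `density`, `pred`, `truth`) with toy masks that are not taken from DMID.

```python
from collections import defaultdict
from statistics import mean


def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [p for row in pred for p in row]
    truth_flat = [t for row in truth for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice_by_group(records, key):
    """Average Dice per value of `key` (e.g. 'pathology' or 'density')."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(dice_score(rec["pred"], rec["truth"]))
    return {group: mean(scores) for group, scores in groups.items()}


# Toy records: field names and category labels are illustrative assumptions.
records = [
    {"pathology": "benign", "density": "fatty",
     "pred": [[1, 1], [0, 0]], "truth": [[1, 1], [0, 0]]},
    {"pathology": "malignant", "density": "dense",
     "pred": [[1, 0], [0, 0]], "truth": [[1, 1], [0, 0]]},
    {"pathology": "malignant", "density": "fatty",
     "pred": [[0, 0], [0, 0]], "truth": [[0, 1], [0, 0]]},
]

by_pathology = mean_dice_by_group(records, "pathology")
by_density = mean_dice_by_group(records, "density")
```

Grouping by a metadata key keeps the same metric function reusable for both the benign/malignant split and the density split; other metrics could be substituted for `dice_score` without changing the grouping logic.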

Keywords: Breast mass segmentation; Dataset; Deep learning; Mammogram.

© Korean Society of Medical and Biological Engineering 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.