Digital mammography dataset for breast cancer diagnosis research (DMID) with breast mass segmentation analysis

Biomed Eng Lett. 2023 Dec 21;14(2):317-330. doi: 10.1007/s13534-023-00339-y. eCollection 2024 Mar.

Abstract

Purpose:In the last two decades, computer-aided detection and diagnosis (CAD) systems have been created to help radiologists discover and diagnose lesions observed on breast imaging tests. These systems can serve as a second opinion tool for the radiologist. However, developing algorithms for identifying and diagnosing breast lesions relies heavily on mammographic datasets. Many existing databases do not consider all the needs necessary for research and study, such as mammographic masks, radiology reports, breast composition, etc. This paper aims to introduce and describe a new mammographic database. Methods:The proposed dataset comprises mammograms with several lesions, such as masses, calcifications, architectural distortions, and asymmetries. In addition, a radiologist report is provided, describing the details of the breast, such as breast density, description of abnormality present, condition of the skin, nipple and pectoral muscles, etc., for each mammogram. Results:We present results of commonly used segmentation framework trained on our proposed dataset. We used information regarding the class of abnormalities (benign or malignant) and breast tissue density provided with each mammogram to analyze the segmentation model's performance concerning these parameters. Conclusion:The presented dataset provides diverse mammogram images to develop and train models for breast cancer diagnosis applications.

Keywords: Breast mass segmentation; Dataset; Deep learning; Mammogram.