Breast cancer (BC) is the most common malignancy in women worldwide. In the United States, women have a 12.5% lifetime risk of developing an invasive form of breast cancer. BC arises in the lining cells (epithelium) of the ducts or lobules in the glandular tissue of the breast. The goal of the present study was to use machine learning (ML) as a novel technology to assess and compare the invasive forms of BC, including infiltrating ductal carcinoma, infiltrating lobular carcinoma, and mucinous carcinoma. To achieve this goal, we applied ML algorithms to a dataset of 334 BC patients, available at https://www.kaggle.com/amandam1/breastcancerdataset, and interpreted this dataset based on the form of BC, age, sex, tumor stage, surgery type, and survival rate. Among the 334 patients, 70% were diagnosed with infiltrating ductal carcinoma, 27% with infiltrating lobular carcinoma, and 3% with mucinous carcinoma. Of the 334 BC patients, 64 (19.16%) were in stage I, 189 (56.59%) in stage II, and 81 (24.25%) in stage III. Sixty-six, 67, 96, and 105 patients underwent lumpectomy, simple mastectomy, modified radical mastectomy, and other types of surgery, respectively. The survival rates were 83.4% for stage I, 79.1% for stage II, and 77% for stage III. Findings from the present study demonstrate that ML provides an important tool for curating large amounts of BC data, as well as a scientific means to improve BC outcomes.
Keywords: breast cancer; infiltrating ductal carcinoma; infiltrating lobular carcinoma; machine learning; mucinous carcinoma; surgery treatment.
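
As a quick consistency check on the figures reported above, the stage distribution can be recomputed from the stated patient counts. The following is a minimal sketch, not the study's own pipeline: the counts are taken from the abstract, while the names `stage_counts`, `surgery_counts`, and `percentages` are hypothetical and introduced here for illustration only.

```python
# Patient counts per tumor stage, as reported in the abstract.
stage_counts = {"I": 64, "II": 189, "III": 81}

# Patient counts per surgery type, as reported in the abstract.
surgery_counts = {
    "lumpectomy": 66,
    "simple mastectomy": 67,
    "modified radical mastectomy": 96,
    "other": 105,
}


def percentages(counts):
    """Return each category's share of the total, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


stage_pct = percentages(stage_counts)
for stage, pct in stage_pct.items():
    print(f"Stage {stage}: {stage_counts[stage]} patients ({pct}%)")

# Both breakdowns should cover the same cohort of 334 patients.
print("Stage total:", sum(stage_counts.values()))
print("Surgery total:", sum(surgery_counts.values()))
```

Both totals come to 334, and the recomputed stage shares match the percentages stated in the abstract.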