Global parenchymal texture features based on histograms of oriented gradients improve cancer development risk estimation from healthy breasts

Comput Methods Programs Biomed. 2019 Aug:177:123-132. doi: 10.1016/j.cmpb.2019.05.022. Epub 2019 May 22.

Abstract

Background: The percentage of dense breast tissue on digital mammograms is one of the most commonly used markers for breast cancer risk estimation. Geometric features of the dense tissue within the breast, and texture structures captured by sliding windows that scan the mammograms, may improve predictive ability when combined with the dense tissue percentage.
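As a minimal illustration of the baseline marker described above, the dense tissue percentage can be read as the share of breast pixels that are labelled dense. The function name and the nested-list mask format below are assumptions made for this sketch; the abstract does not describe how DMScan represents its segmentation.

```python
def dense_tissue_percentage(breast_mask, dense_mask):
    """Percentage of breast pixels labelled as dense.

    Both masks are hypothetical 2D lists of 0/1 values with the same shape:
    breast_mask marks pixels inside the breast, dense_mask marks dense tissue.
    """
    breast_pixels = 0
    dense_pixels = 0
    for breast_row, dense_row in zip(breast_mask, dense_mask):
        for in_breast, is_dense in zip(breast_row, dense_row):
            if in_breast:
                breast_pixels += 1
                # Count dense pixels only when they fall inside the breast.
                if is_dense:
                    dense_pixels += 1
    if breast_pixels == 0:
        raise ValueError("breast mask is empty")
    return 100.0 * dense_pixels / breast_pixels
```

Dense pixels outside the breast mask are ignored, so a slightly oversized dense segmentation does not inflate the percentage.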

Methods: A case/control study nested within a screening program, covering 1563 women aged 45 to 70 from Comunitat Valenciana (Spain) with craniocaudal and mediolateral-oblique mammograms, was used to extract geometric and texture features. The sample comprised 755 controls and 808 cases; for cases, the contralateral breast mammograms from the closest screening visit before cancer diagnosis were used. The dense tissue segmentation was performed using DMScan and validated by two experienced radiologists. A model based on Random Forests was trained several times, varying the set of variables. A training dataset of 1172 patients was evaluated with a stratified 10-fold cross-validation scheme. The area under the Receiver Operating Characteristic curve (AUC) was the metric for predictive ability. The results were assessed only on the output of the model applied to the test set, which was composed of the remaining 391 patients.
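The evaluation scheme described above relies on two generic building blocks: stratified fold assignment and the AUC. The following is a minimal standard-library sketch of both, not the authors' pipeline; the function names, the round-robin fold assignment, and the seed are assumptions for illustration, and the Random Forest itself is left out.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin, continuing where the last class stopped,
        # so that fold sizes stay balanced overall.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def auc(labels, scores):
    """AUC as the probability that a random case (1) scores above a random
    control (0); ties count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes are required")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

In a cross-validation loop, each fold would serve once as the validation set while the model is fitted on the other nine, and `auc` would be computed on the held-out predictions. The pairwise AUC is quadratic in sample size, which is acceptable at the scale of this study but not for very large cohorts.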

Results: The AUC score obtained with the dense tissue percentage (0.55) was compared with the results of a machine learning-based classifier. In addition to the percentage of dense tissue in both views, the classifier first included global geometric features such as the distance from the dense tissue to the pectoral muscle, the dense tissue eccentricity, and the dense tissue perimeter, obtaining an accuracy of 0.56. Including a global feature based on local histograms of oriented gradients significantly improved the accuracy of the classifier (0.61). The number of correctly classified patients increased from 208 to 236.
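To make the idea of a global feature built from local histograms of oriented gradients concrete, here is a minimal sketch. It is not the authors' implementation: the window size, stride, 9 unsigned orientation bins, per-window normalization, and averaging across windows are all assumptions, since the abstract does not specify these details.

```python
import math


def window_hog(image, top, left, size, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for one square window of a 2D list of intensities."""
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    for r in range(top, top + size):
        for c in range(left, left + size):
            # Central differences, clamped at the image border.
            gx = image[r][min(c + 1, width - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, height - 1)][c] - image[max(r - 1, 0)][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = min(int(angle / (180.0 / n_bins)), n_bins - 1)
            hist[bin_index] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


def global_hog(image, size=8, stride=8, n_bins=9):
    """Average the local histograms of all sliding windows into one
    fixed-length descriptor for the whole image."""
    height, width = len(image), len(image[0])
    hists = [
        window_hog(image, r, c, size, n_bins)
        for r in range(0, height - size + 1, stride)
        for c in range(0, width - size + 1, stride)
    ]
    if not hists:
        raise ValueError("image is smaller than the window")
    return [sum(h[b] for h in hists) / len(hists) for b in range(n_bins)]
```

The resulting vector has a fixed length regardless of image size, so it could be appended to the density and geometric variables as classifier input. A real pipeline would restrict the windows to the breast region, which this sketch does not attempt.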

Conclusion: Relative geometric features of dense tissue within the breast, together with histograms of standardized local texture features based on sliding windows scanning the whole breast, improve risk prediction beyond the dense tissue percentage adjusted by geometric variables. Other classifiers could improve on the results obtained with the conventional Random Forests used in this study.

Keywords: Breast cancer; Breast density; Cancer development risk; Texture features.

MeSH terms

  • Aged
  • Algorithms
  • Area Under Curve
  • Breast / diagnostic imaging*
  • Breast Density
  • Breast Neoplasms / diagnostic imaging*
  • Case-Control Studies
  • False Positive Reactions
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Machine Learning
  • Mammography*
  • Middle Aged
  • Parenchymal Tissue / diagnostic imaging
  • ROC Curve
  • Risk
  • Risk Assessment / methods*
  • Spain