Focused active learning for histopathological image classification

Med Image Anal. 2024 Jul:95:103162. doi: 10.1016/j.media.2024.103162. Epub 2024 Apr 4.

Abstract

Active Learning (AL) has the potential to solve a major problem of digital pathology: the efficient acquisition of labeled data for machine learning algorithms. However, existing AL methods often struggle in realistic settings with artifacts, ambiguities, and class imbalances, as commonly seen in the medical field. The lack of precise uncertainty estimates leads to the acquisition of images with low informative value. To address these challenges, we propose Focused Active Learning (FocAL), which combines a Bayesian Neural Network with Out-of-Distribution (OoD) detection to estimate different uncertainties for the acquisition function. Specifically, the weighted epistemic uncertainty accounts for class imbalance, the aleatoric uncertainty for ambiguous images, and an OoD score for artifacts. We perform extensive experiments to validate our method on MNIST and the real-world Panda dataset for the classification of prostate cancer. The results confirm that other AL methods are 'distracted' by ambiguities and artifacts, which harms their performance. FocAL effectively focuses on the most informative images, avoiding ambiguities and artifacts during acquisition. In both experiments, FocAL outperforms existing AL approaches, reaching a Cohen's kappa of 0.764 with only 0.69% of the labeled Panda data.
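To make the idea of combining the three signals concrete, the following is a minimal sketch and not the formulation from the paper, which the abstract does not give. It assumes the Bayesian network is approximated by T stochastic forward passes, that aleatoric and epistemic uncertainty come from the standard entropy decomposition, that class weighting uses the expected class weight under the mean prediction, and that the terms are combined linearly. The names `focused_score`, `class_weights`, `lam_aleatoric`, and `lam_ood` are hypothetical.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along the class axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def uncertainty_decomposition(mc_probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    mc_probs: array of shape (T, N, C) with softmax outputs from
    T stochastic forward passes over N images and C classes.
    """
    mean_probs = mc_probs.mean(axis=0)          # (N, C)
    total = entropy(mean_probs)                 # entropy of the mean prediction
    aleatoric = entropy(mc_probs).mean(axis=0)  # mean entropy per pass
    epistemic = total - aleatoric               # mutual information
    return mean_probs, aleatoric, epistemic

def focused_score(mc_probs, class_weights, ood_score,
                  lam_aleatoric=1.0, lam_ood=1.0):
    """Hypothetical acquisition score (an assumption, not the paper's formula).

    Rewards class-weighted epistemic uncertainty and penalizes
    ambiguous (aleatoric) and out-of-distribution images.
    """
    mean_probs, aleatoric, epistemic = uncertainty_decomposition(mc_probs)
    weight = mean_probs @ class_weights         # expected class weight per image
    return weight * epistemic - lam_aleatoric * aleatoric - lam_ood * ood_score

def select_batch(scores, k):
    """Indices of the k highest-scoring images."""
    return np.argsort(-scores)[:k]
```

Under these assumptions, an image on which the forward passes confidently disagree scores high, while an image that every pass finds ambiguous, or one flagged by the OoD detector, is pushed down the ranking.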

Keywords: Active learning; Bayesian deep learning; Cancer classification; Histopathological images.

MeSH terms

  • Algorithms
  • Artifacts
  • Bayes Theorem
  • Humans
  • Image Interpretation, Computer-Assisted / methods
  • Machine Learning
  • Male
  • Neural Networks, Computer
  • Prostatic Neoplasms* / diagnostic imaging
  • Prostatic Neoplasms* / pathology