How Data Infrastructure Deals with Bias Problems in Medical Imaging

Stud Health Technol Inform. 2024 Aug 22;316:726-730. doi: 10.3233/SHTI240517.

Abstract

This paper discusses biases in medical imaging analysis, focusing on the challenges posed by the development of machine learning algorithms and generative models. It introduces a taxonomy of bias problems and addresses them through a data infrastructure initiative: PADME (Platform for Analytics and Distributed Machine-Learning for Enterprises), which is part of the National Research Data Infrastructure for Personal Health Data (NFDI4Health) project. PADME facilitates the structuring and sharing of health data while ensuring privacy and adherence to the FAIR principles. Experimental results show that generative methods can be effective for data augmentation. In compliance with the PADME infrastructure, the work proposes a solution framework that deals with bias across the different data stations and preserves privacy when images are transferred. It highlights the importance of standardized data infrastructure in mitigating biases and promoting FAIR, reusable, and privacy-preserving research environments in healthcare.

Keywords: Bias; Data Infrastructure; Differential Privacy; Federated Learning; Machine Learning; Medical Imaging.

MeSH terms

  • Algorithms
  • Bias
  • Computer Security
  • Confidentiality
  • Diagnostic Imaging*
  • Humans
  • Machine Learning*