How Data Infrastructure Deals with Bias Problems in Medical Imaging

Feifei Li; Ekaterina Kutafina; Mirjam Schoneck; Liliana Lourenco Caldeira; Oya Beyan

doi:10.3233/SHTI240517

Stud Health Technol Inform. 2024 Aug 22:316:726-730. doi: 10.3233/SHTI240517.

Authors

Feifei Li¹, Ekaterina Kutafina¹, Mirjam Schoneck², Liliana Lourenco Caldeira², Oya Beyan¹,³

Affiliations

¹ Institute for Biomedical Informatics, University of Cologne, Faculty of Medicine and University Hospital Cologne, Germany.
² Institute for Diagnostic and Interventional Radiology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Germany.
³ Fraunhofer Institute for Applied Information Technology, FIT, Germany.

PMID: 39176898
DOI: 10.3233/SHTI240517

Abstract

The paper discusses biases in medical imaging analysis, focusing on the challenges posed by the development of machine learning algorithms and generative models. It introduces a taxonomy of bias problems and addresses them through a data infrastructure initiative: PADME (Platform for Analytics and Distributed Machine-Learning for Enterprises), which is part of the National Research Data Infrastructure for Personal Health Data (NFDI4Health) project. PADME facilitates the structuring and sharing of health data while ensuring privacy and adherence to the FAIR principles. The paper presents experimental results showing that generative methods can be effective for data augmentation. Building on the PADME infrastructure, this work proposes a solution framework to deal with bias across the different data stations and to preserve privacy when transferring images. It highlights the importance of standardized data infrastructure in mitigating biases and promoting FAIR, reusable, and privacy-preserving research environments in healthcare.

Keywords: Bias; Data Infrastructure; Differential Privacy; Federated Learning; Machine Learning; Medical Imaging.

MeSH terms

Algorithms
Bias
Computer Security
Confidentiality
Diagnostic Imaging*
Humans
Machine Learning*