Overcoming data scarcity in biomedical imaging with a foundational multi-task model

Raphael Schäfer; Till Nicke; Henning Höfener; Annkristin Lange; Dorit Merhof; Friedrich Feuerhake; Volkmar Schulz; Johannes Lotz; Fabian Kiessling

doi:10.1038/s43588-024-00662-z

Nat Comput Sci. 2024 Jul;4(7):495-509. doi: 10.1038/s43588-024-00662-z. Epub 2024 Jul 19.

Authors

Raphael Schäfer¹, Till Nicke¹, Henning Höfener¹, Annkristin Lange¹, Dorit Merhof¹ ², Friedrich Feuerhake³ ⁴, Volkmar Schulz¹ ⁵, Johannes Lotz^#⁶, Fabian Kiessling^#⁷ ⁸

Affiliations

¹ Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany.
² Institute of Image Analysis and Computer Vision, Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany.
³ Institute for Pathology, Hannover Medical School, Hanover, Germany.
⁴ Institute for Neuropathology, Medical Center, University of Freiburg, Freiburg, Germany.
⁵ Institute for Experimental Molecular Imaging, RWTH Aachen University, Aachen, Germany.
⁶ Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany.
⁷ Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany.
⁸ Institute for Experimental Molecular Imaging, RWTH Aachen University, Aachen, Germany.

^# Contributed equally.

Abstract

Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more specialized datasets common in biomedical imaging. Here we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a universal biomedical pretrained model (UMedPT) on a multi-task database including tomographic, microscopic and X-ray images, with various labeling strategies such as classification, segmentation and object detection. The UMedPT foundational model outperformed ImageNet pretraining and previous state-of-the-art models. For classification tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For out-of-domain tasks it required only 50% of the original training data. In an external independent validation, imaging features extracted using UMedPT proved to set a new standard for cross-center transferability.
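
The abstract does not spell out how the number of training tasks is decoupled from memory requirements. One common way to get that property, shown here only as an illustrative sketch, is to visit tasks one at a time and sum their gradients for the shared weights into a fixed-size buffer before a single update, so that only one task's intermediate values are held at any moment. The toy linear "encoder", the scalar task heads, the task names and every function name below are assumptions made for this example and are not taken from the paper.

```python
# Toy sketch of per-task gradient accumulation (illustrative only).
# Assumed model: shared linear encoder z = w . x, one scalar head per task,
# prediction y_hat = head * z, squared-error loss.

def task_gradients(w, head, batch):
    """Return (loss, grad_w, grad_head) for a single task's batch."""
    grad_w = [0.0] * len(w)
    grad_head = 0.0
    loss = 0.0
    for x, y in batch:
        z = sum(wi * xi for wi, xi in zip(w, x))  # shared encoder output
        err = head * z - y                        # task-specific head error
        loss += 0.5 * err * err
        grad_head += err * z
        for i, xi in enumerate(x):
            grad_w[i] += err * head * xi
    return loss, grad_w, grad_head


def accumulated_step(w, heads, batches, lr=0.1):
    """One update that processes tasks sequentially.

    The accumulator has the size of the shared weights, so its memory
    footprint does not grow with the number of tasks.
    """
    acc = [0.0] * len(w)
    new_heads = {}
    total_loss = 0.0
    for name, batch in batches.items():
        loss, grad_w, grad_head = task_gradients(w, heads[name], batch)
        total_loss += loss
        acc = [a + g for a, g in zip(acc, grad_w)]      # sum shared gradients
        new_heads[name] = heads[name] - lr * grad_head  # head updated per task
    new_w = [wi - lr * a for wi, a in zip(w, acc)]      # single shared update
    return new_w, new_heads, total_loss


# Two hypothetical tasks sharing the same encoder weights.
batches = {
    "classification": [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)],
    "segmentation": [([1.0, 1.0], 2.0)],
}
w = [0.0, 0.0]
heads = {"classification": 1.0, "segmentation": 1.0}

w, heads, loss_before = accumulated_step(w, heads, batches)
_, _, loss_after = accumulated_step(w, heads, batches)
print(loss_before, loss_after)
```

In this toy run the summed loss drops between the first and second step. Adding a further task to `batches` lengthens the loop but leaves the size of the accumulator unchanged.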

MeSH terms

Algorithms
Databases, Factual
Diagnostic Imaging / methods
Humans
Image Processing, Computer-Assisted* / methods
Machine Learning