Experts fail to reliably detect AI-generated histological data

Jan Hartung; Stefanie Reuter; Vera Anna Kulow; Michael Fähling; Cord Spreckelsen; Ralf Mrowka

doi:10.1038/s41598-024-73913-8

Sci Rep. 2024 Nov 19;14(1):28677. doi: 10.1038/s41598-024-73913-8.

Authors

Jan Hartung¹ ² ³ ⁴, Stefanie Reuter⁵, Vera Anna Kulow⁶, Michael Fähling⁶, Cord Spreckelsen⁷ ⁸, Ralf Mrowka⁹ ¹⁰

Affiliations

¹ Institute for Physiology, Faculty of Medicine, University of Freiburg, 79108, Freiburg, Germany.
² BrainLinks-BrainTools, IMBIT (Institute for Machine-Brain Interfacing Technology), University of Freiburg, Georges-Köhler-Allee 201, 79110, Freiburg, Germany.
³ Department of Internal Medicine III, Experimental Nephrology, Jena University Hospital, Nonnenplan 4, 07745, Jena, Germany.
⁴ Section of Translational Neuroimmunology, Department of Neurology, Jena University Hospital, 07747, Jena, Germany.
⁵ ThIMEDOP, Jena University Hospital, Nonnenplan 4, 07745, Jena, Germany.
⁶ Charité - Universitätsmedizin Berlin, Corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institut für Translationale Physiologie (CCM), Charitéplatz 1, 10117, Berlin, Germany.
⁷ Institute of Medical Statistics, Computer and Data Sciences, Jena University Hospital, Bachstraße 18, 07743, Jena, Germany.
⁸ SMITH Consortium of the German Medical Informatics Initiative, Jena, Germany.
⁹ Department of Internal Medicine III, Experimental Nephrology, Jena University Hospital, Nonnenplan 4, 07745, Jena, Germany.
¹⁰ ThIMEDOP, Jena University Hospital, Nonnenplan 4, 07745, Jena, Germany.

Abstract

AI-based methods to generate images have seen unprecedented advances in recent years, challenging both image-forensic and human perceptual capabilities. Accordingly, these methods are expected to play an increasingly important role in the fraudulent fabrication of data. This includes images with complicated intrinsic structures such as histological tissue samples, which are harder to forge manually. Here, we use stable diffusion, one of the most recent generative algorithms, to create a set of such artificial histological samples. In a large study with over 800 participants, we examine the ability of human subjects to discriminate between these artificial and genuine histological images. Although experts perform better than naive participants, we find that even they fail to reliably identify fabricated data. While participant performance depends on the amount of training data used, even low quantities are sufficient to create convincing images, necessitating methods and policies to detect fabricated data in scientific publications.

Keywords: Artificial intelligence; Fraud; Histology; Misconduct; Publishing; Stable diffusion.

MeSH terms

Algorithms*
Artificial Intelligence*
Humans
Image Processing, Computer-Assisted* / methods