Generative Pre-trained Transformer 4 analysis of cardiovascular magnetic resonance reports in suspected myocarditis: A multicenter study

Kenan Kaya; Carsten Gietzen; Robert Hahnfeldt; Maher Zoubi; Tilman Emrich; Moritz C Halfmann; Malte Maria Sieren; Yannic Elser; Patrick Krumm; Jan M Brendel; Konstantin Nikolaou; Nina Haag; Jan Borggrefe; Ricarda von Krüchten; Katharina Müller-Peltzer; Constantin Ehrengut; Timm Denecke; Andreas Hagendorff; Lukas Goertz; Roman J Gertz; Alexander Christian Bunck; David Maintz; Thorsten Persigehl; Simon Lennartz; Julian A Luetkens; Astha Jaiswal; Andra Iza Iuga; Lenhard Pennig; Jonathan Kottlors

doi:10.1016/j.jocmr.2024.101068

J Cardiovasc Magn Reson. 2024;26(2):101068. doi: 10.1016/j.jocmr.2024.101068. Epub 2024 Jul 28.

Authors

Kenan Kaya¹, Carsten Gietzen², Robert Hahnfeldt², Maher Zoubi³, Tilman Emrich⁴, Moritz C Halfmann⁵, Malte Maria Sieren⁶, Yannic Elser⁷, Patrick Krumm⁸, Jan M Brendel⁸, Konstantin Nikolaou⁸, Nina Haag⁹, Jan Borggrefe⁹, Ricarda von Krüchten¹⁰, Katharina Müller-Peltzer¹⁰, Constantin Ehrengut¹¹, Timm Denecke¹¹, Andreas Hagendorff¹², Lukas Goertz², Roman J Gertz², Alexander Christian Bunck², David Maintz², Thorsten Persigehl², Simon Lennartz², Julian A Luetkens³, Astha Jaiswal², Andra Iza Iuga², Lenhard Pennig², Jonathan Kottlors²

Affiliations

¹ Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
² Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
³ Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Bonn, University of Bonn, Bonn, Germany.
⁴ Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes-Gutenberg-University, Mainz, Germany; Division of Cardiovascular Imaging, Department of Radiology and Radiological Science, Medical University of South Carolina, Charleston, South Carolina, USA; German Centre for Cardiovascular Research, Partner Site Rhine-Main, Mainz, Germany.
⁵ Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes-Gutenberg-University, Mainz, Germany.
⁶ Department of Radiology and Nuclear Medicine, UKSH, Campus Lübeck, Lübeck, Germany; Institute of Interventional Radiology, UKSH, Campus Lübeck, Lübeck, Germany.
⁷ Department of Radiology and Nuclear Medicine, UKSH, Campus Lübeck, Lübeck, Germany.
⁸ Department of Radiology, Diagnostic and Interventional Radiology, University of Tübingen, Tübingen, Germany.
⁹ Institute for Radiology, Neuroradiology and Nuclear Medicine, Johannes Wesling University Hospital/Mühlenkreiskliniken, Bochum/Minden, Germany.
¹⁰ Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
¹¹ Department of Diagnostic and Interventional Radiology, University of Leipzig, Leipzig, Germany.
¹² Department of Cardiology, University of Leipzig, Leipzig, Germany.

Abstract

Background: Diagnosing myocarditis relies on multimodal data, including cardiovascular magnetic resonance (CMR), clinical symptoms, and blood values. The correct interpretation and integration of CMR findings require radiological expertise and knowledge. We aimed to investigate the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model, for report-based medical decision-making in the context of cardiac MRI for suspected myocarditis.

Methods: This retrospective study included CMR reports from 396 patients with suspected myocarditis at eight centers. CMR reports and patient data, including blood values, age, and further clinical information, were provided to GPT-4 and to three radiologists with 1 year (resident 1), 2 years (resident 2), and 4 years (resident 3) of experience in CMR and knowledge of the 2018 Lake Louise Criteria. The final impression of each report, which states the radiological assessment of whether myocarditis is present, was withheld. The performance of GPT-4 and of the human readers was compared to a consensus reading by two board-certified radiologists with 8 and 10 years of experience in CMR. Sensitivity, specificity, and accuracy were calculated.
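For readers unfamiliar with the three metrics named in the Methods, the following is a minimal sketch of how sensitivity, specificity, and accuracy can be computed for one reader against a reference standard such as a consensus reading. It is not the authors' analysis code. The function name `diagnostic_metrics`, the `DiagnosticMetrics` container, and the ten toy labels are hypothetical and chosen only for illustration; they are not study data.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    accuracy: float     # (TP + TN) / all cases


def diagnostic_metrics(reference, predicted):
    """Compare one reader's binary calls with a reference standard.

    `reference` and `predicted` are equal-length sequences of booleans,
    where True means "myocarditis present". Hypothetical helper, not
    taken from the study.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")

    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


# Toy example only (NOT study data): 4 positive and 6 negative reference cases.
reference = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

metrics = diagnostic_metrics(reference, predicted)
print(f"sensitivity={metrics.sensitivity:.2f}, "
      f"specificity={metrics.specificity:.2f}, "
      f"accuracy={metrics.accuracy:.2f}")
```

In this toy case the reader misses one of four positive cases and overcalls one of six negative cases, which gives a sensitivity of 0.75, a specificity of about 0.83, and an accuracy of 0.80.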

Results: GPT-4 yielded an accuracy of 83%, sensitivity of 90%, and specificity of 78%, which was comparable to the physician with 1 year of experience (R1: 86%, 90%, 84%, p = 0.14) and lower than that of more experienced physicians (R2: 89%, 86%, 91%, p = 0.007 and R3: 91%, 85%, 96%, p < 0.001). GPT-4 and human readers showed higher diagnostic performance when results from T1- and T2-mapping sequences were part of the reports; this difference was statistically significant for residents 1 and 3 (p = 0.004 and p = 0.02, respectively).

Conclusion: GPT-4 yielded good accuracy for diagnosing myocarditis based on CMR reports in a large dataset from multiple centers and therefore holds the potential to serve as a diagnostic decision-supporting tool in this capacity, particularly for less experienced physicians. Further studies are required to explore the full potential and elucidate educational aspects of the integration of large language models in medical decision-making.

Keywords: Artificial intelligence; Cardiovascular magnetic resonance; Generative Pre-trained Transformer 4; Large language models; Myocarditis.

Publication types

Multicenter Study

MeSH terms

Adult
Aged
Clinical Decision-Making
Decision Support Techniques
Europe
Female
Humans
Image Interpretation, Computer-Assisted
Magnetic Resonance Imaging
Magnetic Resonance Imaging, Cine
Male
Middle Aged
Myocarditis* / diagnostic imaging
Observer Variation
Predictive Value of Tests*
Reproducibility of Results
Retrospective Studies
Young Adult