Comparative study of Claude 3.5-Sonnet and human physicians in generating discharge summaries for patients with renal insufficiency: assessment of efficiency, accuracy, and quality

Haijiao Jin; Jinglu Guo; Qisheng Lin; Shaun Wu; Weiguo Hu; Xiaoyang Li

doi:10.3389/fdgth.2024.1456911

Front Digit Health. 2024 Dec 5;6:1456911. doi: 10.3389/fdgth.2024.1456911. eCollection 2024.

Authors

Haijiao Jin^#¹ ² ³ ⁴ ⁵, Jinglu Guo^#², Qisheng Lin¹ ³ ⁴ ⁵, Shaun Wu⁶, Weiguo Hu⁷, Xiaoyang Li⁷

Affiliations

¹ Department of Nephrology, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
² Department of Nephrology, Ningbo Hangzhou Bay Hospital, Zhejiang, China.
³ Molecular Cell Lab for Kidney Disease, Shanghai, China.
⁴ Shanghai Peritoneal Dialysis Research Center, Shanghai, China.
⁵ Uremia Diagnosis and Treatment Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
⁶ WORK Medical Technology Group LTD., Hangzhou, China.
⁷ Department of Medical Education, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.

^# Contributed equally.

Abstract

Background: Artificial intelligence (AI) has developed rapidly and shows great potential for generating medical documents. This study aims to evaluate the performance of Claude 3.5-Sonnet, an advanced AI model, in generating discharge summaries for patients with renal insufficiency, compared with that of human physicians.

Methods: A prospective, comparative study was conducted involving 100 patients (50 with acute kidney injury [AKI] and 50 with chronic kidney disease [CKD]) from the nephrology department of Ningbo Hangzhou Bay Hospital between January and June 2024. Discharge summaries were independently generated by Claude 3.5-Sonnet and by human physicians. The main evaluation indicators were accuracy, generation time, and overall quality.

Results: Claude 3.5-Sonnet demonstrated accuracy comparable to that of human physicians in generating discharge summaries for both AKI patients (90 vs. 92 points, p > 0.05) and CKD patients (88 vs. 90 points, p > 0.05). The AI model was significantly more efficient than human physicians, requiring only about 30 s to generate a summary compared with over 15 min for physicians (p < 0.001). Overall quality scores showed no significant difference between AI-generated and physician-written summaries for either AKI patients (26 vs. 27 points, p > 0.05) or CKD patients (25 vs. 26 points, p > 0.05).

Conclusion: Claude 3.5-Sonnet demonstrates high efficiency and reliability in generating discharge summaries for patients with renal insufficiency, with accuracy and quality comparable to those of human physicians. These findings suggest that AI has significant potential to improve the efficiency of medical documentation, though further research is needed to optimize its integration into clinical practice and address ethical and privacy concerns.

Keywords: artificial intelligence; Claude 3.5-Sonnet; discharge summaries; medical documentation; renal insufficiency.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article, from the following sources: Shanghai Jiao Tong University School of Medicine Postgraduate Medical Education Program (BYH20230315, BYH20230316); Institute of Molecular Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai Key Laboratory of Nucleic Acid Chemistry and Nanomedicine, “Clinical+” Excellence Project (2024ZY004).