Deep learning-based triage and analysis of lesion burden for COVID-19: a retrospective study with external validation

Lancet Digit Health. 2020 Oct;2(10):e506-e515. doi: 10.1016/S2589-7500(20)30199-0. Epub 2020 Sep 22.

Abstract

Background: Prompt identification of patients suspected to have COVID-19 is crucial for disease control. We aimed to develop a deep learning algorithm on the basis of chest CT for rapid triaging in fever clinics.

Methods: We trained a U-Net-based model on unenhanced chest CT scans obtained from 2447 patients admitted to Tongji Hospital (Wuhan, China) between Feb 1, 2020, and March 3, 2020 (1647 patients with RT-PCR-confirmed COVID-19 and 800 patients without COVID-19) to segment lung opacities and alert cases with COVID-19 imaging manifestations. The ability of artificial intelligence (AI) to triage patients suspected to have COVID-19 was assessed in a large external validation set, which included 2120 retrospectively collected consecutive cases from three fever clinics inside and outside the epidemic centre of Wuhan (Tianyou Hospital [Wuhan, China; area of high COVID-19 prevalence], Xianning Central Hospital [Xianning, China; area of medium COVID-19 prevalence], and The Second Xiangya Hospital [Changsha, China; area of low COVID-19 prevalence]) between Jan 22, 2020, and Feb 14, 2020. To validate the sensitivity of the algorithm in a larger sample of patients with COVID-19, we also included 761 chest CT scans from 722 patients with RT-PCR-confirmed COVID-19 treated in a makeshift hospital (Guanggu Fangcang Hospital, Wuhan, China) between Feb 21, 2020, and March 6, 2020. Additionally, the accuracy of AI was compared with a radiologist panel for the identification of lesion burden increase on pairs of CT scans obtained from 100 patients with COVID-19.
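As a rough illustration of how a segmentation output could be turned into a lesion burden figure and compared across paired scans, the sketch below assumes that burden is the fraction of lung voxels labelled as opacity. The abstract does not give the study's exact definition, so the function names (`lesion_burden`, `burden_increased`), the `min_delta` threshold, and the toy masks are all hypothetical.

```python
from typing import Sequence


def lesion_burden(lung_mask: Sequence[int], opacity_mask: Sequence[int]) -> float:
    """Fraction of lung voxels labelled as opacity.

    Both masks are flattened binary sequences of equal length
    (assumed layout; not taken from the study).
    """
    if len(lung_mask) != len(opacity_mask):
        raise ValueError("masks must have the same number of voxels")
    lung_voxels = sum(1 for lung in lung_mask if lung)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")
    # Count only opacity voxels that fall inside the lung mask.
    lesion_voxels = sum(
        1 for lung, opacity in zip(lung_mask, opacity_mask) if lung and opacity
    )
    return lesion_voxels / lung_voxels


def burden_increased(baseline: float, follow_up: float, min_delta: float = 0.0) -> bool:
    """True when burden rose by more than min_delta (hypothetical threshold)."""
    return follow_up - baseline > min_delta


# Toy data, not from the study: 8 voxels, 6 of them inside the lung.
lung = [1, 1, 1, 1, 0, 0, 1, 1]
opacity_baseline = [0, 1, 0, 0, 0, 1, 0, 0]   # one opacity voxel lies outside the lung
opacity_follow_up = [1, 1, 0, 0, 0, 0, 1, 0]

baseline = lesion_burden(lung, opacity_baseline)
follow_up = lesion_burden(lung, opacity_follow_up)
increased = burden_increased(baseline, follow_up)
```

Restricting the count to voxels inside the lung mask keeps stray segmentation output outside the lungs from inflating the ratio.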

Findings: In the external validation set, using radiological reports as the reference standard, AI-aided triage achieved an area under the curve of 0·953 (95% CI 0·949-0·959), with a sensitivity of 0·923 (95% CI 0·914-0·932), a specificity of 0·851 (0·842-0·860), a positive predictive value of 0·790 (0·777-0·803), and a negative predictive value of 0·948 (0·941-0·954). AI took a median of 0·55 min (IQR 0·43-0·63) to flag a positive case, whereas radiologists took a median of 16·21 min (11·67-25·71) to draft a report and 23·06 min (15·67-39·20) to release a report. With regard to the identification of increases in lesion burden, AI achieved a sensitivity of 0·962 (95% CI 0·947-1·000) and a specificity of 0·875 (95% CI 0·833-0·923). The agreement between AI and the radiologist panel was high (Cohen's kappa coefficient 0·839, 95% CI 0·718-0·940).
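For readers less familiar with these measures, the following minimal sketch shows how sensitivity, specificity, predictive values, and Cohen's kappa are derived from a 2×2 table of AI output against a reference standard. The counts are invented for illustration and are not the study's data; the function names are hypothetical.

```python
def triage_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged among reference-positive cases
        "specificity": tn / (tn + fp),  # cleared among reference-negative cases
        "ppv": tp / (tp + fp),          # reference-positive among flagged cases
        "npv": tn / (tn + fn),          # reference-negative among cleared cases
    }


def cohens_kappa(tp: int, fp: int, tn: int, fn: int) -> float:
    """Chance-corrected agreement between two binary raters."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Expected agreement if the two raters labelled cases independently.
    expected_pos = ((tp + fp) / n) * ((tp + fn) / n)
    expected_neg = ((tn + fn) / n) * ((tn + fp) / n)
    expected = expected_pos + expected_neg
    return (observed - expected) / (1 - expected)


# Illustrative counts only (100 hypothetical cases).
metrics = triage_metrics(tp=40, fp=10, tn=45, fn=5)
kappa = cohens_kappa(tp=40, fp=10, tn=45, fn=5)
```

Kappa is lower than raw agreement because it subtracts the agreement expected by chance given each rater's marginal rates.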

Interpretation: A deep learning algorithm for triaging patients with suspected COVID-19 at fever clinics was developed and externally validated. Given its high accuracy across populations with varied COVID-19 prevalence, integration of this system into the standard clinical workflow could expedite identification of chest CT scans with imaging indications of COVID-19.
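Predictive values, unlike sensitivity and specificity, shift with disease prevalence. The sketch below applies Bayes' rule to the pooled sensitivity (0·923) and specificity (0·851) from the Findings. The prevalence values are illustrative assumptions, not the prevalences at the three study sites, and treating sensitivity and specificity as constant across sites is itself a simplifying assumption.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value at a given prevalence (Bayes' rule)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENSITIVITY = 0.923  # pooled value reported in the Findings
SPECIFICITY = 0.851  # pooled value reported in the Findings

# Hypothetical low, medium, and high prevalence settings.
for prevalence in (0.05, 0.30, 0.60):
    print(
        f"prevalence={prevalence:.2f} "
        f"PPV={ppv(SENSITIVITY, SPECIFICITY, prevalence):.3f} "
        f"NPV={npv(SENSITIVITY, SPECIFICITY, prevalence):.3f}"
    )
```

At fixed sensitivity and specificity, the positive predictive value rises and the negative predictive value falls as prevalence increases.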

Funding: Special Project for Emergency of the Science and Technology Department of Hubei Province, China.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • COVID-19 / diagnosis*
  • COVID-19 / diagnostic imaging
  • COVID-19 / pathology
  • COVID-19 / therapy
  • China
  • Deep Learning*
  • Female
  • Humans
  • Lung / diagnostic imaging
  • Male
  • Middle Aged
  • Reproducibility of Results
  • Retrospective Studies
  • Severity of Illness Index
  • Tomography, X-Ray Computed
  • Triage / methods*