External validation and comparison of the Brock model and Lung-RADS for the baseline lung cancer CT screening using data from the Korean Lung Cancer Screening Project

Eur Radiol. 2021 Jun;31(6):4004-4015. doi: 10.1007/s00330-020-07513-1. Epub 2020 Nov 25.

Abstract

Objectives: To validate and compare the performance of the Brock model and Lung CT Screening Reporting and Data System (Lung-RADS) on nodules detected by baseline CT screening.

Methods: We performed a secondary analysis of the Korean Lung Cancer Screening Project (K-LUCAS; ClinicalTrials.gov, NCT03394703), a nationwide, multicenter, prospective cohort study. From April 2017 to December 2018, low-dose CT screening was performed on high-risk subjects. Discrimination and calibration of Brock models 2a and 2b (i.e., full model without and with spiculation, respectively) were assessed, and discrimination was compared with that of Lung-RADS, which utilized subjective assessment categories 2b (b stands for benign) and 4X.

Results: Of the 13,150 subjects, 4578 were eligible (median age 62 years; 4458 men; 9929 nodules including 40 lung cancers). Areas under the receiver operating characteristic curve were 0.96 (IQR 0.92-0.99) for Brock model 2a, 0.96 (IQR 0.92-0.99) for Brock model 2b, and 0.95 (IQR 0.91-0.99) for Lung-RADS (p = 0.32 and p = 0.34, respectively). At an equivalent cutoff of 5%, Brock model 2b (sensitivity 87.5% [35/40]; specificity 93.6% [9259/9889]; positive predictive value [PPV] 5.3% [35/665]; negative predictive value [NPV] 99.9% [9259/9264]) and Lung-RADS (sensitivity 87.5% [35/40]; specificity 93.3% [9222/9889]; PPV 5.0% [35/702]; NPV 99.9% [9222/9227]) performed similarly well (all p > 0.05). The calibration performance of both Brock models 2a and 2b was poor (both p < 0.001).
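
The per-nodule percentages above follow directly from the quoted fractions. As a minimal sketch for checking that arithmetic, the snippet below recomputes them from 2x2 counts. The `Counts` class and `metrics` function are hypothetical names introduced here for illustration, and the false-positive counts (630 and 667) are not stated in the abstract but derived from the PPV denominators (665 - 35 and 702 - 35).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 table for one classifier at a fixed cutoff (per nodule)."""
    tp: int  # cancers flagged positive
    fn: int  # cancers missed
    tn: int  # benign nodules correctly called negative
    fp: int  # benign nodules flagged positive


def metrics(c: Counts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        "specificity": 100 * c.tn / (c.tn + c.fp),
        "ppv": 100 * c.tp / (c.tp + c.fp),
        "npv": 100 * c.tn / (c.tn + c.fn),
    }


# Counts taken or derived from the fractions quoted in the Results
# (fp values are an inference from the PPV denominators).
brock_2b = Counts(tp=35, fn=5, tn=9259, fp=630)
lung_rads = Counts(tp=35, fn=5, tn=9222, fp=667)

for name, counts in (("Brock 2b", brock_2b), ("Lung-RADS", lung_rads)):
    summary = ", ".join(f"{k} {v:.1f}%" for k, v in metrics(counts).items())
    print(f"{name}: {summary}")
```

With only 40 cancers among 9929 nodules (about 0.4% prevalence), a PPV near 5% alongside specificity above 93% is what these formulas would be expected to give, since false positives among the 9889 benign nodules dominate the PPV denominator.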

Conclusions: Lung-RADS, when reinforced with visual assessment-based categories, has a similar diagnostic performance to the Brock model for baseline CT scans.

Key points:

  • Brock model 2b and Lung CT Screening Reporting and Data System (Lung-RADS) demonstrated a similar discrimination performance for lung cancer in the baseline CT screening (areas under the receiver operating characteristic curve 0.96 vs. 0.95; p = 0.34).
  • When visual assessment-based categories were removed from Lung-RADS, specificity and positive predictive value were lower than those of Brock model 2b (p = 0.001 and p = 0.02, respectively).
  • The Brock model showed poor calibration (p < 0.001).

Keywords: Diagnostic screening programs; Early detection of cancer; Lung neoplasms; Multidetector computed tomography; Statistical models.

Publication types

  • Multicenter Study

MeSH terms

  • Early Detection of Cancer
  • Humans
  • Lung / diagnostic imaging
  • Lung Neoplasms* / diagnostic imaging
  • Male
  • Middle Aged
  • Prospective Studies
  • Republic of Korea
  • Tomography, X-Ray Computed

Associated data

  • ClinicalTrials.gov/NCT03394703