Quantitative CT and machine learning classification of fibrotic interstitial lung diseases

Chi Wan Koo; James M Williams; Grace Liu; Ananya Panda; Parth P Patel; Livia Maria M Frota Lima; Ronald A Karwoski; Teng Moua; Nicholas B Larson; Alex Bratt

doi:10.1007/s00330-022-08875-4

Eur Radiol. 2022 Dec;32(12):8152-8161. doi: 10.1007/s00330-022-08875-4. Epub 2022 Jun 9.

Affiliations

¹ Department of Radiology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA.
² Department of Radiology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA.
³ Mayo Clinic Alix School of Medicine, Mayo Clinic, Jacksonville, FL, USA.
⁴ Department of Information Technology, Division of Biomedical Imaging Resources, Mayo Clinic, Rochester, MN, USA.
⁵ Department of Medicine, Division of Pulmonary, Critical Care and Sleep Medicine, Mayo Clinic, Rochester, MN, USA.
⁶ Department of Quantitative Health Sciences, Division of Clinical Trials and Biostatistics, Mayo Clinic, Rochester, MN, USA.

PMID: 35678861
DOI: 10.1007/s00330-022-08875-4

Abstract

Objectives: To evaluate quantitative computed tomography (QCT) features and QCT feature-based machine learning (ML) models in classifying interstitial lung diseases (ILDs), and to compare the performance of QCT-ML and deep learning (DL) models.

Methods: We retrospectively identified 1085 patients with pathologically proven usual interstitial pneumonitis (UIP), nonspecific interstitial pneumonitis (NSIP), or chronic hypersensitivity pneumonitis (CHP) who underwent peri-biopsy chest CT. The Kruskal-Wallis test was used to evaluate associations between QCT features and each ILD. QCT features, patient demographics, and pulmonary function test (PFT) results were used to train eXtreme Gradient Boosting models (training/validation set n = 911), yielding 3 models: M1 = QCT features only; M2 = M1 plus age and sex; M3 = M2 plus PFT results. A DL model was also developed. Areas under the receiver operating characteristic curve (AUCs) and 95% confidence intervals (CIs) of the ML and DL models were compared for multiclass (UIP vs. NSIP vs. CHP) and binary (UIP vs. non-UIP) classification performance.
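The per-feature screening step can be illustrated with a minimal sketch, using only the Python standard library. It computes the Kruskal-Wallis H statistic for one feature across three groups, then adjusts a set of p-values for multiple testing. Several things here are assumptions rather than details from the abstract: the function names are hypothetical, the p-value uses the chi-square approximation (which for 3 groups, i.e. 2 degrees of freedom, reduces to exp(-H/2)), the Benjamini-Hochberg procedure stands in for the unspecified adjustment method, and the feature values are made up.

```python
import math
from collections import Counter

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return (H, p) for exactly three samples, e.g. UIP, NSIP, and CHP."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below assumes 3 groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard tie correction; equals 1 when all values are distinct.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)
    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2)

def benjamini_hochberg(p_values):
    """Assumed adjustment method: Benjamini-Hochberg adjusted p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, idx in enumerate(order):
        rank = m - offset  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Made-up values of one hypothetical QCT feature in each group.
uip, nsip, chp = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h_stat, p_value = kruskal_wallis([uip, nsip, chp])
```

In a screening of this kind, the test would be repeated for every feature and the resulting p-values passed together to the adjustment function before applying the 0.05 threshold.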

Results: The majority (69/78 [88%]) of QCT features successfully differentiated the 3 ILDs (adjusted p ≤ 0.05). All QCT-ML models achieved higher AUCs than the DL model (multiclass AUC micro-averages 0.910, 0.910, 0.925, and 0.798 and macro-averages 0.895, 0.893, 0.925, and 0.779 for M1, M2, M3, and DL, respectively; binary AUC 0.880, 0.899, 0.898, and 0.869 for M1, M2, M3, and DL, respectively). M3 demonstrated statistically significantly better performance than M2 (∆AUC: 0.015, CI: [0.002, 0.029]) for multiclass prediction.
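The distinction between micro- and macro-averaged multiclass AUC can be made concrete with a small sketch in plain Python. It assumes a one-vs-rest scheme, which the abstract does not specify: the macro-average is the unweighted mean of the per-class AUCs, while the micro-average pools every (case, class) pair into a single binary problem. The function names and the predicted probabilities are hypothetical and are not study data.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def multiclass_auc(y_true, prob_rows, classes):
    """Return (micro, macro, per_class) one-vs-rest AUCs."""
    per_class = {}
    pooled_labels, pooled_scores = [], []
    for k, cls in enumerate(classes):
        labels = [1 if y == cls else 0 for y in y_true]
        scores = [row[k] for row in prob_rows]
        per_class[cls] = binary_auc(labels, scores)
        # Micro-averaging pools the one-vs-rest problems before scoring.
        pooled_labels += labels
        pooled_scores += scores
    macro = sum(per_class.values()) / len(classes)
    micro = binary_auc(pooled_labels, pooled_scores)
    return micro, macro, per_class

# Made-up predicted probabilities, columns ordered as in `classes`.
classes = ["UIP", "NSIP", "CHP"]
y_true = ["UIP", "UIP", "NSIP", "CHP"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.3, 0.5],
]
micro, macro, per_class = multiclass_auc(y_true, prob_rows, classes)
```

On this toy input every per-class AUC is perfect, yet the micro-average is lower because scores are compared across classes once pooled, which is why the two averages can differ for the same model. As a side note, the reported ∆AUC of 0.015 between M3 and M2 is numerically consistent with the difference in the micro-averages (0.925 vs. 0.910), although the abstract does not state which average the comparison used.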

Conclusions: QCT features successfully differentiated pathologically proven UIP, NSIP, and CHP. While QCT-based ML models outperformed a DL model for classifying ILDs, further investigations are warranted to determine if QCT-ML, DL, or a combination will be superior in ILD classification.

Key points:
• Quantitative CT features successfully differentiated pathologically proven UIP, NSIP, and CHP.
• Our quantitative CT-based machine learning models demonstrated high performance in classifying UIP, NSIP, and CHP histopathology, outperforming a deep learning model.
• While our quantitative CT-based machine learning models performed better than a DL model, additional investigations are needed to determine whether either or a combination of both approaches delivers superior diagnostic performance.

Keywords: Chronic hypersensitivity pneumonitis; Interstitial lung disease; Machine learning; Nonspecific interstitial pneumonitis; Usual interstitial pneumonitis.

MeSH terms

Alveolitis, Extrinsic Allergic* / pathology
Humans
Idiopathic Interstitial Pneumonias* / pathology
Idiopathic Pulmonary Fibrosis* / pathology
Lung / diagnostic imaging
Lung / pathology
Lung Diseases, Interstitial* / diagnostic imaging
Machine Learning
Retrospective Studies
Tomography, X-Ray Computed / methods