Objectives: The interpretability of convolutional neural networks (CNNs) for classifying subsolid nodules (SSNs) is insufficient for clinicians. Our purpose was to develop CNN models to classify SSNs on CT images and to investigate image features associated with the CNN classification.
Methods: CT images containing SSNs with a diameter of ≤ 3 cm were retrospectively collected. We trained and validated CNNs using 5-fold cross-validation to classify SSNs into three categories (benign and preinvasive lesions [PL], minimally invasive adenocarcinoma [MIA], and invasive adenocarcinoma [IA]) that were histologically confirmed or followed up for 6.4 years. The mechanism by which the CNNs used human-recognizable CT image features was investigated and visualized with the gradient-weighted class activation map (Grad-CAM), separated activation channels and areas, and the DeepDream algorithm.
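The Grad-CAM step named in the Methods can be illustrated with a minimal sketch. It assumes the activations of one convolutional layer and the gradients of the class score with respect to those activations have already been extracted from a trained network; the function name `grad_cam` and the toy inputs are hypothetical and are not taken from the study.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Compute a Grad-CAM heatmap for one image and one target class.

    activations: K feature maps (H x W) from a chosen convolutional layer.
    gradients:   K maps (H x W) holding d(class score) / d(activation).
    Both are assumed to be precomputed; this sketch does no backpropagation.
    """
    if len(activations) != len(gradients):
        raise ValueError("activations and gradients need the same channel count")

    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = spatial average of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum of the feature maps across channels.
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, feature_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * feature_map[i][j]

    # ReLU keeps only regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Scale to [0, 1] for display, unless the map is all zeros.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy input (hypothetical): channel 0 supports the class, channel 1 opposes it.
toy_activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

In practice the heatmap would be upsampled to the CT patch size and overlaid on the image; that step is omitted here.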
Results: In 5-fold cross-validation, the accuracy was 93% for classifying 586 SSNs from 569 patients into three categories (346 benign and PL, 144 MIA, and 96 IA). The Grad-CAM successfully located the entire region of image features that determined the final classification. Activated areas in the benign and PL group were primarily smooth margins (p < 0.001) and ground-glass components (p = 0.033), whereas in the IA group, the activated areas were mainly part-solid (p < 0.001) and solid components (p < 0.001), lobulated shapes (p < 0.001), and air bronchograms (p < 0.001). However, the activated areas for MIA were variable. The DeepDream algorithm showed the image features that the CNN learned from the training dataset in a human-recognizable pattern.
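To make the 5-fold cross-validation on the reported class counts concrete, the following sketch assigns the 586 nodules to five folds. It assumes a stratified split by class and splits at the nodule level; the abstract specifies neither, nor how nodules from the same patient were handled, so the function `stratified_folds`, the label strings, and the seed are illustrative assumptions only.

```python
import random
from typing import Dict, List


def stratified_folds(labels: List[str], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds, keeping class proportions similar.

    Assumption: stratification by class label; the source does not state
    whether its folds were stratified.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class: Dict[str, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    # Shuffle within each class, then deal indices round-robin into the folds.
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class counts as reported: 346 benign and PL, 144 MIA, 96 IA.
labels = ["benign_PL"] * 346 + ["MIA"] * 144 + ["IA"] * 96
folds = stratified_folds(labels, k=5)

# Each fold serves once as the validation set; the other four form the training set.
for fold_id, validation_indices in enumerate(folds):
    held_out = set(validation_indices)
    training_indices = [i for i in range(len(labels)) if i not in held_out]
```

Because 586 nodules came from 569 patients, a patient-level grouping of the folds would be needed to keep nodules from one patient out of both training and validation sets; this sketch does not do that.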
Conclusion: This study provides medical evidence for interpreting the mechanism of CNNs, which helps support the clinical application of artificial intelligence.
Key points:
• CNN achieved high accuracy (93%) in classifying subsolid nodules on CT images into three categories: benign and preinvasive lesions, MIA, and IA.
• The gradient-weighted class activation map (Grad-CAM) located the entire region of image features that determined the final classification, and the visualization of the separated activated areas was consistent with radiologists' expertise for diagnosing subsolid nodules.
• DeepDream showed the image features that CNN learned from a training dataset in a human-recognizable pattern.
Keywords: Adenocarcinoma of lung; Artificial intelligence; Deep learning; X-ray computed tomography.
© 2021. European Society of Radiology.