Purpose: To investigate the effects of different methodologies on the performance of a deep learning (DL) model for differentiating high-grade from low-grade clear cell renal cell carcinoma (ccRCC).
Method: Patients with pathologically proven ccRCC diagnosed between October 2009 and March 2019 were assigned to a training or an internal test dataset, and an external test dataset was acquired from The Cancer Genome Atlas-Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) database. The effects of different methodologies on the performance of the DL model, including image cropping (IC), setting the attention level, selecting model complexity (MC), and applying transfer learning (TL), were compared using repeated-measures analysis of variance (ANOVA) and receiver operating characteristic (ROC) curve analysis. The performance of the DL model was evaluated by accuracy and ROC analysis on the internal and external tests.
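As an illustration of the two evaluation metrics named above, the following minimal sketch computes accuracy and the area under the ROC curve for a binary high-grade (1) versus low-grade (0) task. It is not the authors' code: the function names, the 0.5 decision threshold, and the example labels and scores are assumptions made for illustration only.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label.

    The 0.5 threshold is an assumption for illustration, not a value from the study.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = high grade, 0 = low grade.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_acc = accuracy(example_labels, example_scores)
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for test sets of the size reported here (20 patients each); a rank-based implementation would be preferable for large datasets.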
Results: In this retrospective study, 390 patients from one hospital were randomly assigned to the training (n = 370) or internal test (n = 20) dataset, and another 20 patients from the TCGA-KIRC database formed the external test dataset. IC, the attention level, MC, and TL had major effects on the performance of the DL model. The DL model that combined image crops smaller than three times the tumor diameter, no attention, a simple model, and TL achieved the best performance in the internal (accuracy [ACC] = 73.7 ± 11.6%, area under the ROC curve [AUC] = 0.82 ± 0.11) and external (ACC = 77.9 ± 6.2%, AUC = 0.81 ± 0.04) tests.
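The cropping criterion can be made concrete with a small sketch. It assumes a square crop centered on the tumor whose side length is a chosen multiple of the tumor diameter, clipped to the image bounds. The helper names, the pixel units, and the square-crop geometry are hypothetical and are not taken from the study.

```python
def crop_box(center, tumor_diameter_px, scale, image_shape):
    """Return (row_start, row_stop, col_start, col_stop) for a square crop.

    center            -- (row, col) of the tumor center, in pixels
    tumor_diameter_px -- tumor diameter, in pixels
    scale             -- crop side as a multiple of the diameter
                         (the best model reported above used a value below 3)
    image_shape       -- (rows, cols) of the 2D slice
    """
    half = int(round(scale * tumor_diameter_px / 2))
    rows, cols = image_shape
    r, c = center
    # Clip the box so it never extends outside the image.
    r0, r1 = max(0, r - half), min(rows, r + half)
    c0, c1 = max(0, c - half), min(cols, c + half)
    return r0, r1, c0, c1


def crop(image, box):
    """Crop a 2D image stored as a list of rows."""
    r0, r1, c0, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]


# Made-up example: a 40-pixel tumor centered in a 512 x 512 slice,
# cropped to twice its diameter.
example_box = crop_box((256, 256), 40, 2.0, (512, 512))
```

Near the image border the clipped crop is smaller than the requested size, so a real pipeline would typically pad or resize the result before passing it to the network.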
Conclusions: A CT-based DL model with simple IC can be conveniently applied for grading ccRCC in routine clinical practice.
Keywords: Artificial intelligence; Clear cell renal cell carcinoma; Deep learning; Radiomics; Tumor grading.
Copyright © 2020 Elsevier B.V. All rights reserved.