External Validation of Ultrasound Radiomics for Small (≤ 4 cm) Renal Mass Differentiation: A Comparison with Radiologists

Curr Med Imaging. 2023 Nov 29. doi: 10.2174/0115734056268527231117074443. Online ahead of print.

Abstract

Background: The incidence of renal cell carcinoma, especially among small renal masses (SRM; ≤ 4 cm), has increased. Pathological analysis has revealed a high proportion of benign masses, highlighting the urgent need for precise SRM differentiation.

Objectives: This research aimed to independently validate the performance of machine learning-based ultrasound (US) radiomics analysis in differentiating benign from malignant SRM, and to compare its performance with that of radiologists.

Methods: A total of 499 patients from two hospitals were retrospectively included in this study and divided into two cohorts. Radiomics features were extracted from US images. To obtain the most robust features, inter-observer correlation coefficient, Spearman correlation coefficient, and least absolute shrinkage and selection operator methods were applied for feature selection. Three models were developed on the training data using the stochastic gradient boosting algorithm: a clinical model, a radiomics model, and a combined model that integrated clinical factors and radiomics features. The performance of these models (discrimination, calibration, and clinical usefulness) was evaluated on the independent external validation data and compared with pooled radiologists' assessments.
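
As an illustration of the Spearman-based step in the feature selection described above, the following is a minimal sketch of a redundancy filter in pure Python. The abstract does not report the correlation cutoff or the order in which features were screened, so the threshold of 0.9, the keep-the-first-feature rule, and the names `spearman` and `drop_redundant` are assumptions made for illustration only; the feature values are invented toy numbers, not study data.

```python
def rank(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def drop_redundant(features, threshold=0.9):
    """Keep a feature only if |rho| with every kept feature is below threshold.

    `features` maps a feature name to its list of per-patient values.
    The threshold is a hypothetical choice, not taken from the study.
    """
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy example: "b" increases monotonically with "a", so it is dropped.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [3, 1, 4, 1, 5],
}
print(drop_redundant(toy))
```

In a pipeline like the one described, the surviving features would then be passed to the least absolute shrinkage and selection operator step and to the gradient boosting classifier; those steps are not sketched here.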

Results: The AUCs of the clinical, radiomics, and combined models were 0.844, 0.942, and 0.954, respectively. The radiomics and combined models significantly outperformed the clinical model (all p < 0.05), while no significant difference was observed between the two (p = 0.32). The radiomics and combined models showed good discrimination and calibration. Decision curve analysis showed that the combined model was clinically useful. Compared with the pooled radiologists' assessment (AUC, 0.799), the combined model showed superior classification performance (p < 0.01) and higher specificity (p < 0.01) with similar sensitivity (p = 0.62).
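
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC and a decision-curve net benefit can be computed from predicted probabilities. It uses the standard definitions (AUC as the probability that a malignant case is scored above a benign one; net benefit as TP/n minus FP/n weighted by the threshold odds). The function names and the labels and scores are invented for illustration and are not the study's code or data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (malignant, benign) pairs in which the
    malignant case has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit at one threshold probability (0 < threshold < 1).

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data (1 = malignant, 0 = benign); not taken from the study.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1]

print(auc(labels, scores))               # 0.875
print(net_benefit(labels, scores, 0.5))  # 0.25
```

A decision curve plots `net_benefit` over a range of thresholds and compares it with the "treat all" and "treat none" strategies; a model is considered useful at thresholds where its curve lies above both.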

Conclusion: The combined model incorporating clinical factors and radiomics features accurately distinguished benign from malignant SRM.

Keywords: decision curve analysis; machine learning; radiomics; renal cell carcinoma; small renal masses; ultrasound.