Robustness and performance of radiomic features in diagnosing cystic renal masses

Arda Könik; Nityanand Miskin; Yang Guo; Atul B Shinagare; Lei Qin

doi:10.1007/s00261-021-03241-2

Abdom Radiol (NY). 2021 Nov;46(11):5260-5267. doi: 10.1007/s00261-021-03241-2. Epub 2021 Aug 11.

Authors

Arda Könik¹, Nityanand Miskin², Yang Guo³, Atul B Shinagare⁴, Lei Qin⁵

Affiliations

¹ Imaging Department, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
² Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
³ Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
⁴ Department of Radiology, Brigham and Women's Hospital and Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
⁵ Imaging Department, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.

PMID: 34379150

Abstract

Purpose: We studied the inter-reader variability in manual delineation of cystic renal masses (CRMs) on computed tomography (CT) images and its effect on the performance of a machine learning algorithm in distinguishing benign from potentially malignant CRMs. In addition, we assessed whether including robust higher-order radiomic features improves classification performance over first-order features alone.

Methods: 230 CRMs were independently delineated by two radiologists. By applying a combination of random fluctuations, dilation, and erosion operations to the original regions of interest (ROIs), we generated four additional sets of synthetic ROIs that realistically capture the inter-reader variability, as confirmed by Dice coefficient measurements and visual assessment. We then identified the robust features as those with an intraclass correlation coefficient (ICC) above 0.85 across these datasets. We applied tenfold stratified cross-validation (CV) to train and test a random forest model for classifying CRMs as benign or potentially malignant.
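The ROI perturbation and overlap check described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2D binary mask stored as a set of (row, col) pixel coordinates, a one-pixel 4-connected structuring element, and an unbounded image grid, and it omits the random fluctuations. The function names `dice`, `dilate`, and `erode` are hypothetical.

```python
# Illustrative only: masks are sets of (row, col) pixel coordinates.
# The 4-connected neighborhood and one-pixel step are assumptions.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def dice(a, b):
    """Dice coefficient 2|A n B| / (|A| + |B|); 1.0 when both masks are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def dilate(mask):
    """Grow the mask by adding every 4-connected neighbor of a mask pixel."""
    return mask | {(r + dr, c + dc) for r, c in mask for dr, dc in NEIGHBORS}


def erode(mask):
    """Shrink the mask by keeping pixels whose four neighbors are all inside it."""
    return {
        (r, c)
        for r, c in mask
        if all((r + dr, c + dc) in mask for dr, dc in NEIGHBORS)
    }


# Toy 5x5 square ROI standing in for a reader's delineation.
roi = {(r, c) for r in range(5) for c in range(5)}
grown = dilate(roi)   # synthetic "larger" contour
shrunk = erode(roi)   # synthetic "smaller" contour

print(f"Dice(original, dilated) = {dice(roi, grown):.3f}")
print(f"Dice(original, eroded)  = {dice(roi, shrunk):.3f}")
```

Comparing the Dice values of such synthetic ROIs with the Dice value between the two readers' ROIs is one way to check that the perturbations are of a realistic size.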

Results: The mean area under the curve (AUC), sensitivity, specificity, positive predictive value, and negative predictive value were 0.87, 0.82, 0.90, 0.85, and 0.93, respectively. Using first-order features alone, the corresponding values were nearly identical.
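For reference, the threshold-based metrics reported here follow from the confusion-matrix counts of a single test fold, as in the sketch below. The counts are hypothetical and not taken from the study, the positive class is assumed to be "potentially malignant", and the helper names `safe_ratio` and `classification_metrics` are illustrative.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV, and NPV from confusion counts."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),  # true positive rate
        "specificity": safe_ratio(tn, tn + fp),  # true negative rate
        "ppv": safe_ratio(tp, tp + fp),          # positive predictive value
        "npv": safe_ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts for one fold, NOT taken from the study.
metrics = classification_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a cross-validated setting, such per-fold values would typically be averaged across folds to give the mean figures.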

Conclusion: AUC ranged from 0.83 ± 0.09 to 0.93 ± 0.04 for the robust and uncorrelated features and from 0.84 ± 0.09 to 0.91 ± 0.04 for the first-order features. Our study indicates that first-order features alone are sufficient for the classification of CRMs and that including higher-order features does not necessarily improve performance.

Keywords: Cystic renal mass; Inter-reader variability; Machine learning; Radiomics; Robust features.

MeSH terms

Humans
Kidney
Kidney Neoplasms* / diagnostic imaging
Machine Learning
Radiologists
Tomography, X-Ray Computed