Objectives: Distinguishing kidney stones from phleboliths can be a diagnostic challenge in patients undergoing unenhanced low-dose CT (LDCT) for acute flank pain. We sought to investigate the accuracy of radiomics combined with a machine-learning classifier in differentiating kidney stones from phleboliths on LDCT.
Methods: Radiomics features were extracted following semi-automatic segmentation of kidney stones and phleboliths in two independent consecutive cohorts of patients undergoing LDCT for acute flank pain. Radiomics features from the first cohort (n = 369 patients) were used to train a machine-learning model designed to distinguish kidney stones (n = 211) from phleboliths (n = 201). Classification performance was assessed on the second, independent cohort (i.e., the testing set; kidney stones n = 24, phleboliths n = 23) using overall accuracy, positive and negative predictive values (PPV and NPV), area under the receiver operating characteristic curve (AUC), and permutation testing.
Results: Our machine-learning classification model trained on radiomics features achieved an overall accuracy of 85.1% on the independent testing set, with an AUC of 0.902, a PPV of 81.5%, and an NPV of 90.0%. Classification accuracy was significantly better than chance on permutation testing (permutation p < 0.05).
Conclusion: Radiomics and machine learning enable accurate differentiation between kidney stones and phleboliths on LDCT in patients presenting with acute flank pain.
Key points: • Combining a machine-learning algorithm with radiomics features extracted from abdominopelvic calcifications on LDCT offers a highly accurate method for discriminating phleboliths from kidney stones. • Our radiomics and machine-learning model proved robust to differences in CT acquisition and reconstruction protocols when tested on an external independent cohort of patients with acute flank pain. • The high performance of the radiomics-based automatic classification model in differentiating phleboliths from kidney stones indicates its potential as a future diagnostic tool for equivocal abdominopelvic calcifications in the setting of suspected renal colic.
Keywords: Artificial intelligence; Lithiasis; Machine learning; Urinary tract.
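Worked check of the reported metrics: the sketch below shows how the accuracy, PPV, and NPV in the Results relate to a confusion matrix, and how a label-permutation p value can be estimated. The four cell counts are an assumption: the abstract does not report them, and they are simply the integers consistent with the stated testing-set sizes (24 kidney stones, 23 phleboliths) and percentages, taking kidney stone as the positive class. The permutation routine is likewise one common variant (shuffling the true labels against fixed predictions) and may differ from the procedure the authors used.

```python
import random

# Assumed confusion-matrix counts (not reported in the abstract); inferred
# from the stated testing-set sizes and percentages.
# Positive class = kidney stone, negative class = phlebolith.
TP, FN = 22, 2   # the 24 kidney stones in the testing set
TN, FP = 18, 5   # the 23 phleboliths in the testing set


def accuracy(tp, tn, fp, fn):
    """Fraction of all calcifications classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def ppv(tp, fp):
    """Positive predictive value: correct 'kidney stone' calls / all such calls."""
    return tp / (tp + fp)


def npv(tn, fn):
    """Negative predictive value: correct 'phlebolith' calls / all such calls."""
    return tn / (tn + fn)


def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """One-sided p value: share of label shuffles with accuracy >= observed."""
    rng = random.Random(seed)
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    shuffled = list(y_true)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        acc = sum(t == p for t, p in zip(shuffled, y_pred)) / n
        if acc >= observed:
            hits += 1
    # The +1 terms keep the estimate from being exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Rebuild per-calcification labels and predictions from the assumed counts.
y_true = [1] * (TP + FN) + [0] * (TN + FP)
y_pred = [1] * TP + [0] * FN + [0] * TN + [1] * FP

print(f"accuracy = {accuracy(TP, TN, FP, FN):.1%}")   # 85.1%
print(f"PPV      = {ppv(TP, FP):.1%}")                # 81.5%
print(f"NPV      = {npv(TN, FN):.1%}")                # 90.0%
print(f"permutation p = {permutation_p_value(y_true, y_pred):.4f}")
```

Under these assumed counts the three percentages match the values quoted in the Results. The AUC cannot be recomputed this way, since it depends on the classifier's continuous scores rather than on the thresholded predictions.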