Objectives: To evaluate the diagnostic performance of Kaiser score (KS) adjusted with the apparent diffusion coefficient (ADC) (KS+) and machine learning (ML) modeling.
Methods: A dataset of 402 malignant and 257 benign lesions was identified. Two radiologists assigned the KS. If a lesion with KS > 4 had ADC > 1.4 × 10⁻³ mm²/s, the KS was reduced by 4 to become KS+. To consider the full spectrum of ADC as a continuous variable, the KS and ADC values were used to train diagnostic models using 5 ML algorithms. Performance was evaluated using ROC analysis, and the AUCs were compared by the DeLong test. The sensitivity, specificity, and accuracy achieved using the thresholds of KS > 4, KS+ > 4, and ADC ≤ 1.4 × 10⁻³ mm²/s were obtained and compared by the McNemar test.
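The KS+ adjustment rule described in the Methods can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code: the function and constant names are hypothetical, and it assumes ADC is expressed in mm²/s and that a score above 4 is read as suspicious, as the thresholds in the Methods imply.

```python
# Hypothetical names; thresholds are the ones stated in the Methods.
ADC_THRESHOLD = 1.4e-3  # mm^2/s
KS_CUTOFF = 4           # scores above this value are read as suspicious


def kaiser_score_plus(ks: int, adc: float) -> int:
    """Return KS+: subtract 4 from KS when KS > 4 and ADC > 1.4e-3 mm^2/s."""
    if ks > KS_CUTOFF and adc > ADC_THRESHOLD:
        return ks - 4
    return ks


def is_suspicious(score: int) -> bool:
    """Apply the dichotomizing threshold (score > 4) used for KS and KS+."""
    return score > KS_CUTOFF
```

Under this rule, a lesion with KS = 5 and a high ADC is downgraded to KS+ = 1 and is no longer suspicious, whereas a lesion with KS = 11 and a high ADC becomes KS+ = 7 and remains above the threshold.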
Results: The ROC curves of KS, KS+, and all ML models had comparable AUC in the range of 0.883-0.921, significantly higher than that of ADC (0.837, p < 0.0001). The KS had sensitivity = 97.3% and specificity = 59.1%; the KS+ had sensitivity = 95.5% with specificity significantly improved to 68.5% (p < 0.0001). However, when set to the same sensitivity of 97.3%, KS+ could not improve specificity. In ML analysis, the logistic regression model had the best performance. At sensitivity = 97.3%, it achieved specificity = 65.3%; that is, compared to KS, 16 false positives could be avoided without affecting true cancer diagnosis (p = 0.0015).
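The figure of 16 avoided false positives follows from the specificity gain applied to the 257 benign lesions. The arithmetic can be checked with the short sketch below; the function name is hypothetical, and because the reported percentages are rounded, the result is an approximation rather than an exact lesion count.

```python
def false_positives_avoided(n_benign: int, spec_old: float, spec_new: float) -> int:
    """Approximate number of benign lesions reclassified as negative
    when specificity rises from spec_old to spec_new."""
    return round(n_benign * (spec_new - spec_old))


# Values taken from the abstract: 257 benign lesions,
# specificity 59.1% (KS) versus 65.3% (logistic regression model).
avoided = false_positives_avoided(257, 0.591, 0.653)
print(avoided)  # 257 * 0.062 = 15.9, which rounds to 16
```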
Conclusion: Using dichotomized ADC to modify KS to KS+ can improve specificity, but at the price of lowered sensitivity. Machine learning algorithms may be applied to consider the ADC as a continuous variable to build more accurate diagnostic models.
Key points:
• When ADC was used to modify the Kaiser score to KS+, the diagnostic specificity for the two independent readers improved by 9.4-9.7%, at the price of a slight decrease in sensitivity of 1.5-1.8%, and overall accuracy improved by 2.6-2.9%.
• When the KS and the continuous ADC values were combined to train models by machine learning algorithms, the diagnostic specificity achieved by the logistic regression model was significantly improved from 59.1 to 65.3% (p = 0.0015) while maintaining the high sensitivity of KS (97.3%), demonstrating the potential of ML modeling to further evaluate the contribution of ADC.
• When the sensitivity is set at the same level, the modified KS+ and the original KS have comparable specificity; therefore, KS+ with consideration of ADC may not offer much practical help, and the original KS without ADC remains an excellent, robust diagnostic method.
Keywords: Breast neoplasms; Diagnosis, Differential; Diffusion magnetic resonance imaging; Machine learning.
© 2022. The Author(s), under exclusive licence to European Society of Radiology.