Objectives: We aimed to build a survival prediction system by combining a highly accurate machine learning (ML) model with explainable artificial intelligence (AI) techniques to predict distant metastasis in patients with locoregionally advanced nasopharyngeal carcinoma (NPC) using magnetic resonance imaging (MRI)-based tumor burden features.
Materials and methods: A total of 1643 patients from three hospitals were enrolled according to set criteria. We employed ML to develop a survival model based on tumor burden signatures and all clinical factors. SHapley Additive exPlanations (SHAP) was used to explain prediction results and to interpret the complex non-linear relationships between features and distant metastasis. For comparison, we also constructed other models based on routinely used cancer stages, Epstein-Barr virus (EBV) DNA, or other clinical features. The concordance index (C-index), receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA) were used to assess the effectiveness of the models.
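As an illustration of the C-index used for model assessment, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name, argument names and toy values are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the study's code).

    times  -- follow-up time for each patient
    events -- 1 if distant metastasis was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk; the metric depends only on the ordering of the predicted risks, not on their scale.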
Results: Our proposed system performed consistently across independent cohorts, with C-indexes of 0.773, 0.766 and 0.760 in the training, internal validation and external validation sets, respectively. SHAP provided personalized protective and risk factors for each NPC patient and uncovered some novel non-linear relationships between features and distant metastasis. Furthermore, high-risk patients who received induction chemotherapy (ICT) plus concurrent chemoradiotherapy (CCRT) had better 5-year distant metastasis-free survival (DMFS) than those who received CCRT alone, whereas ICT + CCRT and CCRT alone yielded similar DMFS in low-risk patients.
Conclusions: The interpretable machine learning system demonstrated superior performance in predicting distant metastasis in locoregionally advanced NPC. High-risk patients might benefit from ICT.
Keywords: Machine learning; Nasopharyngeal carcinoma; Prognosis; Therapeutics; Tumor burden.
Copyright © 2021 Elsevier Ltd. All rights reserved.