Purpose: This multicenter retrospective study aims to identify reliable clinical and radiomic features to build machine learning models that predict progression-free survival (PFS) and overall survival (OS) in pancreatic ductal adenocarcinoma (PDAC) patients.
Methods: Between 2010 and 2020, pre-treatment contrast-enhanced CT scans of 287 patients with pathology-confirmed PDAC from two sites of the Hôpital Universitaire de Bruxelles (HUB) and from 47 hospitals within the HUB network were retrospectively analysed. Demographic, clinical, and survival data were also collected. The gross tumour volume (GTV) and the non-tumoral pancreas (RPV) were segmented semi-manually, and radiomics features were extracted from both. Patients from the two HUB sites formed the training dataset, while those from the 47 hospitals of the HUB network formed the testing dataset. A three-step method was used for feature selection. Several machine learning models based on the GradientBoostingSurvivalAnalysis estimator were trained and tested to predict OS and PFS. Model performance was assessed using the concordance index (C-index) and Kaplan-Meier curves. SHAP (SHapley Additive exPlanations) analysis was applied for post hoc interpretability.
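For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration under stated assumptions, not the study's implementation: the function name `harrell_c_index`, the toy follow-up times, event flags, and risk scores are all hypothetical, and pairs with tied times are simply skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times       : observed follow-up times
    events      : True if the event was observed, False if censored
    risk_scores : model output, higher score = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored.
toy_times = [5, 10, 12, 20]
toy_events = [True, False, True, True]
toy_risk = [0.9, 0.4, 0.1, 0.6]

c_index = harrell_c_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the C-index values reported in this abstract should be read.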
Results: A total of 107 radiomics features were extracted from each of the GTV and RPV. Seven feature subgroups were selected for each of OS and PFS, giving 14 in total: clinical, GTV, RPV, clinical & GTV, clinical & GTV & RPV, GTV-volume, and RPV-volume. Accordingly, 14 Gradient Boosting Survival Analysis models were trained and tested. In the testing dataset, the clinical & GTV model achieved the highest performance for OS (C-index: 0.72), while the clinical model performed best for PFS (C-index: 0.70).
Conclusions: An integrated approach combining clinical and radiomics features performed best for OS prediction, whereas clinical features alone performed best for PFS prediction.
Keywords: computed tomography (CT); pancreas; pancreatic ductal carcinomas; radiomics; survival analyses.