Background: Artificial intelligence (AI) models are emerging as promising tools to identify predictive features in health record data. Their application in clinical routine is still challenging, owing to technical limits and explainability issues in this specific setting. Patients receiving standard first-line immunotherapy with immune checkpoint inhibitors (ICI) for metastatic Non-Small-Cell Lung Cancer (NSCLC) are an interesting population for machine learning (ML), since up to 30 % of them do not benefit.
Methods: We retrospectively collected data on all consecutive patients with PD-L1 ≥ 50 % metastatic NSCLC treated with first-line ICI at our institution between 2017 and 2021. Demographic, laboratory, molecular and clinical data were retrieved manually or automatically, depending on the data source. The primary aim was to explore the feasibility of ML models in a routine clinical setting and to identify problems and solutions for everyday implementation. Early progression was used as a preliminary endpoint to test our algorithm.
Results: Out of 123 patients, 106 were included; 52/106 (49 %) had disease progression or died within 3 months of starting ICI. Early progression correlated with increased neutrophil percentage (>80 % of white blood cells), neutrophil/lymphocyte ratio (≥8) and lower-range PD-L1 status (<70 %) at baseline, which was consistent with the literature. Automated ML (AutoML) models run on our dataset reached precision scores around 80 %, with Voting Ensemble emerging as the best-performing model, while white-box models (such as SHapley Additive exPlanations) provided better explainability. In all AutoML models, laboratory features were the top selected features, whilst clinical ones needed more pre-processing before gaining relevance, which was consistent with the different data extraction methods (automatic versus manual) and missing data rates.
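The baseline cut-offs and the precision metric reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Baseline` record, its field names and units, and the helper functions are hypothetical, and the abstract reports these cut-offs as correlates of early progression, not as a validated decision rule. Precision is computed in its standard form, TP / (TP + FP).

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    """Hypothetical baseline record; field names and units are assumptions."""
    neutrophils: float  # absolute neutrophil count, 10^9/L
    lymphocytes: float  # absolute lymphocyte count, 10^9/L
    wbc: float          # total white blood cell count, 10^9/L
    pd_l1: float        # PD-L1 expression, % (cohort entry required >= 50 %)


def risk_flags(b: Baseline) -> dict:
    """Flag the three baseline features the abstract associates with early progression."""
    neutrophil_pct = 100.0 * b.neutrophils / b.wbc
    # Guard against division by zero when no lymphocytes are recorded.
    nlr = b.neutrophils / b.lymphocytes if b.lymphocytes > 0 else float("inf")
    return {
        "high_neutrophil_pct": neutrophil_pct > 80,  # >80 % of white blood cells
        "high_nlr": nlr >= 8,                        # neutrophil/lymphocyte ratio >= 8
        "lower_range_pd_l1": b.pd_l1 < 70,           # PD-L1 < 70 %
    }


def precision(y_true, y_pred) -> float:
    """Precision = TP / (TP + FP); returns 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    patient = Baseline(neutrophils=8.5, lymphocytes=1.0, wbc=10.0, pd_l1=55)
    print(risk_flags(patient))
    # Toy labels only (1 = early progression); not data from the study.
    print(precision([1, 0, 1, 1], [1, 1, 1, 0]))
```

In this reading, a precision of around 80 % would mean that roughly four in five patients predicted to progress early actually did so; it says nothing about how many early progressors were missed.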
Conclusions: Applying ML models is feasible in clinical practice, and they can reliably predict early progression during first-line ICI for metastatic NSCLC. Solving pre-analytical issues is key for future improvement, with a focus on automatic tools for data extraction, collection and explainability.
Keywords: Artificial intelligence application; Automated Machine Learning algorithms; Automatic data extraction; Explainable AI; White-box and black-box models pairing.
Copyright © 2024 Elsevier B.V. All rights reserved.