A comparison of machine learning methods for predicting recurrence and death after curative-intent radiotherapy for non-small cell lung cancer: Development and validation of multivariable clinical prediction models

Sumeet Hindocha; Thomas G Charlton; Kristofer Linton-Reid; Benjamin Hunter; Charleen Chan; Merina Ahmed; Emily J Robinson; Matthew Orton; Shahreen Ahmad; Fiona McDonald; Imogen Locke; Danielle Power; Matthew Blackledge; Richard W Lee; Eric O Aboagye

doi:10.1016/j.ebiom.2022.103911

EBioMedicine. 2022 Mar:77:103911. doi: 10.1016/j.ebiom.2022.103911. Epub 2022 Mar 3.

Affiliations

¹ Lung Unit, The Royal Marsden NHS Foundation Trust, Fulham Road, London SW3 6JJ, UK; AI for Healthcare Centre for Doctoral Training, Imperial College London, Exhibition Road, London SW7 2BX, UK; Department of Clinical Oncology, Institute of Cancer Research NIHR Biomedical Research Centre, London, UK; Cancer Imaging Centre, Department of Surgery and Cancer, Imperial College London, Du Cane Road, London W12 0NN, UK; Early Diagnosis and Detection Centre, National Institute for Health Research (NIHR) Biomedical Research Centre at The Royal Marsden NHS Foundation Trust and the Institute of Cancer Research, London.
² Guy's Cancer Centre, Guy's and St Thomas' NHS Foundation Trust, Great Maze Pond, London SE1 9RT, UK.
³ Cancer Imaging Centre, Department of Surgery and Cancer, Imperial College London, Du Cane Road, London W12 0NN, UK.
⁴ Lung Unit, The Royal Marsden NHS Foundation Trust, Fulham Road, London SW3 6JJ, UK; Department of Clinical Oncology, Institute of Cancer Research NIHR Biomedical Research Centre, London, UK; Cancer Imaging Centre, Department of Surgery and Cancer, Imperial College London, Du Cane Road, London W12 0NN, UK; Early Diagnosis and Detection Centre, National Institute for Health Research (NIHR) Biomedical Research Centre at The Royal Marsden NHS Foundation Trust and the Institute of Cancer Research, London.
⁵ Department of Clinical Oncology, Institute of Cancer Research NIHR Biomedical Research Centre, London, UK.
⁶ Lung Unit, The Royal Marsden NHS Foundation Trust, Downs Road, Sutton SM2 5PT, UK.
⁷ Clinical Trials Unit, Royal Marsden NHS Foundation Trust, Downs Road, Sutton SM2 5PT, UK.
⁸ Artificial Intelligence Imaging Hub, Royal Marsden NHS Foundation Trust, Downs Road, Sutton SM2 5PT, UK.
⁹ Lung Unit, The Royal Marsden NHS Foundation Trust, Fulham Road, London SW3 6JJ, UK; Department of Clinical Oncology, Institute of Cancer Research NIHR Biomedical Research Centre, London, UK.
¹⁰ Department of Clinical Oncology, Charing Cross Hospital, Fulham Palace Road, London W6 8RF, UK.
¹¹ Radiotherapy and Imaging, Institute of Cancer Research, 123 Old Brompton Road, London SW7 3RP, UK.
¹² Lung Unit, The Royal Marsden NHS Foundation Trust, Fulham Road, London SW3 6JJ, UK; Early Diagnosis and Detection Centre, National Institute for Health Research (NIHR) Biomedical Research Centre at The Royal Marsden NHS Foundation Trust and the Institute of Cancer Research, London; National Heart and Lung Institute, Imperial College, London, UK. Electronic address: [email protected].
¹³ Department of Clinical Oncology, Institute of Cancer Research NIHR Biomedical Research Centre, London, UK. Electronic address: [email protected].

Abstract

Background: Surveillance is universally recommended for non-small cell lung cancer (NSCLC) patients treated with curative-intent radiotherapy. High-quality evidence to inform optimal surveillance strategies is lacking. Machine learning demonstrates promise in accurate outcome prediction for a variety of health conditions. The purpose of this study was to utilise readily available patient, tumour, and treatment data to develop, validate and externally test machine learning models for predicting recurrence, recurrence-free survival (RFS) and overall survival (OS) at 2 years from treatment.

Methods: A retrospective, multicentre study of patients receiving curative-intent radiotherapy for NSCLC was undertaken. A total of 657 patients from 5 hospitals were eligible for inclusion. Data pre-processing derived 34 features for predictive modelling. Combinations of 8 feature reduction methods and 10 machine learning classification algorithms were compared, producing risk-stratification models for predicting recurrence, RFS and OS. Models were compared using 10-fold cross-validation and an external test set, and were benchmarked against TNM stage and performance status. The Youden index was derived from validation-set ROC curves to define the threshold separating high-risk and low-risk groups, and Kaplan-Meier analyses were performed.
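
As an illustration of the risk-grouping step described in the Methods, the following minimal Python sketch shows one common way to pick a threshold with the Youden index, J = sensitivity + specificity - 1, evaluated at each candidate cut-off on a model's predicted risk. It is not the authors' code: the function names `youden_threshold` and `stratify` are hypothetical, the labels and scores are invented toy values, and the convention that a score at or above the threshold means "high risk" is an assumption.

```python
from typing import List, Sequence, Tuple


def youden_threshold(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    Assumed convention: a case is predicted positive when score >= threshold.
    Ties in J keep the lowest threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best_threshold, best_j = float("nan"), -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate cut-off
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


def stratify(scores: Sequence[float], threshold: float) -> List[str]:
    """Label each case 'high' or 'low' risk relative to the threshold."""
    return ["high" if s >= threshold else "low" for s in scores]


# Invented toy validation data: 1 = event (e.g. recurrence), 0 = no event.
y_val = [0, 0, 0, 1, 1, 0, 1, 1]
p_val = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]

threshold, j_stat = youden_threshold(y_val, p_val)
groups = stratify(p_val, threshold)
print(threshold, j_stat, groups)
```

In a workflow like the one described, the threshold would be fixed on the validation set and then applied unchanged to the external test set, with the resulting high-risk and low-risk groups compared by Kaplan-Meier analysis.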

Findings: Median follow-up time was 852 days. Parameters were well matched across the training-validation and external test sets: mean age was 73 and 71 respectively, and recurrence, RFS and OS rates at 2 years were 43% vs 34%, 54% vs 47% and 54% vs 47% respectively. The respective validation and test set AUCs (with intervals in parentheses) were as follows: 1) RFS: 0·682 (0·575-0·788) and 0·681 (0·597-0·766), 2) Recurrence: 0·687 (0·582-0·793) and 0·722 (0·635-0·81), and 3) OS: 0·759 (0·663-0·855) and 0·717 (0·634-0·8). Our models were superior to TNM stage and performance status in predicting recurrence and OS.

Interpretation: This robust and ready-to-use machine learning method, validated and externally tested, sets the stage for future clinical trials of quantitative, personalised risk-stratification and surveillance following curative-intent radiotherapy for NSCLC.

Funding: A full list of funding bodies that contributed to this study can be found in the Acknowledgements section.

Keywords: Early detection; Machine learning; Non-small cell lung cancer; Overall survival; Prediction; Radiotherapy; Recurrence.

Publication types

Multicenter Study

MeSH terms

Carcinoma, Non-Small-Cell Lung* / diagnosis
Carcinoma, Non-Small-Cell Lung* / drug therapy
Carcinoma, Non-Small-Cell Lung* / radiotherapy
Humans
Lung Neoplasms* / diagnosis
Lung Neoplasms* / drug therapy
Lung Neoplasms* / radiotherapy
Machine Learning
Models, Statistical
Neoplasm Staging
Prognosis
Retrospective Studies