A Machine Learning Platform to Optimize the Translation of Personalized Network Models to the Clinic

Manuela Salvucci; Arman Rahman; Alexa J Resler; Girish M Udupi; Deborah A McNamara; Elaine W Kay; Pierre Laurent-Puig; Daniel B Longley; Patrick G Johnston; Mark Lawler; Richard Wilson; Manuel Salto-Tellez; Sandra Van Schaeybroeck; Mairin Rafferty; William M Gallagher; Markus Rehm; Jochen H M Prehn

doi:10.1200/CCI.18.00056

JCO Clin Cancer Inform. 2019 Apr:3:1-17. doi: 10.1200/CCI.18.00056.

Authors

Manuela Salvucci¹, Arman Rahman², Alexa J Resler¹, Girish M Udupi², Deborah A McNamara³, Elaine W Kay³, Pierre Laurent-Puig⁴, Daniel B Longley⁵, Patrick G Johnston⁵, Mark Lawler⁵, Richard Wilson⁵, Manuel Salto-Tellez⁵, Sandra Van Schaeybroeck⁵, Mairin Rafferty², William M Gallagher², Markus Rehm¹,⁶, Jochen H M Prehn¹

Affiliations

¹ Royal College of Surgeons in Ireland, Dublin, Ireland.
² OncoMark, Dublin, Ireland.
³ Beaumont Hospital, Dublin, Ireland.
⁴ Université Paris Descartes, Paris, France.
⁵ Queen's University Belfast, Belfast, United Kingdom.
⁶ University of Stuttgart, Stuttgart, Germany.

PMID: 30995124
DOI: 10.1200/CCI.18.00056

Abstract

Purpose: Dynamic network models predict clinical prognosis and inform therapeutic intervention by elucidating disease-driven aberrations at the systems level. However, the personalization of model predictions requires the profiling of multiple model inputs, which hampers clinical translation.

Patients and methods: We applied APOPTO-CELL, a prognostic model of apoptosis signaling, to showcase the establishment of computational platforms that require a reduced set of inputs. We designed two distinct and complementary pipelines: a probabilistic approach to exploit a consistent subpanel of inputs across the whole cohort (Ensemble) and a machine learning approach to identify a reduced protein set tailored for individual patients (Tree). Development was performed on a virtual cohort of 3,200,000 patients, with inputs estimated from clinically relevant protein profiles. Validation was carried out in an in-house stage III colorectal cancer cohort, with inputs profiled in surgical resections by reverse phase protein array (n = 120) and/or immunohistochemistry (n = 117).
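The idea of a reduced protein set tailored to individual patients can be illustrated with a minimal sketch. It assumes a decision tree in which a protein is profiled only when a split on the patient's path requires it. This is one plausible reading of the Tree pipeline, not the authors' implementation. The tree structure, protein names, thresholds, class labels, and the helper `predict_with_lazy_profiling` are all hypothetical placeholders.

```python
# Hypothetical decision tree encoded as nested dicts; leaves are class labels.
# Protein names, thresholds, and labels are illustrative placeholders only
# and are not taken from the study.
TREE = {
    "protein": "protein_A", "threshold": 1.0,
    "low": "class_0",
    "high": {
        "protein": "protein_B", "threshold": 0.5,
        "low": "class_1",
        "high": {
            "protein": "protein_C", "threshold": 2.0,
            "low": "class_0",
            "high": "class_1",
        },
    },
}


def predict_with_lazy_profiling(tree, measure):
    """Walk the tree, calling measure(protein) only when a split needs it.

    Returns the predicted label and the list of proteins that had to be
    measured for this particular patient.
    """
    measured = []
    node = tree
    while isinstance(node, dict):
        protein = node["protein"]
        value = measure(protein)
        measured.append(protein)
        node = node["low"] if value <= node["threshold"] else node["high"]
    return node, measured


# Two made-up patients with made-up protein levels.
patient_1 = {"protein_A": 0.4, "protein_B": 0.9, "protein_C": 3.1}
patient_2 = {"protein_A": 1.8, "protein_B": 0.7, "protein_C": 1.2}

label_1, used_1 = predict_with_lazy_profiling(TREE, patient_1.__getitem__)
label_2, used_2 = predict_with_lazy_profiling(TREE, patient_2.__getitem__)

print(label_1, used_1)
print(label_2, used_2)
```

In this toy case the first patient is classified after a single measurement, whereas the second requires all three proteins in the tree. Averaging the number of measured proteins over a cohort gives a per-patient input count of the kind reported for Tree.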

Results: Ensemble and Tree reproduced APOPTO-CELL predictions in the virtual patient cohort with 92% and 99% accuracy while decreasing the number of inputs to a consistent subset of three proteins (40% reduction) or a personalized subset of 2.7 proteins on average (46% reduction), respectively. Ensemble and Tree retained prognostic utility in the in-house colorectal cancer cohort. The association between the Ensemble accuracy and prognostic value (Spearman ρ = 0.43; P = .02) provided a rationale to optimize the input composition for specific clinical settings. Comparison between profiling by reverse phase protein array (gold standard) and immunohistochemistry (clinical routine) revealed that the latter is a suitable technology to quantify model inputs.
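The reported reductions are consistent with a full panel of five inputs. That figure is not stated in the abstract; it is inferred here from "three proteins (40% reduction)". The short sketch below checks the arithmetic under that assumption. `FULL_PANEL_SIZE` and `percent_reduction` are names introduced for illustration.

```python
# Assumption: the full model takes five protein inputs. This is inferred from
# "three proteins (40% reduction)" and is not stated explicitly in the abstract.
FULL_PANEL_SIZE = 5


def percent_reduction(inputs_used, full_panel_size=FULL_PANEL_SIZE):
    """Percentage of inputs saved relative to the full panel."""
    return 100.0 * (1.0 - inputs_used / full_panel_size)


ensemble_reduction = percent_reduction(3)    # consistent three-protein subpanel
tree_reduction = percent_reduction(2.7)      # average personalized subset size

print(round(ensemble_reduction), round(tree_reduction))
```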

Conclusion: This study provides a generalizable framework to optimize the development of network-based prognostic assays and, ultimately, to facilitate their integration in the routine clinical workflow.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Apoptosis*
Biomarkers, Tumor
Colorectal Neoplasms / diagnosis
Colorectal Neoplasms / etiology
Colorectal Neoplasms / metabolism
Computational Biology* / methods
Decision Support Systems, Clinical*
Decision Trees
Humans
Machine Learning*
Models, Biological*
Neoplasm Staging
Prognosis
Reproducibility of Results

Substances

Biomarkers, Tumor