An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication

Olivier Morin; Martin Vallières; Steve Braunstein; Jorge Barrios Ginart; Taman Upadhaya; Henry C Woodruff; Alex Zwanenburg; Avishek Chatterjee; Javier E Villanueva-Meyer; Gilmer Valdes; William Chen; Julian C Hong; Sue S Yom; Timothy D Solberg; Steffen Löck; Jan Seuntjens; Catherine Park; Philippe Lambin

doi:10.1038/s43018-021-00236-2

Nat Cancer. 2021 Jul;2(7):709-722. doi: 10.1038/s43018-021-00236-2. Epub 2021 Jul 22.

Authors

Olivier Morin¹, Martin Vallières²,³,⁴, Steve Braunstein², Jorge Barrios Ginart², Taman Upadhaya², Henry C Woodruff⁵,⁶, Alex Zwanenburg⁷,⁸,⁹,¹⁰,¹¹, Avishek Chatterjee³,⁵,⁶, Javier E Villanueva-Meyer¹², Gilmer Valdes²,¹³, William Chen², Julian C Hong²,¹⁴, Sue S Yom², Timothy D Solberg², Steffen Löck⁷, Jan Seuntjens³, Catherine Park², Philippe Lambin⁵,⁶

Affiliations

¹ Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA.
² Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA.
³ Medical Physics Unit, McGill University, Montréal, Quebec, Canada.
⁴ Department of Computer Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada.
⁵ The D-Lab, Department of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University, Maastricht, the Netherlands.
⁶ Department of Radiology and Nuclear Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, the Netherlands.
⁷ OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden - Rossendorf, Dresden, Germany.
⁸ National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany.
⁹ German Cancer Research Center (DKFZ), Heidelberg, Germany.
¹⁰ Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany.
¹¹ Helmholtz Association / Helmholtz-Zentrum Dresden - Rossendorf (HZDR), Dresden, Germany.
¹² Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA, USA.
¹³ Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA.
¹⁴ Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA.

PMID: 35121948
DOI: 10.1038/s43018-021-00236-2

Abstract

Despite widespread adoption of electronic health records (EHRs), most hospitals are not ready to implement data science research in the clinical pipelines. Here, we develop MEDomics, a continuously learning infrastructure through which multimodal health data are systematically organized and data quality is assessed with the goal of applying artificial intelligence for individual prognosis. Using this framework, currently composed of thousands of individuals with cancer and millions of data points over a decade of data recording, we demonstrate prognostic utility of this framework in oncology. As proof of concept, we report an analysis using this infrastructure, which identified the Framingham risk score to be robustly associated with mortality among individuals with early-stage and advanced-stage cancer, a potentially actionable finding from a real-world cohort of individuals with cancer. Finally, we show how natural language processing (NLP) of medical notes could be used to continuously update estimates of prognosis as a given individual's disease course unfolds.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence
Data Accuracy
Electronic Health Records*
Humans
Natural Language Processing
Neoplasms* / diagnosis

Grants and funding

FDN-143257/CIHR/Canada