A pathology foundation model for cancer diagnosis and prognosis prediction

Xiyue Wang; Junhan Zhao; Eliana Marostica; Wei Yuan; Jietian Jin; Jiayu Zhang; Ruijiang Li; Hongping Tang; Kanran Wang; Yu Li; Fang Wang; Yulong Peng; Junyou Zhu; Jing Zhang; Christopher R Jackson; Jun Zhang; Deborah Dillon; Nancy U Lin; Lynette Sholl; Thomas Denize; David Meredith; Keith L Ligon; Sabina Signoretti; Shuji Ogino; Jeffrey A Golden; MacLean P Nasrallah; Xiao Han; Sen Yang; Kun-Hsing Yu

doi:10.1038/s41586-024-07894-z

Nature. 2024 Oct;634(8035):970-978. doi: 10.1038/s41586-024-07894-z. Epub 2024 Sep 4.

Authors

Xiyue Wang^#^{1,2}, Junhan Zhao^#^{1,3}, Eliana Marostica^{1,4}, Wei Yuan⁵, Jietian Jin⁶, Jiayu Zhang⁵, Ruijiang Li², Hongping Tang⁷, Kanran Wang⁸, Yu Li⁹, Fang Wang¹⁰, Yulong Peng¹¹, Junyou Zhu¹², Jing Zhang⁵, Christopher R Jackson^{1,13,14}, Jun Zhang¹⁵, Deborah Dillon¹⁶, Nancy U Lin¹⁷, Lynette Sholl^{16,18}, Thomas Denize^{16,18}, David Meredith¹⁶, Keith L Ligon^{16,18}, Sabina Signoretti^{16,18}, Shuji Ogino^{16,19,20}, Jeffrey A Golden^{16,21}, MacLean P Nasrallah²², Xiao Han¹⁵, Sen Yang^{23,24}, Kun-Hsing Yu^{25,26,27}

Affiliations

¹ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
² Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
³ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
⁴ Division of Health Sciences and Technology, Harvard-Massachusetts Institute of Technology, Boston, MA, USA.
⁵ College of Biomedical Engineering, Sichuan University, Chengdu, China.
⁶ Department of Pathology, Sun Yat-sen University Cancer Center, Guangzhou, China.
⁷ Department of Pathology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, China.
⁸ Department of Radiation Oncology, Chongqing University Cancer Hospital, Chongqing, China.
⁹ Department of Pathology, Chongqing University Cancer Hospital, Chongqing, China.
¹⁰ Department of Pathology, The Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai, China.
¹¹ Department of Pathology, The First Affiliated Hospital of Jinan University, Guangzhou, China.
¹² Department of Burn, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.
¹³ Department of Pathology and Laboratory Medicine, Pennsylvania State University, Hummelstown, PA, USA.
¹⁴ Department of Pathology, Massachusetts General Hospital, Boston, MA, USA.
¹⁵ Tencent AI Lab, Shenzhen, China.
¹⁶ Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA.
¹⁷ Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
¹⁸ Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA.
¹⁹ Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
²⁰ Broad Institute of MIT and Harvard, Cambridge, MA, USA.
²¹ Department of Pathology, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
²² Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA.
²³ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
²⁴ Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
²⁵ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
²⁶ Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA.
²⁷ Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.

^# Contributed equally.

PMID: 39232164
DOI: 10.1038/s41586-024-07894-z

Abstract

Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task^{1,2}. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations³. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.
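
The abstract describes combining tile-level features into a whole-slide representation under weak (slide-level) supervision, but it does not specify the aggregation mechanism. As a purely illustrative sketch, the snippet below shows one generic way this kind of step is often done: attention-weighted pooling over tile feature vectors. It is not CHIEF's architecture; the function names, the scoring vector `attn_vector`, and the toy data are all assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tile_features, attn_vector):
    """Aggregate tile-level features into one slide-level vector.

    tile_features: list of equal-length feature vectors, one per tile.
    attn_vector:   hypothetical learned scoring vector of the same length.
    Returns (slide_vector, attention_weights).
    """
    # Score each tile by a dot product with the scoring vector.
    scores = [
        sum(f * a for f, a in zip(tile, attn_vector))
        for tile in tile_features
    ]
    weights = softmax(scores)
    dim = len(attn_vector)
    # Slide representation = attention-weighted average of tile features.
    slide_vector = [
        sum(w * tile[d] for w, tile in zip(weights, tile_features))
        for d in range(dim)
    ]
    return slide_vector, weights


# Toy data standing in for features from a pretrained tile encoder (assumed).
random.seed(0)
tiles = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
attn = [0.5, -0.25, 1.0, 0.0]
slide_vec, weights = attention_pool(tiles, attn)
```

In a weakly supervised setup of this kind, only the slide-level label would be used to train a classifier on `slide_vec`, and the attention weights would indicate which tiles contributed most to the slide-level output.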

Publication types

Validation Study

MeSH terms

Datasets as Topic
Deep Learning / standards
Female
Histocytochemistry*
Humans
Male
Neoplasms* / classification
Neoplasms* / diagnosis
Neoplasms* / pathology
Pathology, Clinical* / methods
Pathology, Clinical* / standards
Pattern Recognition, Automated* / methods
Pattern Recognition, Automated* / standards
Prognosis
Reproducibility of Results
Sensitivity and Specificity
Supervised Machine Learning / standards