A systematic pan-cancer study on deep learning-based prediction of multi-omic biomarkers from routine pathology images

Commun Med (Lond). 2024 Mar 15;4(1):48. doi: 10.1038/s43856-024-00471-5.

Abstract

Background: This pan-cancer study evaluates the potential of deep learning (DL) for molecular profiling of multi-omic biomarkers directly from hematoxylin and eosin (H&E)-stained whole slide images.

Methods: A total of 12,093 DL models predicting 4031 multi-omic biomarkers across 32 cancer types were trained and validated. The study included a broad range of genetic, transcriptomic, and proteomic biomarkers, as well as established prognostic markers, molecular subtypes, and clinical outcomes.

Results: Here we show that half of the models achieve an area under the curve (AUC) of 0.644 or higher; the top 25% reach an AUC of at least 0.719, and the top 5% exceed 0.834. Molecular profiling with image-based histomorphological features is found to be generally feasible for most of the investigated biomarkers and across different cancer types. The performance appears to be independent of tumor purity, sample size, and class ratio (prevalence), suggesting a degree of inherent predictability in histomorphology.
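The three headline figures correspond to the median, the 75th percentile, and the 95th percentile of the per-model AUC distribution. The following is a minimal sketch of how such a summary could be computed, not the authors' pipeline: the function names `pairwise_auc` and `summarize_aucs` are hypothetical, the AUC is computed with the standard pairwise (Mann-Whitney) definition, and the input values are toy data rather than study results.

```python
import numpy as np


def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Hypothetical helper for illustration; memory use grows with n_pos * n_neg.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Difference between every positive score and every negative score.
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def summarize_aucs(aucs):
    """Summarize a collection of per-model AUCs by three percentiles.

    'median' is the value reached by 50% of the models, 'top_25_percent'
    by the best 25% (75th percentile), and 'top_5_percent' by the best 5%
    (95th percentile).
    """
    aucs = np.asarray(aucs, dtype=float)
    q50, q75, q95 = np.percentile(aucs, [50, 75, 95])
    return {
        "median": float(q50),
        "top_25_percent": float(q75),
        "top_5_percent": float(q95),
    }


# Toy example: one model's AUC, then a summary over made-up per-model AUCs.
single_auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
summary = summarize_aucs([0.5, 0.6, 0.7, 0.8, 0.9])
```

In practice an established routine such as scikit-learn's `roc_auc_score` would likely replace the pairwise helper; it is written out here only to keep the sketch self-contained.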

Conclusions: The results demonstrate that DL holds promise for predicting a wide range of biomarkers across the omics spectrum using only H&E-stained histological slides of solid tumors. This paves the way for accelerating diagnosis and developing more precise treatments for cancer patients.

Plain language summary

Molecular profiling tests are used to check cancers for changes in certain genes, proteins, or other molecules. Results of such tests can be used to identify the most effective treatment for cancer patients. Faster and more accessible alternatives to standard tests are needed to improve cancer care. This study investigates whether deep learning (DL), a set of advanced computer techniques, can perform molecular profiling directly from routinely collected images of tumor specimens used for diagnostic purposes. Over 12,000 DL models were trained to predict thousands of biomarkers, and their performance was evaluated using statistical approaches. The results indicate that DL can detect molecular changes in a tumor from these images for many biomarkers and tumor types, showing that DL-based molecular profiling from images is possible. Introducing this type of approach into routine clinical workflows could potentially accelerate treatment decisions and improve outcomes.