Background and objectives: Deep learning using convolutional neural networks (CNNs) applied to hematoxylin & eosin (H&E)-stained slides numerically encodes histomorphological tumor features. Tumor heterogeneity is an emerging biomarker in colon cancer that is captured by these features, whereas microsatellite instability (MSI) is an established biomarker traditionally assessed by immunohistochemistry or polymerase chain reaction.
Methods: H&E-stained slides from The Cancer Genome Atlas (TCGA) colon cohort were passed through a CNN. The resulting imaging features were used to cluster morphologically similar slide regions. Tile-level pairwise similarities were calculated and used to generate a tumor heterogeneity score (THS). Patient-level THS was then correlated with TCGA-reported biomarkers, including MSI status.
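The abstract does not give the THS formula, so the following is only a minimal sketch of how a heterogeneity score could be derived from tile-level pairwise similarities and then correlated with a biomarker. It assumes cosine similarity between tile feature vectors and defines the score as one minus the mean pairwise similarity; both choices, and the function names `cosine_similarity`, `heterogeneity_score`, and `pearson_r`, are hypothetical and are not the authors' method.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def heterogeneity_score(tile_features):
    """Assumed score: 1 minus the mean pairwise cosine similarity of a patient's tiles.

    Identical tiles give a score near 0; dissimilar tiles give a higher score.
    """
    pairs = list(combinations(tile_features, 2))
    if not pairs:
        return 0.0  # fewer than two tiles: no pairwise information
    mean_similarity = sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
    return 1.0 - mean_similarity


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy feature vectors standing in for CNN tile embeddings (illustrative only).
    uniform_patient = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
    mixed_patient = [[1.0, 0.0], [0.0, 1.0]]
    print(heterogeneity_score(uniform_patient))
    print(heterogeneity_score(mixed_patient))
```

In this sketch, each patient's score would be computed from that patient's tiles, and `pearson_r` would then be applied across patients to a continuous biomarker such as a sequencing-derived score.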
Results: H&E-stained images from 313 patients generated 534 771 tiles. Deep learning automatically identified and annotated cells by type and clustered morphologically similar slide regions. MSI-high tumors demonstrated significantly higher THS than MSS/MSI-low tumors (p < 0.001). THS was higher in MLH1-silent versus non-silent tumors (p < 0.001). The sequencing-derived MSIsensor score also correlated with THS (r = 0.51, p < 0.0001).
Conclusions: Deep learning provides spatially resolved visualization of imaging-derived biomarkers and automated quantification of tumor heterogeneity. Our novel THS correlates with MSI status, indicating that, with expanded training sets, translational tools could be developed to predict MSI status from H&E-stained images alone.
Keywords: colon cancer; deep learning; microsatellite instability; tumor heterogeneity.
© 2022 Wiley Periodicals LLC.