Lung cancer is the leading cause of cancer-related death in the United States, with lung adenocarcinoma the major subtype, accounting for 40% of all cases. To improve patient survival, image-based prognostic models have been developed, taking advantage of the ready availability of pathological images at diagnosis. However, the application of these models is hampered by two main challenges: the lack of publicly available image datasets with high-quality survival information and the poor interpretability of conventional convolutional neural network models. Here, we integrated matched transcriptomic and H&E-stained image data from The Cancer Genome Atlas (TCGA) to develop an image-based prognostic model, termed the Deep-learning-based Cell Graph (DeepCG) model. Instead of survival data, we used a gene signature to predict patient prognostic risks, which were then used as labels for training DeepCG. Importantly, by employing graph structures to capture cell patterns, DeepCG can provide cell-level interpretation, which is more biologically relevant than previous region-level insights. We validated the prognostic value of DeepCG in independent datasets and demonstrated its ability to identify prognostically informative cells in images.
Keywords: H&E; graph neural network; lung adenocarcinoma; prognosis.
© 2024 UICC.