Bayesian risk prediction model for colorectal cancer mortality through integration of clinicopathologic and genomic data

NPJ Precis Oncol. 2023 Jun 10;7(1):57. doi: 10.1038/s41698-023-00406-8.

Abstract

Routine tumor-node-metastasis (TNM) staging of colorectal cancer is imperfect in predicting survival due to tumor pathobiological heterogeneity and imprecise assessment of tumor spread. We leveraged Bayesian additive regression trees (BART), a statistical learning technique, to comprehensively analyze patient-specific tumor characteristics for the improvement of prognostic prediction. Of 75 clinicopathologic, immune, microbial, and genomic variables in 815 stage II-III patients within two U.S.-wide prospective cohort studies, the BART risk model identified seven stable survival predictors. Risk stratifications (low risk, intermediate risk, and high risk) based on model-predicted survival were statistically significant (hazard ratios 0.19-0.45, vs. higher risk; P < 0.0001) and could be externally validated using The Cancer Genome Atlas (TCGA) data (P = 0.0004). BART demonstrated model flexibility, interpretability, and comparable or superior performance to other machine-learning models. Integrated bioinformatic analyses using BART with tumor-specific factors can robustly stratify colorectal cancer patients into prognostic groups and be readily applied to clinical oncology practice.