Explainable Tree-Based Predictions for Unplanned 30-Day Readmission of Patients With Cancer Using Clinical Embeddings

JCO Clin Cancer Inform. 2021 Feb:5:155-167. doi: 10.1200/CCI.20.00127.

Abstract

Purpose: Thirty-day unplanned readmission is one of the key components in measuring quality of patient care. Risk of readmission in oncology patients may be associated with a wide variety of specific factors, including laboratory results and diagnoses, and traditional approaches such as one-hot encoding make it hard to include all such features in predictive models.
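
To illustrate the encoding problem described above, the following minimal sketch contrasts a one-hot vector, whose width grows with the number of distinct clinical concepts, with a fixed-width embedding representation. The vocabulary, the 2-dimensional vectors, and the mean-pooling step are all hypothetical choices made for illustration; the abstract does not state how the study built or aggregated its embeddings.

```python
# Hypothetical toy vocabulary; none of these concepts or values come from the study.
VOCAB = [
    "dx:neutropenia",
    "dx:sepsis",
    "lab:low_albumin",
    "lab:high_creatinine",
    "dx:anemia",
]

# Assumed 2-dimensional vector per concept (real embeddings would be learned from data).
EMBEDDINGS = {
    "dx:neutropenia": [0.9, 0.1],
    "dx:sepsis": [0.8, 0.3],
    "lab:low_albumin": [0.2, 0.7],
    "lab:high_creatinine": [0.1, 0.9],
    "dx:anemia": [0.5, 0.5],
}


def one_hot(concepts):
    """One column per vocabulary entry, so the width grows with the vocabulary."""
    present = set(concepts)
    return [1 if c in present else 0 for c in VOCAB]


def mean_embedding(concepts):
    """Average the concept vectors, so the width stays at the embedding size.

    Mean pooling is an assumption made for this sketch.
    """
    vectors = [EMBEDDINGS[c] for c in concepts]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


visit = ["dx:neutropenia", "lab:low_albumin"]
print("one-hot width:", len(one_hot(visit)))           # equals len(VOCAB)
print("embedding width:", len(mean_embedding(visit)))  # equals the embedding size
```

With thousands of diagnosis and laboratory codes, the one-hot vector would have thousands of mostly zero columns, whereas the embedding width would stay fixed regardless of vocabulary size.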

Methods: We used clinical embeddings to represent complex medical concepts in lower-dimensional spaces. For predictive modeling, we used gradient-boosted trees and adopted the Shapley additive explanations (SHAP) framework to offer consistent individualized predictions. We used retrospective inpatient data from 2013 to 2018, with a temporal split for training and testing.
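
As a rough illustration of what Shapley-based explanations provide, the sketch below computes exact Shapley values by brute force for a single patient under a small, hand-written risk function. The risk function, feature names, and baseline values are hypothetical, and brute-force enumeration against a single baseline is a simplification chosen for clarity; it is not the algorithm typically used for tree ensembles and should not be read as the study's implementation.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value
    (a simplifying assumption for this sketch).
    """
    n = len(x)
    features = range(n)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical tree-like readmission risk score (not from the study)."""
    prior_admissions, low_albumin, length_of_stay = z
    risk = 0.10
    if prior_admissions >= 2:
        risk += 0.20
        if low_albumin:
            risk += 0.15  # interaction between the two features
    if length_of_stay > 7:
        risk += 0.05
    return risk


patient = [3, 1, 10]   # hypothetical patient
baseline = [0, 0, 3]   # hypothetical reference patient
phi = shapley_values(toy_risk, patient, baseline)

for name, contribution in zip(["prior_admissions", "low_albumin", "length_of_stay"], phi):
    print(f"{name}: {contribution:+.3f}")
# Additivity: the contributions sum to the gap between the two predictions.
print(sum(phi), toy_risk(patient) - toy_risk(baseline))
```

The per-feature contributions add up to the difference between the patient's predicted risk and the baseline risk, which is what makes an individual prediction decomposable feature by feature.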

Results: Our best-performing model predicting readmission at discharge using clinical embeddings showed a testing area under the receiver operating characteristic curve (AUROC) of 0.78 (95% CI, 0.77 to 0.80). Use of clinical embeddings led to up to a 23.1% gain in area under the precision-recall curve (AUPRC) and a 6% gain in AUROC. Hematology models showed larger performance gains than surgery and medical oncology models.
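
For readers less familiar with the two metrics reported above, the following minimal sketch computes the area under the receiver operating characteristic curve, a step-wise estimate of the area under the precision-recall curve (average precision), and a percentage gain between two models. The labels and scores are made-up toy values, and treating the reported gains as relative improvements is an assumption; the abstract does not say whether they are relative or absolute.

```python
def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Assumes at least one positive label and no tied scores.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each positive
    return total / hits


def relative_gain(new, old):
    """Percentage improvement of `new` over `old` (assumed reading of 'gain')."""
    return 100.0 * (new - old) / old


# Toy data: 1 = readmitted within 30 days, 0 = not readmitted (hypothetical).
y = [1, 0, 1, 0, 0, 1]
one_hot_scores = [0.6, 0.5, 0.4, 0.3, 0.2, 0.7]    # hypothetical baseline model
embedding_scores = [0.8, 0.3, 0.6, 0.4, 0.1, 0.9]  # hypothetical embedding model

for name, fn in [("AUROC", auroc), ("AUPRC", average_precision)]:
    old, new = fn(y, one_hot_scores), fn(y, embedding_scores)
    print(f"{name}: {old:.3f} -> {new:.3f} ({relative_gain(new, old):+.1f}%)")
```

Because readmissions are typically a minority class, the precision-recall area tends to be more sensitive to improvements among high-risk patients than the receiver operating characteristic area, which may be why the two gains differ in size.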

Conclusion: To our knowledge, our study was the first to develop (1) explainable predictive models for the hematology population and (2) dynamic models that track readmission risk throughout a patient's visit.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Neoplasms* / therapy
  • Patient Discharge
  • Patient Readmission*
  • ROC Curve
  • Retrospective Studies