International Validation of the SORG Machine-learning Algorithm for Predicting the Survival of Patients with Extremity Metastases Undergoing Surgical Treatment

Clin Orthop Relat Res. 2022 Feb 1;480(2):367-378. doi: 10.1097/CORR.0000000000001969.

Abstract

Background: The Skeletal Oncology Research Group machine-learning algorithms (SORG-MLAs) estimate 90-day and 1-year survival in patients with long-bone metastases undergoing surgical treatment and have demonstrated good discriminatory ability on internal validation. However, the performance of a prediction model could potentially vary by race or region, and the SORG-MLA must be externally validated in an Asian cohort. Furthermore, the authors of the original developmental study did not consider the Eastern Cooperative Oncology Group (ECOG) performance status, a survival prognosticator repeatedly validated in other studies, in their algorithms because of missing data.

Questions/purposes: (1) Is the SORG-MLA generalizable to Taiwanese patients for predicting 90-day and 1-year mortality? (2) Is the ECOG score an independent factor associated with 90-day and 1-year mortality while controlling for SORG-MLA predictions?

Methods: All 356 patients who underwent surgery for long-bone metastases between 2014 and 2019 at one tertiary care center in Taiwan were included. Ninety-eight percent (349 of 356) of patients were of Han Chinese descent. The median (range) patient age was 61 years (25 to 95), 52% (184 of 356) were women, and the median BMI was 23 kg/m2 (13 to 39 kg/m2). The most common primary tumors were lung cancer (33% [116 of 356]) and breast cancer (16% [58 of 356]). Fifty-five percent (195 of 356) of patients presented with a complete pathologic fracture. Intramedullary nailing was the most commonly performed type of surgery (59% [210 of 356]), followed by plate screw fixation (23% [81 of 356]) and endoprosthetic reconstruction (18% [65 of 356]). Six patients were lost to follow-up within 90 days; 30 were lost to follow-up within 1 year. Eighty-five percent (301 of 356) of patients were followed until death or for at least 2 years. Survival was 82% (287 of 350) at 90 days and 49% (159 of 326) at 1 year. The model's performance metrics included discrimination (concordance index [c-index]), calibration (intercept and slope), and Brier score. In general, a c-index of 0.5 indicates random guess and a c-index of 0.8 denotes excellent discrimination. Calibration refers to the agreement between the predicted outcomes and the actual outcomes, with a perfect calibration having an intercept of 0 and a slope of 1. The Brier score of a prediction model must be compared with and ideally should be smaller than the score of the null model. A decision curve analysis was then performed for the 90-day and 1-year prediction models to evaluate their net benefit across a range of different threshold probabilities. A multivariate logistic regression analysis was used to evaluate whether the ECOG score was an independent prognosticator while controlling for the SORG-MLA's predictions. We did not perform retraining/recalibration because we were not trying to update the SORG-MLA algorithm in this study.

Results: The SORG-MLA had good discriminatory ability at both timepoints, with a c-index of 0.80 (95% confidence interval 0.74 to 0.86) for 90-day survival prediction and a c-index of 0.84 (95% CI 0.80 to 0.89) for 1-year survival prediction. However, the calibration analysis showed that the SORG-MLAs tended to underestimate Taiwanese patients' survival (90-day survival prediction: calibration intercept 0.78 [95% CI 0.46 to 1.10], calibration slope 0.74 [95% CI 0.53 to 0.96]; 1-year survival prediction: calibration intercept 0.75 [95% CI 0.49 to 1.00], calibration slope 1.22 [95% CI 0.95 to 1.49]). The Brier score of the 90-day and 1-year SORG-MLA prediction models was lower than their respective null model (0.12 versus 0.16 for 90-day prediction; 0.16 versus 0.25 for 1-year prediction), indicating good overall performance of SORG-MLAs at these two timepoints. Decision curve analysis showed SORG-MLAs provided net benefits when threshold probabilities ranged from 0.40 to 0.95 for 90-day survival prediction and from 0.15 to 1.0 for 1-year prediction. The ECOG score was an independent factor associated with 90-day mortality (odds ratio 1.94 [95% CI 1.01 to 3.73]) but not 1-year mortality (OR 1.07 [95% CI 0.53 to 2.17]) after controlling for SORG-MLA predictions for 90-day and 1-year survival, respectively.

Conclusion: SORG-MLAs retained good discriminatory ability in Taiwanese patients with long-bone metastases, although their actual survival time was slightly underestimated. More international validation and incremental value studies that address factors such as the ECOG score are warranted to refine the algorithms, which can be freely accessed online at https://sorg-apps.shinyapps.io/extremitymetssurvival/.

Level of evidence: Level III, therapeutic study.

Publication types

  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Bone Neoplasms / mortality*
  • Bone Neoplasms / secondary*
  • Bone Neoplasms / surgery
  • Extremities / pathology
  • Extremities / surgery
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Postoperative Period
  • Predictive Value of Tests
  • Prognosis
  • Taiwan