Background: Surgical outcome prediction models are useful for many purposes, including informed consent, shared decision making, preoperative mitigation of modifiable risk, and risk-adjusted quality measures. The recently reported Surgical Risk Preoperative Assessment System (SURPAS) universal risk calculators were developed using 2005-2012 American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) data, and they demonstrated excellent overall and specialty-specific performance. However, surgeons must assess whether universal calculators are accurate for the small subset of procedures they perform. To our knowledge, SURPAS has not been tested in a subset of patients undergoing lower-extremity total joint arthroplasty (TJA).
Questions/purposes: How accurate are SURPAS models' predictions for patients undergoing TJA?
Methods: We identified an internal subset of patients undergoing non-emergency THA or TKA from the 2012 ACS-NSQIP, the most recent year of the SURPAS development dataset. To assess the accuracy of SURPAS prediction models, 30-day postoperative outcomes were defined as in the original SURPAS study: mortality, overall morbidity, and six complication clusters (pulmonary, infectious, cardiac or transfusion, renal, venous thromboembolic, and neurologic). We calculated predicted outcome probabilities by applying coefficients from the published SURPAS logistic regression models to the TJA cohort. Discrimination was assessed with C-indexes, and calibration was assessed with Hosmer-Lemeshow 10-group chi-square tests and decile plots.
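The validation steps described in Methods can be illustrated with a minimal sketch. It is not the study's code: the function names are hypothetical, the coefficients and cohort in the usage lines are made-up placeholders (not the published SURPAS coefficients or ACS-NSQIP data), and the grouping rule for the Hosmer-Lemeshow statistic (equal-sized groups of sorted predictions) is an assumption. The sketch applies fixed logistic regression coefficients to obtain predicted probabilities, computes a C-index as the proportion of concordant event/non-event pairs, and computes a grouped Hosmer-Lemeshow chi-square statistic.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Apply fixed logistic regression coefficients to one patient's predictors."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def c_index(probs, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied predictions count as half-concordant."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (chi-square statistic, table) where table rows are
    (group size, observed events, expected events) per risk group."""
    ordered = sorted(zip(probs, outcomes))
    n = len(ordered)
    stat = 0.0
    table = []
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
        table.append((size, observed, expected))
    return stat, table


# Hypothetical placeholder model and cohort, for illustration only.
intercept, coefs = -4.0, [0.03, 0.8]           # assumed: age (years), one binary risk factor
cohort = [([55, 0], 0), ([62, 0], 0), ([70, 1], 0), ([81, 1], 1), ([77, 0], 0), ([85, 1], 1)]
probs = [predicted_probability(intercept, coefs, x) for x, _ in cohort]
outcomes = [y for _, y in cohort]
print("C-index:", c_index(probs, outcomes))
print("Hosmer-Lemeshow:", hosmer_lemeshow(probs, outcomes, groups=3))
```

The statistic is conventionally compared against a chi-square distribution to obtain a p value, and the per-group observed and expected counts in the returned table are the quantities shown in a decile plot.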
Results: The 30-day postoperative mortality rate for TJA was 0.1%, substantially lower than the 1% mortality rate in the SURPAS development dataset. The most common postoperative complications for TJA were intraoperative or postoperative transfusion (16%), urinary tract infection (5%), and vein thrombosis (3%). The C-indexes for joint arthroplasty ranged from 0.56 for venous thromboembolism (95% CI 0.53 to 0.59 versus SURPAS C-index 0.78) to 0.82 for mortality (95% CI 0.76 to 0.88 versus SURPAS C-index 0.94). All joint arthroplasty C-index estimates, including CIs, were lower than those reported in the original SURPAS development study. Decile plots and Hosmer-Lemeshow tests indicated poor calibration. Observed mortality rates were lower than expected for patients in all risk deciles (lowest decile: no observed deaths, 0.0% versus expected 0.1%; highest decile: observed mortality 0.7% versus expected 2%; p < 0.001). Conversely, observed morbidity rates were higher than expected across all risk deciles (lowest decile: observed 12% versus expected 8%; highest decile: observed morbidity 32% versus expected 25%; p < 0.001).
Conclusions: The universal SURPAS risk models have lower accuracy for TJA procedures than they do for the wider range of procedures in which the SURPAS models were originally developed.
Clinical relevance: These results suggest that SURPAS model estimates must be evaluated for individual surgical procedures or within restricted groups of related procedures such as joint arthroplasty. Given substantial variation in patient populations and outcomes across numerous surgical procedures, universal perioperative risk calculators may not produce accurate and reliable results for specific procedures. Surgeons and healthcare administrators should use risk calculators developed and validated for the specific procedures most relevant to each decision. Continued work is needed to assess the accuracy of universal risk calculators in narrower procedural categories based on similarity of outcome event rates and prevalence of predictive variables across procedures.