Limitations in the inter-observer reliability of EuroSCORE: what should change in EuroSCORE II?

Eur J Cardiothorac Surg. 2011 Dec;40(6):1304-8. doi: 10.1016/j.ejcts.2011.02.067. Epub 2011 Apr 15.

Abstract

Objectives: To carry out an in-depth single-centre analysis of the inter-observer reliability of the EuroSCORE (European System for Cardiac Operative Risk Evaluation) to propose changes for the EuroSCORE II.

Methods: Data for the EuroSCORE additive and logistic models were prospectively collected by surgeons (computer-assisted calculation; SurgAE and SurgLE) and perfusionists (on A4 data collection forms; PerfAE) for 1719 consecutive adult heart operations. The performance of the EuroSCORE was analysed first; inter-observer discrepancies in the score were then assessed globally and for each of its 17 risk factors.

Results: Hospital mortality was 4.3% (SurgAE and SurgLE: 5.3 and 7.3, respectively). The predictive ability and the calibration of the score were acceptable (area under the receiver operating characteristic curve: 0.75 for SurgAE and 0.753 for SurgLE, p = 0.98, Hosmer and Lemeshow goodness-of-fit test). Overall inter-observer concordance was satisfactory (Kappa coefficient: 0.71), but SurgAE and PerfAE differed in 26.3% of cases (SurgAE > PerfAE in 18.6%, and PerfAE > SurgAE in 7.7%). Five of the 17 risk factors accounted for most of the variability: left-ventricular ejection fraction, extracardiac arteriopathy, surgery other than isolated coronary artery bypass graft, recent myocardial infarction and pulmonary hypertension (with discrepancies noted in 7.6%, 5.3%, 5%, 3.9% and 3% of cases, respectively). Encoding mismatches for EuroSCORE items were attributed either to human errors of interpretation or to conflicting information in the charts. Both situations may reflect structural weaknesses of the EuroSCORE.
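
The concordance figures above combine a chance-corrected agreement statistic (Kappa) with the share of cases in which one observer scored higher than the other. The sketch below shows how such figures could be computed from paired scores. It assumes an unweighted Cohen's kappa, which the abstract does not specify, and the function names and the ten paired scores are invented for illustration only; they are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases with identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def discordance(rater_a, rater_b):
    """Fractions of cases where rater A scores higher, and where rater B does."""
    n = len(rater_a)
    a_higher = sum(a > b for a, b in zip(rater_a, rater_b)) / n
    b_higher = sum(b > a for a, b in zip(rater_a, rater_b)) / n
    return a_higher, b_higher


# Hypothetical additive scores (illustrative only, not from the study).
surg_ae = [3, 5, 5, 7, 2, 9, 4, 6, 5, 3]
perf_ae = [3, 5, 4, 7, 2, 8, 4, 7, 5, 3]

kappa = cohen_kappa(surg_ae, perf_ae)
surg_higher, perf_higher = discordance(surg_ae, perf_ae)
```

Applied to the study's paired SurgAE and PerfAE scores, the second function would correspond to the 18.6% and 7.7% split reported above. Because the additive score is ordinal, a weighted kappa would be a reasonable alternative to the unweighted form sketched here.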

Conclusions: The EuroSCORE is a widely used score, but its predictive power and reliability are declining due to changes in cardiac surgery case mix and outcomes in recent years. The present work highlights the fact that the encoding system in the EuroSCORE still leaves room for interpretation. Along with other possible modifications described elsewhere, it is suggested that the reliability and predictive ability of the score might be increased by changing some risk factor definitions and by using numeric values instead of intervals of values.

MeSH terms

  • Adult
  • Biomarkers / blood
  • Cardiac Surgical Procedures / adverse effects*
  • Comorbidity
  • Creatinine / blood
  • Epidemiologic Methods
  • Female
  • Humans
  • Male
  • Middle Aged
  • Myocardial Infarction / complications
  • Preoperative Care / methods
  • Reoperation
  • Risk Assessment / methods
  • Severity of Illness Index*

Substances

  • Biomarkers
  • Creatinine