Objectives: To validate SAPS II-AM, a recently customized version of the Simplified Acute Physiology Score II (SAPS II), in a larger cohort of Austrian intensive care patients and to evaluate the effect of the customization process on the ratio of observed to expected mortality (O/E ratio).
Design: Prospective, multicenter cohort study.
Patients and setting: A total of 2,901 patients consecutively admitted to 13 adult medical, surgical, and mixed intensive care units (ICUs) in Austria.
Measurements and results: After the database was randomly divided into a development sample (n = 1,450) and a validation sample (n = 1,451), logistic regression was used to develop a new model (SAPS II-AM2). The original SAPS II, the SAPS II-AM, and the newly developed SAPS II-AM2 were then compared in terms of calibration, discrimination, and O/E ratios. Differences in O/E ratios before and after customization (ΔO/E) were calculated. The Hosmer-Lemeshow goodness-of-fit H and C statistics revealed poor calibration of the original SAPS II on the database. The new model, SAPS II-AM2, outperformed the SAPS II-AM and performed excellently in the validation data set. However, mean O/E ratios varied widely among diagnostic categories (range 0.55-1.05 for the SAPS II). Moreover, the ΔO/E of the 13 ICUs ranged from -3.6% to +25%.
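The calibration measures named above can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names (`oe_ratio`, `delta_oe`, `hosmer_lemeshow_c`) are hypothetical, the C statistic is the textbook form computed over equal-sized groups of patients ordered by predicted risk, and treating ΔO/E as the simple difference "after minus before" is an assumption, since the abstract does not give the exact formula.

```python
from typing import Sequence


def oe_ratio(outcomes: Sequence[int], predicted: Sequence[float]) -> float:
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    expected = sum(predicted)
    if expected <= 0:
        raise ValueError("expected mortality must be positive")
    return sum(outcomes) / expected


def delta_oe(
    outcomes: Sequence[int],
    predicted_before: Sequence[float],
    predicted_after: Sequence[float],
) -> float:
    """Change in O/E ratio after customization.

    Assumption: a plain difference (after minus before); the source
    does not state how the reported percentages were derived.
    """
    return oe_ratio(outcomes, predicted_after) - oe_ratio(outcomes, predicted_before)


def hosmer_lemeshow_c(
    outcomes: Sequence[int], predicted: Sequence[float], groups: int = 10
) -> float:
    """Hosmer-Lemeshow C statistic over equal-sized risk-ordered groups."""
    pairs = sorted(zip(predicted, outcomes))  # order patients by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        # Binomial variance of the group: n_g * mean_risk * (1 - mean_risk)
        variance = expected * (1 - expected / n_g)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the study.
    died = [0, 0, 1, 1]
    risk_original = [0.5, 0.5, 0.75, 0.75]
    risk_customized = [0.25, 0.25, 0.75, 0.75]
    print(oe_ratio(died, risk_original))
    print(delta_oe(died, risk_original, risk_customized))
    print(hosmer_lemeshow_c(died, risk_customized, groups=2))
```

An O/E ratio of 1 means that observed mortality matches the model's prediction; a large C statistic relative to its chi-square reference distribution indicates poor calibration.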
Conclusions: Today's severity scoring systems, such as the SAPS II, are limited because they do not measure (or adjust for) a substantial part of what constitutes case mix. Changes in the distribution of patient characteristics (known and unknown) therefore affect prognostic accuracy. First-level customization was not able to solve all these problems. O/E ratios used for quality-of-care comparisons must therefore be interpreted critically, and possible confounding factors should be sought. In the case of unsatisfactory calibration, customized severity of illness models may be useful as an adjunct for quality control.