Background: To avoid biased estimates of standard errors in regression models, statisticians commonly limit the analytical dataset to one observation per patient.
Objective: To measure and explain changes in model performance when a model predicting 30-day risk of death or urgent readmission, derived on a dataset with one hospitalization per patient, was applied to all hospitalizations for study patients.
Methods: Using administrative data from Ontario, we identified all hospitalizations of 499,996 patients between 2004 and 2009. We calculated the expected risk of 30-day death or urgent readmission for each hospitalization using a validated model. We determined the observed-to-expected ratio after categorizing patients into quintiles of rates for hospitalization, emergent hospitalization and hospital days, and of total diagnostic risk score.
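The observed-to-expected ratio described in the Methods can be illustrated with a minimal sketch. It assumes each hospitalization carries a binary outcome (1 = death or urgent readmission within 30 days), a model-predicted risk, and a utilization measure for its patient. The function names (`observed_to_expected`, `quintile_labels`, `oe_by_quintile`) and the rank-based quintile assignment are illustrative assumptions, not the study's actual code.

```python
from collections import defaultdict


def observed_to_expected(outcomes, expected_risks):
    """Observed event count divided by the sum of model-predicted risks."""
    expected = sum(expected_risks)
    if expected == 0:
        raise ValueError("sum of expected risks is zero")
    return sum(outcomes) / expected


def quintile_labels(values):
    """Label each value with a quintile 1..5 by rank.

    Assumption: ties are split by input order (sorted() is stable);
    the study may have handled ties differently.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // n + 1
    return labels


def oe_by_quintile(outcomes, expected_risks, utilization):
    """Observed-to-expected ratio within each quintile of a utilization measure."""
    labels = quintile_labels(utilization)
    groups = defaultdict(lambda: ([], []))
    for y, p, q in zip(outcomes, expected_risks, labels):
        groups[q][0].append(y)
        groups[q][1].append(p)
    return {q: observed_to_expected(ys, ps) for q, (ys, ps) in sorted(groups.items())}
```

A ratio near 1 in a quintile means the model's predicted risks sum to roughly the number of events observed there; a ratio above 1 means the model underestimates risk in that group.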
Results: Study patients had a total of 858,410 hospitalizations. Compared with a dataset having one hospitalization per patient, model performance declined significantly when applied to all hospitalizations [c-statistic decreased from 0.768 to 0.730; the observed-to-expected ratio increased from 0.998 (95% confidence interval 0.977-0.999) to 1.305 (1.297-1.313)]. Model deterioration was most pronounced in patients with higher hospital utilization, with the observed-to-expected ratio increasing to 1.67 in the highest quintile of emergent hospitalization rates.
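For readers less familiar with the c-statistic reported in the Results, it can be read as the probability that a randomly chosen hospitalization followed by the outcome received a higher predicted risk than a randomly chosen hospitalization without it, with ties counted as one half. The following is a minimal sketch under that definition; the function name `c_statistic` and the all-pairs approach are illustrative assumptions, and the loop is quadratic in the number of records, so it is meant for illustration rather than for a dataset of this size.

```python
def c_statistic(outcomes, predicted_risks):
    """Concordance: share of (event, non-event) pairs ranked correctly.

    Ties in predicted risk count as half-concordant.
    """
    events = [p for y, p in zip(outcomes, predicted_risks) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted_risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of events from non-events.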
Conclusions: The accuracy of predicting 30-day death or urgent readmission decreased significantly when the unit of analysis changed from the patient to the hospitalization. Patients with heavy hospital utilization likely have characteristics, not adequately captured in the model, that increase the risk of death or urgent readmission after discharge from hospital. Adequately capturing the characteristics of such high-end hospital users may improve readmission models.
Keywords: hospital readmission; model performance; multivariate regression.
© 2012 John Wiley & Sons Ltd.