Statistical models incorporating cluster-specific intercepts are commonly used in hierarchical settings, for example, observations clustered within patients or patients clustered within hospitals. Predicted values of these intercepts are often used to identify or "flag" extreme or outlying clusters, such as poorly performing hospitals or patients with rapid declines in their health. We consider a variety of flagging rules, assessing different predictors and using different accuracy measures. Using theoretical calculations and comprehensive numerical evaluation, we show that previously proposed rules based on the two most commonly used predictors, the usual best linear unbiased predictor and the fixed effects predictor, perform extremely poorly: the incorrect flagging rates are either unacceptably high (approaching 0.5 in the limit) or so low that the rules are overly conservative (e.g., well below 0.05 for reasonable parameter values, leading to very low correct flagging rates). We develop novel methods for flagging extreme clusters that can control the incorrect flagging rates, including very simple-to-use versions that we call "self-calibrated." The new methods have substantially higher correct flagging rates than previously proposed methods for flagging extreme values, while still controlling the incorrect flagging rates. We illustrate their application using data on length of stay in pediatric hospitals for children admitted for asthma diagnoses.
Keywords: hierarchical model; predicted random effects; profiling; weighted prediction.
© The Author(s) 2024. Published by Oxford University Press on behalf of The International Biometric Society.
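
To make the two predictors named in the abstract concrete, the following is a minimal sketch, not the authors' method or their proposed flagging rules. It assumes a simple random-intercept model with known variance components, uses the unweighted grand mean as the estimate of the overall mean, and applies a naive fixed-threshold rule. The function names `predict_intercepts` and `flag`, the parameter names, and the toy data are all hypothetical.

```python
import statistics


def predict_intercepts(cluster_obs, sigma2_b, sigma2_e):
    """Return (fixed_effects, blup) predictions of cluster intercepts.

    Assumptions for this sketch (not taken from the paper):
      - model y_ij = mu + b_i + e_ij with Var(b_i) = sigma2_b, Var(e_ij) = sigma2_e
      - both variance components are treated as known
      - mu is estimated by the unweighted mean of all observations
    """
    all_obs = [y for obs in cluster_obs for y in obs]
    grand_mean = statistics.fmean(all_obs)

    fixed, blup = [], []
    for obs in cluster_obs:
        n = len(obs)
        # Fixed effects predictor: raw deviation of the cluster mean.
        deviation = statistics.fmean(obs) - grand_mean
        # BLUP: the same deviation shrunk toward zero; smaller clusters
        # and noisier data are shrunk more.
        shrink = sigma2_b / (sigma2_b + sigma2_e / n)
        fixed.append(deviation)
        blup.append(shrink * deviation)
    return fixed, blup


def flag(predictions, threshold):
    """Naive rule: flag clusters whose predicted intercept exceeds threshold."""
    return [i for i, p in enumerate(predictions) if p > threshold]


# Toy data: three clusters of three observations each (illustrative only).
clusters = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
fixed, blup = predict_intercepts(clusters, sigma2_b=1.0, sigma2_e=3.0)
flagged_fixed = flag(fixed, threshold=2.0)
flagged_blup = flag(blup, threshold=2.0)
```

In this toy case the shrinkage factor is 0.5, so the same threshold flags the third cluster under the fixed effects predictor but flags nothing under the BLUP. This shows only that the two predictors can disagree under a common rule; it says nothing about the incorrect or correct flagging rates studied in the paper.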