Quality metrics are fundamental to value-based payment reforms. Because metrics are key components used to drive performance, health care organizations participating in payment reforms should consider metric reliability, a measure of true performance versus statistical "noise." This cross-sectional study examined reliability, variation attributable to patient and clinician characteristics, and volume thresholds for 9 ambulatory quality metrics in a health system engaged in value-based payment reforms. Hierarchical mixed models were used to analyze data from 276 316 patients attributed to 4373 clinicians in 31 primary care clinics from 2015 to 2017. Reliability was lower for all metrics at the clinician level (range 6%-64%) than at the clinic level (84%-99%), with little variation related to patient or clinician characteristics. Few clinicians, but the majority of clinics, contributed sufficient volumes of patient encounters to meet a 70% reliability threshold. These findings suggest that clinic-level performance measurement may be more appropriate than individual clinician-level measurement, particularly in low-volume contexts.
Keywords: academic health systems; hierarchical regression models; payment systems design; performance measurement; quality improvement.