Does Incorporating a Measure of Clinical Workload Improve Workplace-Based Assessment Scores? Insights for Measurement Precision and Longitudinal Score Growth From Ten Pediatrics Residency Programs

Acad Med. 2018 Nov;93(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 57th Annual Research in Medical Education Sessions):S21-S29. doi: 10.1097/ACM.0000000000002381.

Abstract

Purpose: This study investigates how incorporating observer-reported workload into workplace-based assessment (WBA) scores affects (1) the psychometric characteristics of those scores and (2) the measurement of change in performance over time, comparing workload-unadjusted with workload-adjusted scores.

Method: Structured clinical observations and multisource feedback instruments were used to collect WBA data from first-year pediatrics residents at 10 residency programs between July 2016 and June 2017. Observers completed items in 8 subcompetencies associated with Pediatrics Milestones. Faculty and resident observers assessed workload using a sliding scale ranging from low to high; all item scores were rescaled to a 1-5 scale to facilitate analysis and interpretation. Workload-adjusted WBA scores were calculated at the item level using three different approaches, and aggregated for analysis at the competency level. Mixed-effects regression models were used to estimate variance components. Longitudinal growth curve analyses examined patterns of developmental score change over time.
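The rescaling step described in the Method can be illustrated with a simple linear mapping. This is a minimal sketch only: the abstract does not state the original range of the sliding scale or the exact transformation used, so the 0-100 slider range and the function name `rescale` are assumptions made for illustration.

```python
def rescale(value, old_min, old_max, new_min=1.0, new_max=5.0):
    """Linearly map a value from [old_min, old_max] onto [new_min, new_max]."""
    if old_max <= old_min:
        raise ValueError("old_max must be greater than old_min")
    # Position of the value within the original range, from 0.0 to 1.0
    fraction = (value - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


# Hypothetical slider readings on an assumed 0-100 scale (not study data)
slider_readings = [0, 25, 50, 100]
rescaled = [rescale(v, 0, 100) for v in slider_readings]
print(rescaled)  # [1.0, 2.0, 3.0, 5.0]
```

A linear mapping preserves the relative spacing of the original responses, which is one plausible reason to prefer it when items collected on different scales are pooled for analysis.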

Results: On average, participating residents (n = 252) were assessed 5.32 times (standard deviation = 3.79) by different raters during the data collection period. Adjusting for workload yielded better discrimination of learner performance and higher reliability, reducing measurement error by 28%. Reliability projections indicated that up to twice as many raters would be needed when workload-unadjusted scores were used. Longitudinal analysis showed an increase in scores over time, with a significant interaction between workload and time; workload also increased significantly over time.
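The link between variance components and the number of raters required can be sketched with a standard reliability projection, in which reliability for the mean of n raters is the learner variance divided by the learner variance plus the error variance over n. The abstract does not report the variance components or the exact projection method, so the formula choice, the function names, and every number below are hypothetical and serve only to show how a smaller error variance lowers the number of raters needed.

```python
import math


def projected_reliability(var_person, var_error, n_raters):
    """Reliability of a score averaged over n_raters observations."""
    return var_person / (var_person + var_error / n_raters)


def raters_needed(var_person, var_error, target):
    """Smallest number of raters whose averaged score reaches the target reliability."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve target = vp / (vp + ve / n) for n, then round up to a whole rater
    n = target * var_error / (var_person * (1 - target))
    return math.ceil(n)


# Hypothetical variance components (not taken from the study)
VAR_PERSON = 0.10
VAR_ERROR_UNADJUSTED = 0.50
VAR_ERROR_ADJUSTED = 0.25

n_unadjusted = raters_needed(VAR_PERSON, VAR_ERROR_UNADJUSTED, target=0.70)
n_adjusted = raters_needed(VAR_PERSON, VAR_ERROR_ADJUSTED, target=0.70)
print(n_unadjusted, n_adjusted)  # 12 6
```

With these made-up values, halving the error variance halves the number of raters required for a reliability of 0.70. In practice, an adjustment could change the learner variance as well as the error variance, so both components would need to be re-estimated.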

Conclusions: Incorporating a measure of observer-reported workload could improve the measurement properties and the ability to interpret WBA scores.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Clinical Competence*
  • Educational Measurement
  • Humans
  • Internship and Residency*
  • Pediatrics / education*
  • Psychometrics
  • Workload*