Video-Based Communication Assessment of Physician Error Disclosure Skills by Crowdsourced Laypeople and Patient Advocates Who Experienced Medical Harm: Reliability Assessment With Generalizability Theory

Andrew A White; Ann M King; Angelo E D'Addario; Karen Berg Brigham; Suzanne Dintzis; Emily E Fay; Thomas H Gallagher; Kathleen M Mazor

doi:10.2196/30988

JMIR Med Educ. 2022 Apr 29;8(2):e30988. doi: 10.2196/30988.

Authors

Andrew A White¹, Ann M King², Angelo E D'Addario², Karen Berg Brigham³, Suzanne Dintzis⁴, Emily E Fay⁵, Thomas H Gallagher¹,⁶, Kathleen M Mazor⁷

Affiliations

¹ Department of Medicine, University of Washington School of Medicine, Seattle, WA, United States.
² National Board of Medical Examiners, Philadelphia, PA, United States.
³ Collaborative for Accountability and Improvement, University of Washington, Seattle, WA, United States.
⁴ Department of Pathology, University of Washington School of Medicine, Seattle, WA, United States.
⁵ Department of Obstetrics and Gynecology, University of Washington School of Medicine, Seattle, WA, United States.
⁶ Department of Bioethics and Humanities, University of Washington, Seattle, WA, United States.
⁷ Meyers Primary Care Institute, University of Massachusetts Medical School, Worcester, MA, United States.

PMID: 35486423
PMCID: PMC9107044
DOI: 10.2196/30988

Abstract

Background: Residents may benefit from simulated practice with personalized feedback to prepare for high-stakes disclosure conversations with patients after harmful errors and to meet Accreditation Council for Graduate Medical Education mandates. Ideally, feedback would come from patients who have experienced communication after medical harm, but medical researchers and leaders have found it difficult to reach this community, which has made this approach impractical at scale. The Video-Based Communication Assessment app is designed to engage crowdsourced laypeople to rate physician communication skills but has not been evaluated for use with medical harm scenarios.

Objective: We aimed to compare the reliability of 2 assessment groups (crowdsourced laypeople and patient advocates) in rating physician error disclosure communication skills using the Video-Based Communication Assessment app.

Methods: Internal medicine residents used the Video-Based Communication Assessment app; the case, which consisted of 3 sequential vignettes, depicted a delayed diagnosis of breast cancer. Panels of patient advocates who have experienced harmful medical error, either personally or through a family member, and crowdsourced laypeople used a 5-point scale to rate the residents' error disclosure communication skills (6 items) based on audio-recorded responses. Ratings were aggregated across items and vignettes to create a numerical communication score for each physician. We used analysis of variance to compare the stringency of patient advocates and laypeople, and Pearson correlation to determine whether the rank order of physicians would be preserved between groups. We used generalizability theory to examine the difference in assessment reliability between patient advocates and laypeople.
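The aggregation and correlation steps described in the Methods can be illustrated with a minimal Python sketch. It assumes that a physician's communication score is the plain mean of all ratings that physician received; the abstract does not state the exact aggregation rule. The function names, physician labels, and toy ratings are hypothetical and are not study data.

```python
from collections import defaultdict
from math import sqrt


def communication_scores(ratings):
    """Return the mean rating per physician.

    `ratings` is an iterable of (physician_id, rating) pairs, one pair per
    rater x vignette x item. Taking a simple mean is an assumption.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for physician, value in ratings:
        totals[physician][0] += value
        totals[physician][1] += 1
    return {p: total / count for p, (total, count) in totals.items()}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy ratings on the 5-point scale (not data from the study).
advocate_ratings = [("A", 2), ("A", 3), ("B", 4), ("B", 4), ("C", 3), ("C", 2)]
layperson_ratings = [("A", 3), ("A", 3), ("B", 5), ("B", 4), ("C", 3), ("C", 3)]

advocate_scores = communication_scores(advocate_ratings)
layperson_scores = communication_scores(layperson_ratings)

# Align the two groups on the same physician order before correlating.
physicians = sorted(advocate_scores)
r = pearson_r(
    [advocate_scores[p] for p in physicians],
    [layperson_scores[p] for p in physicians],
)
print(advocate_scores, layperson_scores, round(r, 3))
```

In this toy data, the advocates score every physician half a point lower than the laypeople, yet the correlation is essentially 1 because the ordering of physicians is identical. This is the distinction the Methods draw between stringency (a difference in means) and preservation of rank order (correlation).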

Results: Internal medicine residents (n=20) used the Video-Based Communication Assessment app. All patient advocates (n=8) and 42 of 59 crowdsourced laypeople who had been recruited provided complete, high-quality ratings. Patient advocates rated communication more stringently than crowdsourced laypeople (patient advocates: mean 3.19, SD 0.55; laypeople: mean 3.55, SD 0.40; P<.001), but patient advocates' and crowdsourced laypeople's ratings of physicians were highly correlated (r=0.82, P<.001). Reliability for 8 raters and 6 vignettes was acceptable (patient advocates: G coefficient 0.82; crowdsourced laypeople: G coefficient 0.65). Decision studies estimated that 12 crowdsourced layperson raters and 9 vignettes would yield an acceptable G coefficient of 0.75.
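The decision-study logic behind these projections can be sketched in Python. The sketch assumes a fully crossed physician x rater x vignette design and uses the standard relative G coefficient for that design; the abstract does not report the design or the variance components, so the values below are hypothetical placeholders and will not reproduce the coefficients reported above.

```python
def g_coefficient(var_p, var_pr, var_pv, var_prv_e, n_raters, n_vignettes):
    """Relative G coefficient for a crossed physician x rater x vignette design.

    var_p     : variance due to physicians (the object of measurement)
    var_pr    : physician x rater interaction variance
    var_pv    : physician x vignette interaction variance
    var_prv_e : three-way interaction confounded with residual error
    """
    # Each error component shrinks as it is averaged over more conditions.
    relative_error = (
        var_pr / n_raters
        + var_pv / n_vignettes
        + var_prv_e / (n_raters * n_vignettes)
    )
    return var_p / (var_p + relative_error)


# Hypothetical variance components, chosen only to illustrate the calculation.
components = {"var_p": 0.10, "var_pr": 0.05, "var_pv": 0.04, "var_prv_e": 0.60}

# Project reliability for several candidate rater and vignette counts.
projections = {}
for n_raters, n_vignettes in [(8, 6), (12, 9), (20, 12)]:
    projections[(n_raters, n_vignettes)] = g_coefficient(
        **components, n_raters=n_raters, n_vignettes=n_vignettes
    )

for (n_raters, n_vignettes), g in projections.items():
    print(f"{n_raters} raters, {n_vignettes} vignettes: G = {g:.2f}")
```

Because every error term is divided by the number of raters, the number of vignettes, or both, adding raters or vignettes can only raise the projected coefficient. A decision study uses this property to find the smallest panel that reaches a target reliability.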

Conclusions: Crowdsourced laypeople may represent a sustainable source of reliable assessments of physician error disclosure skills. For a simulated case involving delayed diagnosis of breast cancer, laypeople correctly identified high and low performers. However, at least 12 raters and 9 vignettes are required to ensure adequate reliability, and future studies are warranted. Crowdsourced laypeople rate less stringently than raters who have experienced harm. Future research should examine the value of the Video-Based Communication Assessment app for formative assessment, summative assessment, and just-in-time coaching of error disclosure communication skills.

Keywords: communication; communication assessment; crowdsourcing; generalizability theory; graduate medical education; medical education; medical error; medical error disclosure; patient-centered care; simulation studies.

©Andrew A White, Ann M King, Angelo E D’Addario, Karen Berg Brigham, Suzanne Dintzis, Emily E Fay, Thomas H Gallagher, Kathleen M Mazor. Originally published in JMIR Medical Education (https://mededu.jmir.org), 29.04.2022.