Importance: Diagnostic acumen is a fundamental skill in the practice of medicine. Scalable, practical, and objective tools to assess diagnostic performance are lacking.
Objective: To validate a new method that uses automated techniques to assess physicians' diagnostic performance on brief, open-ended case simulations.
Design, setting, and participants: Retrospective cohort study of 11 023 unique attempts to solve case simulations on an online software platform, The Human Diagnosis Project (Human Dx). A total of 1738 practicing physicians, residents (internal medicine, family medicine, and emergency medicine), and medical students throughout the United States voluntarily used Human Dx software between January 21, 2016, and January 15, 2017.
Main outcomes and measures: Internal structure validity was assessed by 3 measures of diagnostic performance: accuracy, efficiency, and a combined score (Diagnostic Acumen Precision Performance [DAPP]). These were each analyzed by level of training. Validity evidence based on association with other variables was evaluated by correlating diagnostic performance with affiliation with an institution ranked in the top 25 medical schools by US News and World Report.
Results: Data were analyzed for 239 attending physicians, 926 resident physicians, 347 intern physicians, and 226 medical students. Attending physicians had higher mean accuracy scores than medical students (difference, 8.1; 95% CI, 4.2-12.0; P < .001), as did residents (difference, 8.0; 95% CI, 4.8-11.2; P < .001) and interns (difference, 5.9; 95% CI, 2.3-9.6; P < .001). Attending physicians had higher mean efficiency compared with residents (difference, 4.8; 95% CI, 1.8-7.8; P < .001), interns (difference, 5.0; 95% CI, 1.5-8.4; P = .001), and medical students (difference, 5.4; 95% CI, 1.4-9.3; P = .003). Attending physicians also had significantly higher mean DAPP scores than residents (difference, 2.6; 95% CI, 0.0-5.2; P = .05), interns (difference, 3.6; 95% CI, 0.6-6.6; P = .01), and medical students (difference, 6.7; 95% CI, 3.3-10.2; P < .001). Attending physicians affiliated with an institution ranked in the top 25 medical schools by US News and World Report had higher mean DAPP scores compared with nonaffiliated attending physicians (80 [95% CI, 77-83] vs 72 [95% CI, 70-74], respectively; P < .001). Resident physicians affiliated with a top 25-ranked institution also had higher mean DAPP scores compared with nonaffiliated peers (75 [95% CI, 73-77] vs 71 [95% CI, 69-72], respectively; P < .001).
Conclusions and relevance: The data suggest that diagnostic performance is higher in those with more training and that DAPP scores may be a valid measure to appraise diagnostic performance. This diagnostic assessment tool allows individuals to receive immediate feedback on performance through an openly accessible online platform.