Background: A challenge for problem-based learning (PBL) schools is to introduce reliable, valid, and cost-effective testing methods into the curriculum in a way that maximizes the potential benefits of PBL while avoiding the problems associated with assessment techniques such as multiple-choice question (MCQ) tests.
Purpose: We document the continued development of an exam designed to satisfy both the demands of PBL and the scientific principles of measurement.
Methods: A total of 102 medical students wrote a clinical reasoning exercise (CRE) as a requirement for two consecutive units of instruction. Each CRE consisted of a series of 18 short clinical problems designed to assess a student's knowledge of the mechanisms of diseases covered in the three subunits of each unit. Responses were scored by the student's own tutor and by a second, crossover tutor.
Results: Generalizability coefficients for raters, subunits, and individual problems were low, but the reliability of the overall test scores and the reliability of the scores across the two units of instruction were high. Subsequent analyses found that the crossover tutors' ratings were lower than those provided by the students' own tutors, and that CRE scores correlated with the biology component of a progress test.
Conclusion: The magnitude of the generalizability coefficients demonstrates that the CRE can detect differences in reasoning across knowledge domains and is therefore a useful evaluation tool.