Previous studies of rater performance and interrater reliability have used passive scoring tasks, such as rating patients from a videotaped interview. Little is known, however, about how well raters conduct assessments of real patients or how reliably they apply scoring criteria during actual assessment sessions. With growing recognition of the importance of monitoring and reviewing actual evaluation sessions, there is a need for a systematic approach to quantifying raters' applied performance. The Rater Applied Performance Scale (RAPS) measures six dimensions of rater performance (adherence, follow-up, clarification, neutrality, rapport, and accuracy) based on review of audiotaped or videotaped assessment sessions or on live monitoring of assessment sessions. We tested this new scale by having two reviewers rate 20 Hamilton Depression Scale rating sessions drawn from a multi-site depression trial. The RAPS showed good internal consistency, and interrater (i.e., inter-reviewer) reliability was satisfactory for the RAPS total score. In addition, RAPS ratings correlated with quantitative measures of scoring accuracy based on independent expert ratings. These preliminary psychometric data suggest that the RAPS may be a valuable tool for quantifying the performance of clinical raters. Potential applications of the RAPS are considered.