The Rater Applied Performance Scale: development and reliability

Psychiatry Res. 2004 Jun 30;127(1-2):147-55. doi: 10.1016/j.psychres.2004.03.001.

Abstract

Previous studies of rater performance and interrater reliability have used passive scoring tasks such as rating patients from a videotaped interview. Little is known, however, about how well raters conduct assessments with real patients or how reliably they apply scoring criteria during actual assessment sessions. With growing recognition of the importance of monitoring and reviewing actual evaluation sessions, there is a need for a systematic approach to quantifying raters' applied performance. The Rater Applied Performance Scale (RAPS) measures six dimensions of rater performance (adherence, follow-up, clarification, neutrality, rapport, and accuracy) based on review of audiotaped or videotaped assessment sessions or on live monitoring of such sessions. We tested this new scale by having two reviewers rate 20 Hamilton Depression Scale rating sessions drawn from a multi-site depression trial. We found good internal consistency for the RAPS. Interrater (i.e., inter-reviewer) reliability was satisfactory for RAPS total score ratings. In addition, RAPS ratings correlated with quantitative measures of scoring accuracy based on independent expert ratings. Preliminary psychometric data suggest that the RAPS may be a valuable tool for quantifying the performance of clinical raters. Potential applications of the RAPS are considered.
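
The abstract does not specify which reliability coefficients were used, so the following Python sketch is illustrative only: it shows one common way to compute internal consistency across the six RAPS dimensions (Cronbach's alpha) and inter-reviewer reliability of total scores (a two-way random-effects ICC(2,1)) for a design like the one described (20 sessions, two reviewers). The simulated data, scoring range, and choice of coefficients are assumptions for illustration, not details taken from the study.

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for an (n_subjects x n_items) score matrix."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    def icc_2_1(ratings):
        """Two-way random-effects, absolute-agreement ICC(2,1)
        for an (n_subjects x n_raters) matrix (Shrout & Fleiss)."""
        x = np.asarray(ratings, dtype=float)
        n, k = x.shape
        grand = x.mean()
        row_means = x.mean(axis=1)
        col_means = x.mean(axis=0)
        # Mean squares from a two-way ANOVA without replication
        ms_rows = k * ((row_means - grand) ** 2).sum() / (n - 1)
        ms_cols = n * ((col_means - grand) ** 2).sum() / (k - 1)
        ss_err = ((x - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
        ms_err = ss_err / ((n - 1) * (k - 1))
        return (ms_rows - ms_err) / (
            ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
        )

    # Hypothetical example: 20 sessions scored on 6 RAPS dimensions,
    # plus total scores from two reviewers for the same sessions.
    rng = np.random.default_rng(0)
    dimension_scores = rng.integers(1, 6, size=(20, 6))          # illustrative data only
    reviewer_totals = np.column_stack([
        dimension_scores.sum(axis=1),
        dimension_scores.sum(axis=1) + rng.integers(-2, 3, 20),  # second reviewer, with noise
    ])

    print("Cronbach's alpha:", round(cronbach_alpha(dimension_scores), 2))
    print("ICC(2,1) between reviewers:", round(icc_2_1(reviewer_totals), 2))

With real RAPS data, dimension_scores would hold one reviewer's six dimension ratings per session and reviewer_totals the two reviewers' total scores; the printed coefficients then correspond to the internal-consistency and inter-reviewer reliability figures reported in the abstract.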

Publication types

  • Clinical Trial
  • Multicenter Study
  • Randomized Controlled Trial
  • Research Support, U.S. Gov't, P.H.S.
  • Validation Study

MeSH terms

  • Depressive Disorder, Major / diagnosis*
  • Depressive Disorder, Major / epidemiology*
  • Follow-Up Studies
  • Humans
  • Observer Variation
  • Psychometrics / methods
  • Reproducibility of Results
  • Surveys and Questionnaires*
  • Videotape Recording