Refinement: Measuring informativeness of ratings in the absence of a gold standard

Br J Math Stat Psychol. 2022 Nov;75(3):593-615. doi: 10.1111/bmsp.12268. Epub 2022 Mar 16.

Abstract

We propose a new metric for evaluating the informativeness of a set of ratings from a single rater on a given scale. Such evaluations are of interest when raters rate numerous comparable items on the same scale, as occurs in hiring, college admissions, and peer review. Our exposition is set in the context of peer review, which involves both univariate and multivariate cardinal ratings. We draw on this context to motivate an information-theoretic measure of the refinement of a set of ratings, termed entropic refinement, as well as two secondary measures. A mathematical analysis of the three measures reveals that only the first, which captures the information content of the ratings, possesses properties appropriate to a refinement metric. Finally, we analyse refinement in real-world grant-review data, finding evidence that overall merit scores are more refined than criterion scores.

Keywords: decision-making; entropy; peer review; ratings.
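To make the core idea concrete, the following is a minimal sketch of an entropy-based informativeness measure for a single rater's ratings. It assumes the measure is related to the Shannon entropy of the rater's empirical distribution over the points of a discrete rating scale; the function name and definition here are illustrative only and are not the authors' exact formulation of entropic refinement.

```python
import numpy as np

def rating_entropy(ratings, scale):
    """Shannon entropy (in bits) of the empirical distribution of a
    rater's ratings over the points of a discrete rating scale.

    Illustrative proxy only; the paper's 'entropic refinement' measure
    is not fully specified in the abstract and may differ.
    """
    ratings = np.asarray(ratings)
    # Empirical probability of each scale point among this rater's ratings.
    counts = np.array([(ratings == s).sum() for s in scale], dtype=float)
    probs = counts / counts.sum()
    nonzero = probs[probs > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())

# A rater who spreads ratings across the scale conveys more information
# (higher entropy) than one who clusters on a single value (zero entropy).
scale = [1, 2, 3, 4, 5]
print(rating_entropy([1, 2, 3, 4, 5, 3, 2], scale))  # > 2 bits
print(rating_entropy([3, 3, 3, 3, 3, 3, 3], scale))  # 0.0 bits
```

On this reading, a rater whose scores collapse onto one or two scale points provides little information for distinguishing items, which is the intuition the abstract's comparison of overall merit scores and criterion scores appeals to.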