Item response theory-based continuous test norming

Hannah M Heister; Casper J Albers; Marie Wiberg; Marieke E Timmerman

doi:10.1037/met0000686


Psychol Methods. 2024 Oct 14. doi: 10.1037/met0000686. Online ahead of print.

Authors

Hannah M Heister¹, Casper J Albers¹, Marie Wiberg², Marieke E Timmerman¹

Affiliations

¹ Department of Psychometrics and Statistics, University of Groningen.
² Department of Statistics, USBE, Umeå University.

PMID: 39404820
DOI: 10.1037/met0000686

Abstract

In norm-referenced psychological testing, an individual's performance is expressed in relation to a reference population using a standardized score, such as an intelligence quotient score. The reference population can depend on a continuous variable, such as age. Current continuous norming methods transform the raw score into an age-dependent standardized score. Such methods have the shortcoming of relying solely on the raw test scores, ignoring valuable information from individual item responses. Instead of modeling the raw test scores, we propose modeling the item scores with a Bayesian two-parameter logistic (2PL) item response theory model with age-dependent mean and variance of the latent trait distribution, 2PL-norm for short. Norms are then derived using the estimated latent trait score and the age-dependent distribution parameters. Simulations show that 2PL-norms are overall more accurate than those from the most popular raw score-based norming methods, cNORM and generalized additive models for location, scale, and shape (GAMLSS). Furthermore, the credible intervals of 2PL-norm exhibit clearly superior coverage over the confidence intervals of the raw score-based methods. The only weakness of 2PL-norm is its slightly lower performance at the tails of the norms. Among the raw score-based norming methods, GAMLSS outperforms cNORM. For empirical practice, this suggests using 2PL-norm if the model assumptions hold. If they do not, or if the interest is solely in the point estimates of the extreme trait positions, GAMLSS-based norming is a better alternative. The use of 2PL-norm is illustrated and compared with GAMLSS and cNORM using empirical data, and code is provided so that users can readily apply 2PL-norm to their normative data. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
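
The norming step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code, and it skips the Bayesian estimation entirely: it only shows the 2PL item response function and how an already-estimated latent trait score could be turned into an age-dependent standardized score. The function names, the linear form of the age-dependent mean, the log-linear form of the age-dependent standard deviation, their coefficients, the normality assumption behind the percentile, and the IQ-style scale (mean 100, SD 15) are all assumptions made for illustration.

```python
import math


def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response
    given latent trait theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def age_mean(age):
    # Hypothetical age-dependent mean of the latent trait (linear in age).
    return 0.1 * (age - 10.0)


def age_sd(age):
    # Hypothetical age-dependent SD of the latent trait (log-linear in age,
    # which keeps the SD positive).
    return math.exp(0.02 * (age - 10.0))


def normed_score(theta, age, scale_mean=100.0, scale_sd=15.0):
    """Standardize theta against the assumed trait distribution at this age,
    then map the z-score onto an IQ-style scale."""
    z = (theta - age_mean(age)) / age_sd(age)
    return scale_mean + scale_sd * z


def percentile(theta, age):
    """Percentile rank of theta at this age, assuming a normal trait
    distribution within each age (an assumption of this sketch)."""
    z = (theta - age_mean(age)) / age_sd(age)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under these assumed functions, a test taker aged 10 with an estimated trait score of 1.0 sits one standard deviation above the age-specific mean, so `normed_score(1.0, 10.0)` returns 115.0. The same trait score at an older age yields a lower normed score, because the assumed reference mean increases with age.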
