Models of word reading that simultaneously account for item-level and person-level fixed and random effects are broadly known as explanatory item response models (EIRMs). Although many variants of the EIRM are available, the field has generally focused on the doubly explanatory model for modeling individual differences in item responses. Moreover, the EIRM has historically been applied as a Rasch model, in which the item discrimination values are fixed at 1.0 and the random or fixed item effects pertain only to the item difficulties. The statistical literature has since advanced to allow for more robust testing of observed or latent outcomes, as well as more flexible parameterizations of the EIRM. The purpose of the present study was to compare four types of Rasch-based EIRMs (i.e., doubly descriptive, person explanatory, item explanatory, doubly explanatory) and, more broadly, to compare Rasch and 2PL EIRMs when including person-level and item-level predictors. Results showed not only that the error variance was smaller in the unconditional 2PL EIRM than in the Rasch EIRM, owing to the inclusion of the item discrimination random effect, but also that the patterns of unique item-level explanatory variables differed between the two approaches. Results are interpreted in terms of what each statistical model affords for describing and explaining differences in word-level performance.
Keywords: 2PL IRT; Crossed random effects; Explanatory item response model; Reading.