Objective: Analyses of individual differences in change may be unintentionally biased when versions of a neuropsychological test used at different follow-ups are not of equivalent difficulty. This study's objective was to compare mean, linear, and equipercentile equating methods and demonstrate their utility in longitudinal research.
Study design and setting: The Advanced Cognitive Training for Independent and Vital Elderly (ACTIVE, N = 1,401) study is a longitudinal randomized trial of cognitive training. The Alzheimer's Disease Neuroimaging Initiative (ADNI, n = 819) is an observational cohort study. Nonequivalent alternate versions of the Auditory Verbal Learning Test (AVLT) were administered in both studies.
Results: In visual displays, raw and mean-equated AVLT scores in both studies showed obvious nonlinear trajectories, and poor equivalence over time (ps ≤ .001), in reference groups expected to show minimal change. Raw scores also fit poorly in models of within-person change (root mean square errors of approximation, RMSEAs > 0.12). Linear and equipercentile equating produced more similar means in reference groups (ps ≥ .09) and performed better in growth models (RMSEAs < 0.05).
Conclusion: Equipercentile equating is the preferred equating method because it accommodates test forms that are more difficult than a reference form to differing degrees at different percentiles of performance, and it performs well in models of within-person trajectory. The method has broad applications in both clinical and research settings because it makes nonequivalent test forms easier to use.
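Illustration: The three equating methods compared above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the midpoint percentile-rank rule, the linear interpolation between reference scores, and the two small score lists are all hypothetical choices made for illustration. Operational equipercentile equating typically adds smoothing steps that are omitted here.

```python
from bisect import bisect_left, bisect_right
from statistics import mean, stdev


def mean_equate(scores, ref):
    """Shift scores so their mean matches the reference form's mean."""
    shift = mean(ref) - mean(scores)
    return [s + shift for s in scores]


def linear_equate(scores, ref):
    """Rescale scores so their mean and SD match the reference form."""
    slope = stdev(ref) / stdev(scores)
    m_s, m_r = mean(scores), mean(ref)
    return [m_r + slope * (s - m_s) for s in scores]


def _quantile(sorted_vals, p):
    """Value at proportion p (0 < p < 1), interpolating linearly."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def equipercentile_equate(scores, ref):
    """Map each score to the reference score at the same percentile rank."""
    s_sorted, r_sorted = sorted(scores), sorted(ref)
    n = len(s_sorted)
    equated = []
    for s in scores:
        # Midpoint percentile rank: half of the tied scores count as below.
        pr = (bisect_left(s_sorted, s) + bisect_right(s_sorted, s)) / (2 * n)
        equated.append(_quantile(r_sorted, pr))
    return equated


# Hypothetical scores, invented for illustration (not ACTIVE or ADNI data).
form_a = [4, 6, 7, 8, 9, 10, 11, 12, 13, 15]   # reference form
form_b = [2, 3, 5, 5, 6, 8, 9, 11, 12, 14]     # harder alternate form

mean_eq = mean_equate(form_b, form_a)
linear_eq = linear_equate(form_b, form_a)
equi_eq = equipercentile_equate(form_b, form_a)
```

Mean equating applies one constant shift, and linear equating applies one shift plus one rescaling, so both assume the difficulty gap between forms behaves the same way across the score range. The equipercentile mapping can differ at each percentile, which is the property the conclusion relies on.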