The purpose of this research was to examine the validity of the 55-item Revised Token Test (RTT) and to compare traditional and Rasch-based scores in their ability to detect group differences and change over time. The 55-item RTT was administered to 108 left- and right-hemisphere stroke survivors, and the data were submitted to Rasch analysis. Traditional and Rasch-based scores for a subsample of 60 stroke survivors were submitted to analyses of variance with group (left hemisphere with aphasia vs. right hemisphere) and time post onset (3 vs. 6 months post onset) as factors. The 2 scoring methods were compared using an index of relative precision. Forty-eight items demonstrated acceptable model fit. Misfitting items came primarily from Subtest IX. The Rasch model accounted for 71% of the variance in the responses to the remaining items. Intersubtest patterns of item difficulty were well predicted by item content, but unexpected within-subtest differences were found. Both traditional and Rasch person scores demonstrated significant group differences, but only the latter demonstrated statistically significant change over time. Analysis of relative precision, however, failed to confirm a significant difference between the 2 methods. The findings generally support the RTT's validity, but a minority of items appears to respond to a different construct. Also, within-subtest differences in item difficulty suggest the need for further examination of variability in impaired language performance. Finally, the results suggest an equivocal advantage for Rasch scores in detecting change over time.
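The comparison of scoring methods by relative precision can be illustrated with a minimal sketch. It assumes one common definition, the ratio of the F statistics that two scoring methods yield for the same effect; the abstract does not state the exact formula used, so this may differ from the study's computation. The function names `paired_f` and `relative_precision` are hypothetical, and `paired_f` covers only a simple two-time-point within-subject effect, not the full group-by-time design.

```python
from math import sqrt
from statistics import mean, stdev


def paired_f(before, after):
    """F statistic (1, n-1 df) for a two-level within-subject effect.

    With two time points this equals the squared paired t statistic.
    Assumes the difference scores are not all identical (nonzero SD).
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t


def relative_precision(f_candidate, f_reference):
    """Ratio of F statistics for the same effect under two scoring methods.

    ASSUMPTION: this ratio is one common definition of relative precision,
    not necessarily the one used in the study. Values above 1 favor the
    candidate method (e.g., Rasch scores over traditional scores).
    """
    if f_reference <= 0:
        raise ValueError("reference F statistic must be positive")
    return f_candidate / f_reference
```

Under this definition, a ratio near 1 would indicate that the two scoring methods detect the effect about equally well. A confidence interval around the ratio, which this sketch does not compute, would be needed to judge whether an apparent advantage is statistically reliable.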