The current research explored the extent to which variations in self-report measures across studies can produce differences in mixture model results. Data (N = 854) come from a laboratory analogue study of methods for creating commensurate scores of alcohol- and substance-use-related constructs when the items for a given measure differ systematically across participants. Items were manipulated according to four conditions, corresponding to increasing levels of alteration to item stems, response options, or both. In Study 1, results from latent class analyses (LCA) of alcohol consequences were compared across the four conditions, revealing differences in class enumeration and configuration. In Study 2, results from factor mixture models (FMM) of alcohol expectancies were compared across two of the conditions, revealing differences in the pattern and magnitude of factor loadings and thresholds. The results suggest that even subtle differences in measurement can have substantively meaningful effects on mixture model results.