The discrepancy in the literature concerning the sensitivity of delayed P300 latency in dementia revolves around the variability of P300 latency in normal elderly. Previous studies, which have produced conflicting results, have used different P300 paradigms and different methods for measuring P300 latency. We demonstrate that the reported differences in the variability of P300 latency in normal elderly are likely to have resulted from these differences in paradigm and measurement method. We gathered auditory P300 data from a sample of 50 normal elderly using both a 3-tone RT paradigm and a 2-tone oddball paradigm, and measured P300 latency using both peak-picking on the average EP and a single-trial template-fitting procedure. P300 latency variability was greater for the 3-tone paradigm than for the oddball paradigm, and greater for the template-fitting method than for peak-picking. The increased P300 latency variability with the template-fitting method resulted, in part, from the template-fitting procedure focusing on random noise when the signal-to-noise ratio was low. The increased variability of P300 latency in the 3-tone paradigm resulted from 13 subjects who had a very long-latency positive component (often in addition to an earlier, smaller component) in this paradigm that met the scoring criteria for P300 but was not seen in the oddball paradigm. We speculate that for the 3-tone task, some subjects may adopt a strategy involving 2 stages of stimulus processing, which results in the generation of 2 positive components, obscuring group differences in P300 latency. These results raise questions concerning the use, in clinical studies designed to yield diagnostic measures, of complicated paradigms in which subjects might adopt varying processing strategies.
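The two latency measures compared above can be illustrated with a minimal sketch. It is not the procedure used in the study: the function names, the 250-800 ms search window, the 4 ms sampling interval, and the half-sine template are all assumptions chosen for illustration. Peak-picking takes the latency of the largest positive value of the averaged EP within the window; template-fitting slides a template across a single trial and takes the latency at which the cross-product with the trial is largest.

```python
import math

def peak_pick_latency(average, times_ms, window_ms=(250, 800)):
    """Latency (ms) of the largest positive value of the averaged EP in the window.

    The window limits are illustrative assumptions, not values from the study.
    """
    lo, hi = window_ms
    candidates = [(amp, t) for amp, t in zip(average, times_ms) if lo <= t <= hi]
    return max(candidates)[1]

def template_fit_latency(trial, template, times_ms, window_ms=(250, 800)):
    """Latency (ms) at which the template, centred on that sample, best matches the trial.

    The match is scored with an unnormalised cross-product, one simple choice
    among several possible fit criteria.
    """
    lo, hi = window_ms
    half = len(template) // 2
    best_t, best_score = None, -math.inf
    for i, t in enumerate(times_ms):
        start = i - half
        # Skip positions outside the window or where the template would overrun the trial.
        if not (lo <= t <= hi) or start < 0 or start + len(template) > len(trial):
            continue
        segment = trial[start:start + len(template)]
        score = sum(x * w for x, w in zip(segment, template))
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Synthetic, noise-free example: a half-sine "P300" centred at 400 ms.
times_ms = list(range(0, 1000, 4))                           # 250 samples, 4 ms apart
template = [math.sin(math.pi * k / 50) for k in range(51)]   # 200 ms half-sine (hypothetical)
trial = [0.0] * len(times_ms)
for k, w in enumerate(template):
    trial[75 + k] = w                                        # centre falls on sample 100 (400 ms)

print(peak_pick_latency(trial, times_ms))                    # 400
print(template_fit_latency(trial, template, times_ms))       # 400
```

With a clean signal both measures agree. When noise is added to single trials and the signal-to-noise ratio is low, the best-scoring template position can be driven by noise rather than by the component of interest, which is the failure mode described above for template-fitting.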