With expanding potential clinical applications of functional magnetic resonance imaging (fMRI), it is important to test how reliable different measures of fMRI activation are between subjects, between sessions and between centres. This study compared variability across 17 patients with multiple sclerosis (MS) and 22 age-matched healthy controls (HC) who performed a hand-tapping task in a block-design fMRI experiment at five European centres. We recruited subjects from sites using 1.5 T scanners from different manufacturers. Five healthy volunteers were also studied at each of four of the centres. We found that reproducibility between runs and sessions for single individuals was consistently much greater than reproducibility between individuals. Run-to-run variability was greater for MS patients than for HC. Measurements of maximum signal change (MSC) appeared to provide higher reproducibility within individuals and greater sensitivity to differences between individuals than region of interest (ROI) suprathreshold voxel counts. The variability in measurements between centres was not as great as that between individuals. Consistent with these observations, we estimated that power should not be reduced substantially with the use of multi-centre, as opposed to single-centre, study designs with similar numbers of subjects. Multi-centre interventional studies in which fMRI is used as an outcome measure thus appear practical even when implemented in conventional clinical environments.
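The power argument can be illustrated with a rough sketch. It is not the estimation procedure used in the study: it assumes a two-sided, two-sample z-test (normal approximation) and treats between-centre variability as an unmodelled extra variance component added to the between-subject variance. The function name `two_group_power` and every numerical value below are hypothetical and chosen only for illustration; none are taken from the study's data.

```python
from math import sqrt
from statistics import NormalDist


def two_group_power(delta, sd_subject, sd_centre, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumption (ours, not the study's): centre effects are not modelled and
    simply inflate the per-subject variance, so that
    var_total = sd_subject**2 + sd_centre**2.
    """
    norm = NormalDist()
    sd_total = sqrt(sd_subject ** 2 + sd_centre ** 2)
    # Standard error of the difference between two group means.
    se = sd_total * sqrt(2.0 / n_per_group)
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    # The small probability of rejecting in the wrong direction is ignored.
    return norm.cdf(delta / se - z_crit)


# Hypothetical inputs: group difference, SDs and sample size are illustrative.
single_centre = two_group_power(delta=0.5, sd_subject=0.6, sd_centre=0.0, n_per_group=20)
multi_centre = two_group_power(delta=0.5, sd_subject=0.6, sd_centre=0.2, n_per_group=20)

print(f"single-centre power: {single_centre:.2f}")
print(f"multi-centre power:  {multi_centre:.2f}")
```

Under these assumptions, the loss of power depends on the ratio of between-centre to between-subject variance: when the centre component is small relative to the subject component, the total standard deviation, and hence the power, changes only modestly.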