Reproducibility of neuroimaging analyses across operating systems

Tristan Glatard; Lindsay B Lewis; Rafael Ferreira da Silva; Reza Adalat; Natacha Beck; Claude Lepage; Pierre Rioux; Marc-Etienne Rousseau; Tarek Sherif; Ewa Deelman; Najmeh Khalili-Mahani; Alan C Evans

doi:10.3389/fninf.2015.00012

Front Neuroinform. 2015 Apr 24:9:12. doi: 10.3389/fninf.2015.00012. eCollection 2015.

Affiliations

¹ McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, QC, Canada; Centre National de la Recherche Scientifique, University of Lyon, INSERM, CREATIS, Villeurbanne, France.
² McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, QC, Canada.
³ Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA.

Abstract

Neuroimaging pipelines are known to generate different results depending on the computing platform where they are compiled and executed. We quantify these differences for brain tissue classification, fMRI analysis, and cortical thickness (CT) extraction, using three of the main neuroimaging packages (FSL, Freesurfer and CIVET) and different versions of GNU/Linux. We also identify some causes of these differences using library and system call interception. We find that these packages use mathematical functions based on single-precision floating-point arithmetic whose implementations in operating systems continue to evolve. While these differences have little or no impact on simple analysis pipelines such as brain extraction and cortical tissue classification, their accumulation creates important differences in longer pipelines such as subcortical tissue classification, fMRI analysis, and cortical thickness extraction. With FSL, most Dice coefficients between subcortical classifications obtained on different operating systems remain above 0.9, but values as low as 0.59 are observed. Independent component analyses (ICA) of fMRI data differ between operating systems in one third of the tested subjects, due to differences in motion correction. With Freesurfer and CIVET, in some brain regions we find an effect of build or operating system on cortical thickness. A first step to correct these reproducibility issues would be to use more precise representations of floating-point numbers in the critical sections of the pipelines. The numerical stability of pipelines should also be reviewed.
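
The Dice coefficient used above to compare subcortical classifications measures the overlap of two segmentations as 2|A ∩ B| / (|A| + |B|), where A and B are the sets of voxels assigned a given label in each run. The following is a minimal sketch of that computation, not the authors' code: the function name `dice_coefficient`, the flat-list voxel representation, the toy label arrays, and the convention of returning 1.0 when the label is absent from both runs are all assumptions made for illustration.

```python
def dice_coefficient(seg_a, seg_b, label=1):
    """Dice overlap of one label between two equally sized segmentations.

    seg_a and seg_b are flat sequences of integer voxel labels
    (a hypothetical representation chosen for this sketch).
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")

    # Count voxels carrying the label in each run, and in both runs at once.
    in_a = sum(1 for v in seg_a if v == label)
    in_b = sum(1 for v in seg_b if v == label)
    in_both = sum(1 for a, b in zip(seg_a, seg_b) if a == label and b == label)

    # Assumed convention: two runs that both lack the label agree perfectly.
    if in_a + in_b == 0:
        return 1.0
    return 2.0 * in_both / (in_a + in_b)


# Toy labels standing in for the same structure segmented on two systems.
run_os_1 = [0, 1, 1, 1, 0, 0, 1, 0]
run_os_2 = [0, 1, 1, 0, 0, 1, 1, 0]

score = dice_coefficient(run_os_1, run_os_2)
print(f"Dice = {score:.2f}")  # prints "Dice = 0.75"
```

In this toy case each run labels four voxels and three of them coincide, giving 2 × 3 / (4 + 4) = 0.75. A value of 1 means the two runs label exactly the same voxels, and 0 means they share none.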

Keywords: CIVET; FSL; Freesurfer; operating systems; reproducibility.

Grants and funding

U01 AG024904/AG/NIA NIH HHS/United States