Comparing the stability and reproducibility of brain-behavior relationships found using canonical correlation analysis and partial least squares within the ABCD sample

Netw Neurosci. 2024 Jul 1;8(2):576-596. doi: 10.1162/netn_a_00363. eCollection 2024.

Abstract

Canonical correlation analysis (CCA) and partial least squares correlation (PLS) detect linear associations between two data matrices by computing latent variables (LVs) having maximal correlation (CCA) or covariance (PLS). This study compared the similarity and generalizability of CCA- and PLS-derived brain-behavior relationships. Data were accessed from the baseline Adolescent Brain Cognitive Development (ABCD) dataset (N > 9,000, ages 9-11 years). The brain matrix consisted of cortical thickness estimates from the Desikan-Killiany atlas. Two phenotypic scales were examined separately as the behavioral matrix: the Child Behavior Checklist (CBCL) subscale scores and the NIH Toolbox performance scores. Resampling methods were used to assess the significance and generalizability of LVs. LV1 for the CBCL-brain relationships was found to be significant, yet not consistently stable or reproducible, across CCA and PLS models (singular value: CCA = .13, PLS = .39, p < .001). LV1 for the NIH-brain relationships was similar between CCA and PLS and was found to be stable and reproducible (singular value: CCA = .21, PLS = .43, p < .001). The current study suggests that, in a large population-based pediatric sample, the stability and reproducibility of brain-behavior relationships identified by CCA and PLS are influenced by the statistical characteristics of the phenotypic measure used.
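
The distinction drawn in the abstract, maximal correlation for CCA versus maximal covariance for PLS, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names, the simulated data, the z-scoring step, and the choice of 200 permutations are all assumptions made for illustration. PLS is sketched as an SVD of the cross-correlation matrix between the two z-scored matrices, CCA as an SVD after orthonormalizing each matrix with a QR decomposition, and significance as a simple row-permutation test on the first singular value.

```python
import numpy as np


def zscore(M):
    """Center each column and scale it to unit sample variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)


def pls_singular_values(X, Y):
    """PLS correlation: SVD of the cross-correlation matrix of X and Y."""
    Xz, Yz = zscore(X), zscore(Y)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    return np.linalg.svd(R, compute_uv=False)


def cca_canonical_correlations(X, Y):
    """CCA: singular values of Qx.T @ Qy, where Qx and Qy are orthonormal
    bases of the centered column spaces (assumes full column rank)."""
    Qx, _ = np.linalg.qr(zscore(X))
    Qy, _ = np.linalg.qr(zscore(Y))
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)  # guard against floating-point overshoot


def permutation_p_value(X, Y, stat_fn, n_perm=200, seed=0):
    """Permutation test on the first singular value: shuffling the rows of Y
    breaks the subject-wise pairing between the two matrices."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(X, Y)[0]
    null = np.array(
        [stat_fn(X, Y[rng.permutation(len(Y))])[0] for _ in range(n_perm)]
    )
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


def simulate(n=500, p_brain=6, p_behavior=4, seed=0):
    """Hypothetical data: one shared latent factor plus independent noise."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 1))
    X = 0.8 * z + rng.standard_normal((n, p_brain))
    Y = 0.8 * z + rng.standard_normal((n, p_behavior))
    return X, Y


X, Y = simulate()
print("PLS singular values:   ", pls_singular_values(X, Y))
print("CCA canonical corrs:   ", cca_canonical_correlations(X, Y))
print("PLS LV1 permutation p: ", permutation_p_value(X, Y, pls_singular_values))
```

Because CCA whitens each matrix before the decomposition, its singular values are bounded by 1 and depend on the within-set correlation structure, whereas PLS singular values scale with the number and intercorrelation of the variables. This is one reason the two methods can report different singular values for the same data.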

Keywords: Brain-behavior relationships; Cortical thickness; Multivariate modeling; Population-based samples.

Plain language summary

Clinical neuroscience research is going through a translational crisis, largely due to the challenges of producing meaningful and generalizable results. Two critical limitations within clinical neuroscience research are the use of univariate statistics and between-study methodological variation. Univariate statistics may not be sensitive enough to detect complex relationships between several variables, and methodological variation poses challenges to the generalizability of the results. We compared two widely used multivariate statistical approaches, canonical correlation analysis (CCA) and partial least squares correlation (PLS), to determine the generalizability and stability of their solutions. We show that the properties of the measures entered into the analysis likely play a more substantial role in the generalizability and stability of results than the specific approach applied (i.e., CCA or PLS).