High-dimensional neuroimaging datasets and machine learning have been used to estimate and predict domain-specific cognition, but comparisons with simpler models composed of easy-to-measure variables are limited. Regularization methods in particular may help identify regions-of-interest related to domain-specific cognition. Using data from the Northern Manhattan Study, a cohort study of mostly Hispanic older adults, we compared three models estimating domain-specific cognitive performance: sociodemographics and APOE ε4 allele status (basic model), the basic model plus MRI markers, and a model with only MRI markers. We used several machine learning methods to fit our regression models: elastic net, support vector regression, random forest, and principal components regression. Model performance was assessed with the RMSE, MAE, and R² statistics using 5-fold cross-validation. To assess whether prediction models with imaging biomarkers were more predictive than prediction models built with randomly generated biomarkers, we refit the elastic net models using 1000 datasets with random biomarkers and compared the distributions of the RMSE and R² from these random-biomarker models with the RMSE and R² from the observed models. Basic models explained approximately 31-38% of the variance in domain-specific cognition. Addition of MRI markers did not improve estimation. However, elastic net models with only MRI markers performed significantly better than models built with random MRI markers (one-sided P < .05) and yielded regions-of-interest consistent with previous literature as well as others not previously explored. Therefore, structural brain MRI markers may be more useful for etiological than predictive modeling.
Keywords: Biomarkers; Brain aging; Cognitive aging; Machine learning.
© 2020. Springer Science+Business Media, LLC, part of Springer Nature.
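The comparison against randomly generated biomarkers described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several choices are assumptions made only for illustration: ridge regression with a closed-form solution stands in for the elastic net to keep the example dependency-light, random biomarkers are drawn from a standard normal distribution, the data are synthetic, and the function names `cv_metrics` and `random_marker_test` are hypothetical.

```python
import numpy as np


def cv_metrics(X, y, n_folds=5, penalty=1.0, seed=0):
    """Cross-validated RMSE, MAE, and R^2 for a ridge regression.

    Ridge is an assumed stand-in for the elastic net used in the study.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Standardize with training-fold statistics only, to avoid leakage.
        mu = X[train_idx].mean(axis=0)
        sd = X[train_idx].std(axis=0)
        sd[sd == 0] = 1.0
        X_train = (X[train_idx] - mu) / sd
        X_test = (X[test_idx] - mu) / sd
        y_mean = y[train_idx].mean()
        # Closed-form ridge solution on the centered outcome.
        gram = X_train.T @ X_train + penalty * np.eye(X.shape[1])
        beta = np.linalg.solve(gram, X_train.T @ (y[train_idx] - y_mean))
        preds[test_idx] = y_mean + X_test @ beta
    resid = y - preds
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    mae = float(np.mean(np.abs(resid)))
    r2 = float(1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2))
    return rmse, mae, r2


def random_marker_test(X, y, n_random=1000, seed=0):
    """One-sided test: do observed markers beat random markers on CV RMSE?"""
    rng = np.random.default_rng(seed)
    obs_rmse, obs_mae, obs_r2 = cv_metrics(X, y)
    null_rmse = np.empty(n_random)
    for i in range(n_random):
        # Assumption: random biomarkers are independent standard normal draws.
        X_random = rng.standard_normal(X.shape)
        null_rmse[i] = cv_metrics(X_random, y)[0]
    # Share of random-marker models that do at least as well as the observed
    # model, with a +1 correction so the p-value is never exactly zero.
    p_value = float((1 + np.sum(null_rmse <= obs_rmse)) / (1 + n_random))
    return obs_rmse, obs_mae, obs_r2, p_value


# Synthetic illustration only: 200 participants, 10 hypothetical markers,
# of which the first three carry signal.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.standard_normal((200, 10))
y_demo = X_demo[:, :3] @ np.array([1.0, -0.5, 0.8]) + 0.5 * demo_rng.standard_normal(200)
rmse, mae, r2, p_value = random_marker_test(X_demo, y_demo, n_random=200)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}  one-sided p={p_value:.4f}")
```

In this sketch the null distribution is built from the cross-validated RMSE alone; the same loop could equally collect R², as the abstract describes for both statistics.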