Advances in brain imaging and high-throughput genetic technologies give researchers access to large amounts of multi-modal data. Although sparse canonical correlation analysis is a powerful bi-multivariate association technique for feature selection, major challenges remain in integrating multi-modal imaging genetic data and in yielding biologically meaningful interpretations of imaging genetic findings. In this study, we propose a novel multi-task learning based structured sparse canonical correlation analysis (MTS2CCA) to deliver interpretable results and improve integration in imaging genetics studies. We compare it with state-of-the-art competing methods on both simulated and real imaging genetic data. On the simulated data, our proposed model achieved the best performance in terms of canonical correlation coefficients, estimation accuracy, and feature selection accuracy. On the real imaging genetic data, our proposed model revealed promising single-nucleotide polymorphisms and brain regions related to sleep. The identified features can serve as promising imaging genetic biomarkers for improving clinical score prediction. An interesting future direction is to apply our model to additional neurological or psychiatric cohorts, such as patients with Alzheimer's or Parkinson's disease, to demonstrate the generalizability of our method.
Keywords: Brain imaging genetics; Multi-task learning; Outcome prediction; Sparse canonical correlation analysis.
Copyright © 2021 Elsevier B.V. All rights reserved.