Background: Predicting cognitive performance in Alzheimer's disease (AD) and identifying informative neuroimaging markers are essential for timely treatment and possible delay of the disease. However, incompletely labeled samples and noise in neuroimaging data pose challenges to building reliable and robust prediction models. In this paper, we present a model named Low-rank Sparse Feature Selection with Incomplete Labels (LSFSIL) for predicting cognitive performance and identifying informative neuroimaging markers from MRI data and incomplete cognitive scores.
Method: We propose a sparse matrix decomposition that splits the incomplete cognitive score matrix into two parts, which allows missing scores to be recovered and incompletely labeled data to be used. The first part is the recovered cognitive score matrix without missing values; to keep the recovered scores close to the real ones, a manifold regularizer is devised to fit the label distribution and capture label correlations locally. The second part is an ℓ1-norm regularized matrix that represents the associated errors. Next, a low-rank regression model that takes the recovered matrix as its target is developed to increase robustness to noise and outliers. In addition, an ℓ2,1-norm is introduced into the objective function as a sparse regularizer to identify the important features.
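As an illustration of the two sparsity terms named above, the following minimal NumPy sketch shows the standard definitions of the ℓ2,1-norm and the ℓ1 soft-thresholding (proximal) step. It is not the authors' implementation: the function names, the convention that rows of the weight matrix W correspond to features and columns to cognitive scores, and the example values are assumptions made for illustration only.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W.

    Assumption: each row of W holds the weights of one feature
    across all cognitive scores, so a zero row means the feature
    is discarded for every score.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W):
    """Order feature indices by row norm, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))


def soft_threshold(R, tau):
    """Proximal operator of tau * ||.||_1.

    Shrinks every entry of R toward zero by tau and sets entries
    with magnitude below tau to zero, which is the usual way an
    l1-regularized error term is kept sparse.
    """
    return np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)


# Hypothetical 3-feature, 2-score weight matrix.
W_example = np.array([[3.0, 4.0],
                      [0.0, 0.0],
                      [1.0, 0.0]])
print(l21_norm(W_example))        # row norms are 5, 0 and 1
print(rank_features(W_example))   # features ordered by importance
print(soft_threshold(np.array([2.0, -0.5, -3.0]), 1.0))
```

Under this convention, penalizing the ℓ2,1-norm drives entire rows of W to zero, so the surviving rows indicate the features selected jointly for all scores, while the ℓ1 penalty acts entry by entry on the error matrix.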
Results: Experimental results demonstrate that LSFSIL outperforms several state-of-the-art feature selection approaches. Moreover, the neuroimaging markers selected by LSFSIL are consistent with previous AD studies.
Conclusions: LSFSIL is effective at identifying informative neuroimaging markers for cognitive performance prediction with incompletely labeled data.
Keywords: Alzheimer's disease; Disease progression; Feature selection; Incomplete labeled data.
Copyright © 2022 Elsevier Ltd. All rights reserved.