Predicting Alzheimer's disease CSF core biomarkers: a multimodal Machine Learning approach

Front Aging Neurosci. 2024 Jun 26:16:1369545. doi: 10.3389/fnagi.2024.1369545. eCollection 2024.

Abstract

Introduction: Alzheimer's disease (AD) is a progressive neurodegenerative disorder. Current core cerebrospinal fluid (CSF) AD biomarkers, widely employed for diagnosis, require a lumbar puncture, making them impractical as screening tools. Considering the role of sleep disturbances in AD, recent research suggests quantitative sleep electroencephalography features as potential non-invasive biomarkers of AD pathology. However, quantitative analysis of comprehensive polysomnography (PSG) signals remains relatively understudied. PSG is a non-invasive test enabling qualitative and quantitative analysis of a wide range of parameters, offering additional insights alongside other biomarkers. Machine Learning (ML) has gained interest for its ability to discern intricate patterns within complex datasets, offering promise in AD neuropathology detection. Therefore, this study aims to evaluate the effectiveness of a multimodal ML approach in predicting core AD CSF biomarkers.

Methods: Patients with mild-to-moderate AD were prospectively recruited for PSG, followed by testing of CSF and blood samples for biomarkers. PSG signals were preprocessed to extract quantitative non-linear, time-domain and frequency-domain statistical features. Multiple ML algorithms were trained using four subsets of input features: clinical variables (CLINVAR), conventional PSG parameters (SLEEPVAR), quantitative PSG signal features (PSGVAR) and a combination of all subsets (ALL). Cross-validation techniques were employed to evaluate model performance and ensure generalizability. Regression models were developed to determine the most effective variable combinations for explaining variance in the biomarkers.
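
As an illustration only, the evaluation scheme described in the Methods can be sketched in plain Python: group features into the four named subsets, then score a model on each subset with k-fold cross-validation. The abstract does not state the feature names, the number of folds, or the error metric, so the column names, `n_folds=5`, the mean absolute error, and the mean-predicting placeholder model below are all assumptions, not the authors' implementation.

```python
from statistics import mean

# Hypothetical feature names: the abstract names the subsets, not their contents.
FEATURE_SUBSETS = {
    "CLINVAR": ["age", "sex"],
    "SLEEPVAR": ["sleep_efficiency", "arousal_index"],
    "PSGVAR": ["eeg_delta_power", "eeg_sample_entropy"],
}
# ALL is the concatenation of the three subsets above.
FEATURE_SUBSETS["ALL"] = [
    name
    for subset in ("CLINVAR", "SLEEPVAR", "PSGVAR")
    for name in FEATURE_SUBSETS[subset]
]


def select_features(rows, subset):
    """Turn a list of per-subject dicts into feature vectors for one subset."""
    return [[row[name] for name in FEATURE_SUBSETS[subset]] for row in rows]


def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous, non-overlapping folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def fit_mean_baseline(X_train, y_train):
    """Placeholder model (assumption): always predicts the training mean."""
    train_mean = mean(y_train)
    return lambda x: train_mean


def cross_val_mae(fit, X, y, n_folds=5):
    """Average the per-fold mean absolute error on the held-out subjects."""
    fold_errors = []
    for train_idx, test_idx in kfold_indices(len(y), n_folds):
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_errors.append(mean(abs(y[i] - predict(X[i])) for i in test_idx))
    return mean(fold_errors)
```

In practice, `fit_mean_baseline` would be replaced by a real regressor, and `cross_val_mae` would be called once per subset (CLINVAR, SLEEPVAR, PSGVAR, ALL) and per biomarker so that held-out errors can be compared across subsets.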

Results: In 49 subjects, Gradient Boosting Regressors achieved the best results in estimating biomarker levels, using a different loss function for each biomarker: least absolute deviation (LAD) for Aβ42, least squares (LS) for p-tau and Huber for t-tau. The ALL subset yielded the lowest training errors for all three biomarkers, albeit with varying test performance. Specifically, the SLEEPVAR subset yielded the best test performance in predicting Aβ42, while the ALL subset predicted p-tau and t-tau most accurately, with the lowest test errors.
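
For readers unfamiliar with the three loss functions named in the Results, a minimal sketch of their standard per-sample definitions follows. The function names and the Huber threshold `delta=1.0` are assumptions chosen for illustration; the abstract does not report which library or hyperparameters were used. In scikit-learn's `GradientBoostingRegressor`, for example, the corresponding options are `loss="absolute_error"`, `loss="squared_error"` and `loss="huber"`.

```python
def squared_error_loss(y_true, y_pred):
    """Least squares (LS): penalizes large residuals quadratically."""
    residual = y_true - y_pred
    return 0.5 * residual * residual


def absolute_error_loss(y_true, y_pred):
    """Least absolute deviation (LAD): linear penalty, robust to outliers."""
    return abs(y_true - y_pred)


def huber_loss(y_true, y_pred, delta=1.0):
    """Huber: quadratic for small residuals, linear beyond `delta`.

    `delta` is an assumed threshold; it is not reported in the abstract.
    """
    residual = abs(y_true - y_pred)
    if residual <= delta:
        return 0.5 * residual * residual
    return delta * (residual - 0.5 * delta)
```

The Huber loss behaves like LS near zero and like LAD for large residuals, which makes it a compromise between sensitivity to small errors and robustness to outlying biomarker values.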

Conclusions: Multimodal ML can help predict the outcome of CSF biomarkers in early AD by utilizing non-invasive and economically feasible variables. The integration of computational models into medical practice offers a promising tool for the screening of patients at risk of AD, potentially guiding clinical decisions.

Keywords: Alzheimer's disease; CSF biomarkers; Machine Learning; biomechanism; diagnosis; neurodegeneration; quantitative polysomnographic signal analysis; therapeutic target.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was partially funded by Ministerio de Ciencia, Innovación y Universidades, Agencia Estatal de Investigación, under grant PID2019-109820RB-I00, MCIN/AEI/10.13039/501100011033/, co-financed by European Regional Development Fund (ERDF), “A way of making Europe” and Spanish Ministry of Economy and Competitiveness, Institute of Health Carlos III (grant number P114/00328); by Generalitat of Catalonia, Department of Health (PERIS 2019 SLT008/18/00050) and Fundació La Marató TV3 (464/C/2014) to GP-R. Co-financed by FEDER funds from the European Union (“A way to build Europe”). IRBLleida is a CERCA Programme/Generalitat of Catalonia.