High-Throughput CSF Proteomics and Machine Learning to Identify Proteomic Signatures for Parkinson Disease Development and Progression

Neurology. 2023 Oct 3;101(14):e1434-e1447. doi: 10.1212/WNL.0000000000207725. Epub 2023 Aug 16.

Abstract

Background and objectives: This study aimed to identify CSF proteomic signatures characteristic of Parkinson disease (PD) and evaluate their clinical utility.

Methods: This observational study used data from the Parkinson's Progression Markers Initiative (PPMI), which enrolled patients with PD, healthy controls (HCs), and non-PD participants carrying GBA1, LRRK2, and/or SNCA pathogenic variants (genetic prodromals) at international sites. Study participants were chosen from PPMI enrollees based on the availability of aptamer-based CSF proteomic data, quantifying 4,071 proteins, and classified as patients with PD without GBA1, LRRK2, and/or SNCA pathogenic variants (nongenetic PD), HCs, patients with PD carrying the aforementioned pathogenic variants (genetic PD), or genetic prodromals. Differentially expressed protein (DEP) analysis and the least absolute shrinkage and selection operator (LASSO) were applied to the data from nongenetic PD and HCs. Signatures characteristic of nongenetic PD were quantified as a PD proteomic score (PD-ProS), validated internally and then externally using data on 1,556 CSF proteins from the LRRK2 Cohort Consortium (LCC). We further tested the PD-ProS in genetic PD and genetic prodromals and examined associations with clinical progression.
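
As an illustration of how LASSO can reduce a candidate protein list to a weighted score, the following is a minimal sketch and not the authors' pipeline. It assumes a least-squares loss for simplicity (the abstract does not state the loss or software used), standardized protein levels, and a toy two-feature data set. The function names and the penalty value `alpha` are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it would cross zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    X is a list of samples, each a list of standardized feature values.
    Assumption: least-squares loss, chosen here only to keep the sketch short.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def proteomic_score(weights, levels):
    """Weighted sum of protein levels; zero-weight proteins drop out."""
    return sum(w * x for w, x in zip(weights, levels))


# Toy data (hypothetical): the outcome depends strongly on feature 0
# and only weakly on feature 1.
X = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
y = [-2.1, 1.9, -1.9, 2.1]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)                               # the weak feature is shrunk to exactly zero
print(proteomic_score(weights, [1.0, 1.0]))  # score for one hypothetical sample
```

The L1 penalty sets small coefficients exactly to zero, which is why LASSO can act as a feature selector as well as a weighting scheme.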

Results: Data from 279 patients with nongenetic PD (mean ± SD, age 62.0 ± 9.6 years; male 67.7%) and 141 HCs (age 60.5 ± 11.9 years; male 64.5%) were used for PD-ProS derivation. From 23 DEPs, LASSO determined weights of 14 DEPs for the PD-ProS (area under the curve [AUC] 0.83, 95% CI 0.78-0.87), validated in an independent internal validation cohort of 71 patients with nongenetic PD and 35 HCs (AUC 0.81, 95% CI 0.73-0.90). In the LCC, only 5 of the 14 DEPs were also measured. Notably, these 5 DEPs still distinguished 34 patients with nongenetic PD from 31 HCs with the same weights (AUC 0.75, 95% CI 0.63-0.87). Furthermore, the PD-ProS distinguished 258 patients with genetic PD from 365 genetic prodromals. Finally, regardless of genetic status, the PD-ProS independently predicted both cognitive and motor decline in PD (dementia, adjusted hazard ratio in the highest quintile [aHR-Q5] 2.8 [95% CI 1.6-5.0]; Hoehn and Yahr stage IV, aHR-Q5 2.1 [95% CI 1.1-4.0]).
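
Two steps in these results can be sketched in code: applying fixed weights to whichever proteins a second cohort measured, and summarizing discrimination as an AUC. The sketch below is illustrative only. The protein names, weights, and scores are hypothetical placeholders, not values from the study. The AUC is computed as the pairwise (Mann-Whitney) probability that a case scores higher than a control, and confidence intervals are not computed.

```python
def score_measured(weights, levels):
    """Weighted sum over proteins present in both `weights` and `levels`.

    Proteins missing from `levels` are skipped, so the same weights can be
    reused on a panel that measures only a subset of the proteins.
    """
    return sum(w * levels[name] for name, w in weights.items() if name in levels)


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical weights for three proteins; the second panel measures only two.
weights = {"protein_A": 0.5, "protein_B": -0.25, "protein_C": 1.0}
sample = {"protein_A": 2.0, "protein_C": 1.0}
print(score_measured(weights, sample))  # 0.5*2.0 + 1.0*1.0 = 2.0

# Hypothetical scores for two cases and two controls.
print(auc([3.0, 2.0], [1.0, 2.0]))      # 3.5 favorable pairs out of 4 = 0.875
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation.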

Discussion: By integrating high-throughput proteomics with machine learning, we identified PD-associated CSF proteomic signatures crucial for PD development and progression.

Trial registration information: ClinicalTrials.gov (NCT01176565). A link to the trial registry page is clinicaltrials.gov/ct2/show/NCT01141023.

Classification of evidence: This study provides Class II evidence that the CSF proteome contains clinically important information regarding the development and progression of Parkinson disease that can be deciphered by a combination of high-throughput proteomics and machine learning.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Disease Progression
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Parkinson Disease* / complications
  • Parkinson Disease* / genetics
  • Proportional Hazards Models
  • Proteomics

Associated data

  • ClinicalTrials.gov/NCT01176565
  • ClinicalTrials.gov/NCT01141023