Data-driven subtyping of Parkinson's disease: comparison of current methodologies and application to the Bochum PNS cohort

J Neural Transm (Vienna). 2023 Jun;130(6):763-776. doi: 10.1007/s00702-023-02627-4. Epub 2023 Mar 31.

Abstract

Considerable efforts have been made to better describe and identify Parkinson's disease (PD) subtypes. Cluster analyses have been proposed as an unbiased approach to developing PD subtypes that could facilitate their identification, tracking of progression, and evaluation of therapeutic responses. A data-driven clustering analysis was applied to a PD cohort of 114 subjects enrolled at St. Josef-Hospital of the Ruhr University in Bochum (Germany). A wide spectrum of motor and non-motor scores, including polyneuropathy-related measures, was included in the analysis. K-means and hierarchical agglomerative clustering were performed to identify PD subtypes. Silhouette and Calinski-Harabasz scores, assessed with the elbow method, were then used as supporting evaluation metrics to determine the optimal number of clusters. Principal Component Analysis (PCA), analysis of variance (ANOVA), and analysis of covariance (ANCOVA) were conducted to determine the relevance of each score for defining the clusters. Three PD cluster subtypes were identified: early-onset mild type, intermediate type, and late-onset severe type. The between-cluster analysis consistently showed highly significant differences (P < 0.01), except for one of the scores measuring polyneuropathy (Neuropathy Disability Score; P = 0.609) and Levodopa dosage (P = 0.226). The Parkinson's Disease Questionnaire (PDQ-39), the Non-motor Symptom Questionnaire (NMSQuest), and the MDS-UPDRS Part II were found to be crucial factors for PD subtype differentiation. The present analysis identifies a specific set of criteria for PD subtyping based on an extensive panel of clinical and paraclinical scores. This analysis provides a foundation for further development of PD subtyping, including k-means and hierarchical agglomerative clustering.

Trial registration: DRKS00020752, February 7, 2020, retrospectively registered.
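The cluster-number selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic two-dimensional points rather than clinical scores, only k-means and the silhouette score are shown (hierarchical clustering and the Calinski-Harabasz score are omitted), and the helper names `make_toy_scores`, `kmeans`, and `silhouette` are hypothetical. The farthest-point initialization is an assumption made to keep the example deterministic. In practice a library such as scikit-learn would normally be used instead of hand-written routines.

```python
import math
import random


def make_toy_scores(seed=0, per_group=20):
    """Synthetic 2-D 'scores' around three separated centers (not patient data)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(per_group)]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest already-chosen seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


points = make_toy_scores()
# Score each candidate number of clusters and keep the best-scoring one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Because the toy data are generated around three well-separated centers, the silhouette score favours three clusters here. Real clinical scores are far less cleanly separated, which is presumably why the study used more than one evaluation metric.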

Keywords: Clustering; Data-driven analysis; Machine learning; Parkinson’s disease; Subtype.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Germany
  • Humans
  • Levodopa / therapeutic use
  • Mental Status and Dementia Tests
  • Parkinson Disease* / diagnosis
  • Parkinson Disease* / drug therapy

Substances

  • Levodopa

Associated data

  • DRKS/DRKS00020752