Importance: Approaches are needed to stratify individuals in early psychosis stages beyond positive symptom severity, to investigate specificity related to affective and normative variation, and to validate solutions with premorbid, longitudinal, and genetic risk measures.
Objective: To use machine learning techniques to cluster, compare, and combine subgroup solutions using clinical and brain structural imaging data from early psychosis and depression stages.
Design, setting, and participants: This multisite, naturalistic, longitudinal cohort study (10 sites in 5 European countries, with major follow-up intervals at 9 and 18 months) included a referred patient sample of individuals with clinical high risk for psychosis (CHR-P), recent-onset psychosis (ROP), or recent-onset depression (ROD), together with healthy controls, recruited from February 1, 2014, to July 1, 2019. Data were analyzed between January 2020 and January 2022.
Main outcomes and measures: A nonnegative matrix factorization technique separately decomposed clinical data (287 variables) and parcellated brain structural volume data (204 gray matter, white matter, and cerebrospinal fluid regions) across the CHR-P, ROP, ROD, and healthy control study groups. Stability criteria determined the number of clusters using nested cross-validation. Validation targets (premorbid measures, longitudinal measures, and schizophrenia polygenic risk scores) were compared across subgroup solutions. Multiclass supervised machine learning produced a solution transferable to the validation sample.
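The factorization-based clustering described above can be illustrated with a minimal sketch. It is not the study's pipeline: it uses plain Lee-Seung multiplicative updates on synthetic data, omits the stability criteria and nested cross-validation, and assigns each subject to the factor with the largest loading, which is one common convention assumed here. The function names `nmf` and `assign_subgroups` and the toy data are hypothetical.

```python
import numpy as np


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factorize nonnegative X (subjects x variables) as W @ H.

    Uses Lee-Seung multiplicative updates for the squared
    Frobenius reconstruction error (an assumed, generic variant).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # subject loadings, nonnegative
    H = rng.random((k, m)) + eps  # factor profiles, nonnegative
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_subgroups(W):
    """Label each subject by the factor with the largest loading."""
    return W.argmax(axis=1)


# Synthetic toy data (not study data): 60 subjects, 12 variables,
# 3 planted subgroups that each score high on 4 of the variables.
rng = np.random.default_rng(42)
planted = np.repeat(np.arange(3), 20)
profiles = np.zeros((3, 12))
for g in range(3):
    profiles[g, 4 * g:4 * g + 4] = 1.0
X = profiles[planted] + 0.05 * rng.random((60, 12))

W, H = nmf(X, k=3)
labels = assign_subgroups(W)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("relative reconstruction error:", round(float(rel_err), 3))
print("subgroup sizes:", np.bincount(labels, minlength=3))
```

In a setting like the one described, `k` would be chosen by a stability criterion evaluated inside cross-validation rather than fixed in advance as it is in this toy example.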
Results: A total of 749 individuals were included in the discovery group and 610 individuals in the validation group. Individuals included those with CHR-P (n = 287), ROP (n = 323), ROD (n = 285), and healthy controls (n = 464). The mean (SD) age was 25.1 (5.9) years, and 702 (51.7%) were female. A clinical 4-dimensional solution separated individuals based on positive symptoms, negative symptoms, depression, and functioning, demonstrating associations with all validation targets. Brain clustering revealed a subgroup with distributed brain volume reductions associated with negative symptoms, reduced performance IQ, and increased schizophrenia polygenic risk scores. Multilevel results distinguished between normative and illness-related brain differences. Subgroup results were largely validated in the external sample.
Conclusions and relevance: The results of this longitudinal cohort study provide stratifications beyond the expression of positive symptoms that cut across illness stages and diagnoses. Clinical results suggest the importance of negative symptoms, depression, and functioning. Brain results suggest substantial overlap across illness stages and normative variation, which may highlight a vulnerability signature independent of specific presentations. Premorbid, longitudinal, and genetic risk validation suggested the clinical importance of the subgroups for preventive treatments.