Cognitive Digital Biomarkers from Automated Transcription of Spoken Language

N Tavabi; D Stück; A Signorini; C Karjadi; T Al Hanai; M Sandoval; C Lemke; J Glass; S Hardy; M Lavallee; B Wasserman; T F A Ang; C M Nowak; R Kainkaryam; L Foschini; R Au

doi:10.14283/jpad.2022.66

J Prev Alzheimers Dis. 2022;9(4):791-800. doi: 10.14283/jpad.2022.66.

Authors

N Tavabi¹, D Stück, A Signorini, C Karjadi, T Al Hanai, M Sandoval, C Lemke, J Glass, S Hardy, M Lavallee, B Wasserman, T F A Ang, C M Nowak, R Kainkaryam, L Foschini, R Au

Affiliation

¹ Rhoda Au, 72 E. Concord Street, Boston University School of Medicine, Boston, MA 02118. Telephone: (617) 358-0089.

PMID: 36281684
DOI: 10.14283/jpad.2022.66

Abstract

Background: Although patients with Alzheimer's disease and other neurodegenerative disorders that affect cognition may benefit from early detection, development of a reliable diagnostic test has remained elusive. The widespread availability of digital voice-recording technologies, together with the multiple cognitive processes engaged when constructing spoken responses, might offer an opportunity to predict cognitive status.

Objective: To determine whether cognitive status might be predicted from voice recordings of neuropsychological testing.

Design: Comparison of acoustic and (para)linguistic variables from low-quality automated transcriptions of neuropsychological testing (n = 200) versus variables from high-quality manual transcriptions (n = 127). A logistic regression classifier was trained to predict cognitive status, and its predictions were tested against actual diagnoses.
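The design above (fit a logistic regression classifier, then compare its predictions with known diagnoses) can be illustrated with a minimal sketch. It is not the study's code: the two features are synthetic stand-ins, and the function names, learning rate, epoch count, and train/test split are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic data (assumption): two made-up features whose means differ
# between the two classes. These are NOT the study's variables.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    X.append([rng.gauss(1.0 if label else -1.0, 1.0),
              rng.gauss(0.5 if label else -0.5, 1.0)])
    y.append(label)

# Hold out the last 50 samples and compare predictions with the true labels.
train_X, train_y = X[:150], y[:150]
test_X, test_y = X[150:], y[150:]
w = fit_logistic(train_X, train_y)
preds = [1 if predict_proba(w, x) >= 0.5 else 0 for x in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice a library implementation with regularization and cross-validation would usually be preferred over hand-written gradient descent; the sketch only makes the train-then-compare structure explicit.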

Setting: Observational cohort study.

Participants: 146 participants in the Framingham Heart Study.

Measurements: Acoustic variables, combined with either paralinguistic variables (e.g., speaking time) from automated transcriptions or linguistic variables (e.g., phrase complexity) from manual transcriptions.

Results: Models based on demographic features alone were not robust (area under the receiver operating characteristic curve [AUROC] 0.60). Addition of clinical and standard acoustic features boosted the AUROC to 0.81. Additional inclusion of transcription-related features yielded an AUROC of 0.90.
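For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on a toy example; the labels and scores are made up for illustration and are unrelated to the study data, and the function name `auroc` is an assumption.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Equivalent to the normalized Mann-Whitney U statistic; ties count 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical values): 1 = impaired, 0 = not impaired.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(auroc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.60, 0.81, and 0.90 should be read.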

Conclusions: The use of voice-based digital biomarkers derived from automated processing methods, combined with standard patient screening, might constitute a scalable way to enable early detection of dementia.

Keywords: AD screening; Dementia; biomarkers; predictive modeling.

Publication types

Observational Study
Research Support, N.I.H., Extramural

MeSH terms

Biomarkers
Cognition
Cognitive Dysfunction* / diagnosis
Humans
Language
Sensitivity and Specificity

Substances

Biomarkers