Dementia and mild cognitive impairment can be underrecognized in primary care practice and research. Free-text fields in electronic medical records (EMRs) are a rich source of information which might support increased detection and enable a better understanding of populations at risk of dementia. We used natural language processing (NLP) to identify dementia-related features in EMRs and compared the performance of supervised machine learning models to classify patients with dementia. We assembled a cohort of primary care patients aged 66 + years in Ontario, Canada, from EMR notes collected until December 2016: 526 with dementia and 44,148 without dementia. We identified dementia-related features by applying published lists, clinician input, and NLP with word embeddings to free-text progress and consult notes and organized features into thematic groups. Using machine learning models, we compared the performance of features to detect dementia, overall and during time periods relative to dementia case ascertainment in health administrative databases. Over 900 dementia-related features were identified and grouped into eight themes (including symptoms, social, function, cognition). Using notes from all time periods, LASSO had the best performance (F1 score: 77.2%, sensitivity: 71.5%, specificity: 99.8%). Model performance was poor when notes written before case ascertainment were included (F1 score: 14.4%, sensitivity: 8.3%, specificity 99.9%) but improved as later notes were added. While similar models may eventually improve recognition of cognitive issues and dementia in primary care EMRs, our findings suggest that further research is needed to identify which additional EMR components might be useful to promote early detection of dementia.
Supplementary information: The online version contains supplementary material available at 10.1007/s41666-023-00125-6.
Keywords: Artificial intelligence; Dementia; Electronic health records; Natural language processing; Primary health care.
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.