-
NeuraHealth: An Automated Screening Pipeline to Detect Undiagnosed Cognitive Impairment in Electronic Health Records with Deep Learning and Natural Language Processing
Authors:
Tanish Tyagi,
Colin G. Magdamo,
Ayush Noori,
Zhaozhi Li,
Xiao Liu,
Mayuresh Deodhar,
Zhuoqiao Hong,
Wendong Ge,
Elissa M. Ye,
Yi-han Sheu,
Haitham Alabsi,
Laura Brenner,
Gregory K. Robbins,
Sahar Zafar,
Nicole Benson,
Lidia Moura,
John Hsu,
Alberto Serrano-Pozo,
Dimitry Prokopenko,
Rudolph E. Tanzi,
Bradley T. Hyman,
Deborah Blacker,
Shibani S. Mukerji,
M. Brandon Westover,
Sudeshna Das
Abstract:
Dementia-related cognitive impairment (CI) is a neurodegenerative disorder, affecting over 55 million people worldwide and growing rapidly at the rate of one new case every 3 seconds. 75% of cases go undiagnosed globally, with up to 90% in low- and middle-income countries, leading to an estimated annual worldwide cost of USD 1.3 trillion, forecasted to reach 2.8 trillion by 2030. With no cure, a recurring failure of clinical trials, and a lack of early diagnosis, the mortality rate is 100%. Information in electronic health records (EHR) can provide vital clues for early detection of CI, but a manual review by experts is tedious and error-prone. Several computational methods have been proposed; however, they lack an enhanced understanding of the linguistic context in complex language structures of EHR. Therefore, I propose a novel and more accurate framework, NeuraHealth, to identify patients who had no earlier diagnosis. In NeuraHealth, using patient EHR from Mass General Brigham BioBank, I fine-tuned a bi-directional attention-based deep learning natural language processing model to classify sequences. The sequence predictions were used to generate structured features as input for a patient-level regularized logistic regression model. This two-step framework creates high dimensionality, outperforming all existing state-of-the-art computational methods as well as clinical methods. Further, I integrate the models into a real-world product, a web app, to create an automated EHR screening pipeline for scalable and high-speed discovery of undetected CI in EHR, making early diagnosis viable in medical facilities and in regions with scarce health services.
Submitted 20 June, 2022; v1 submitted 12 January, 2022;
originally announced February 2022.
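The two-step framework described in this abstract (sequence-level predictions aggregated into structured features for a patient-level regularized logistic regression) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the three aggregate features, the 0.5 threshold, the L2 penalty, the plain gradient-descent fit, and the toy data. The names `patient_features`, `fit_logreg_l2`, and `predict_proba` are hypothetical.

```python
import math
from collections import defaultdict

def patient_features(seq_preds):
    """Aggregate (patient_id, probability) sequence predictions per patient.

    Hypothetical features: share of sequences predicted positive (p >= 0.5),
    maximum probability, and mean probability.
    """
    by_patient = defaultdict(list)
    for pid, p in seq_preds:
        by_patient[pid].append(p)
    feats = {}
    for pid, ps in by_patient.items():
        n_pos = sum(1 for p in ps if p >= 0.5)
        feats[pid] = [n_pos / len(ps), max(ps), sum(ps) / len(ps)]
    return feats

def fit_logreg_l2(X, y, lam=0.1, lr=0.5, epochs=500):
    """Batch gradient descent for L2-regularized logistic regression.

    The bias term is left unpenalized. This stands in for whatever
    regularized solver the authors actually used.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        w = [wj - lr * (g / n + lam * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(w, b, x):
    """Patient-level probability from the fitted weights."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy sequence-level outputs (made up, not from the paper).
seq_preds = [("A", 0.9), ("A", 0.8), ("A", 0.2),
             ("B", 0.1), ("B", 0.3), ("C", 0.95), ("D", 0.05)]
labels = {"A": 1, "B": 0, "C": 1, "D": 0}

feats = patient_features(seq_preds)
ids = sorted(feats)
w, b = fit_logreg_l2([feats[i] for i in ids], [labels[i] for i in ids])
scores = {i: predict_proba(w, b, feats[i]) for i in ids}
```

In practice the first step would be a fine-tuned language model emitting the per-sequence probabilities, and the second step would more likely use an established solver than this hand-written loop.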
-
Using Deep Learning to Identify Patients with Cognitive Impairment in Electronic Health Records
Authors:
Tanish Tyagi,
Colin G. Magdamo,
Ayush Noori,
Zhaozhi Li,
Xiao Liu,
Mayuresh Deodhar,
Zhuoqiao Hong,
Wendong Ge,
Elissa M. Ye,
Yi-han Sheu,
Haitham Alabsi,
Laura Brenner,
Gregory K. Robbins,
Sahar Zafar,
Nicole Benson,
Lidia Moura,
John Hsu,
Alberto Serrano-Pozo,
Dimitry Prokopenko,
Rudolph E. Tanzi,
Bradley T. Hyman,
Deborah Blacker,
Shibani S. Mukerji,
M. Brandon Westover,
Sudeshna Das
Abstract:
Dementia is a neurodegenerative disorder that causes cognitive decline and affects more than 50 million people worldwide. Dementia is under-diagnosed by healthcare professionals: only one in four people who suffer from dementia are diagnosed. Even when a diagnosis is made, it may not be entered as a structured International Classification of Diseases (ICD) diagnosis code in a patient's charts. Information relevant to cognitive impairment (CI) is often found within electronic health records (EHR), but manual review of clinician notes by experts is both time-consuming and often prone to errors. Automated mining of these notes presents an opportunity to label patients with cognitive impairment in EHR data. We developed natural language processing (NLP) tools to identify patients with cognitive impairment and demonstrate that linguistic context enhances performance for the cognitive impairment classification task. We fine-tuned our attention-based deep learning model, which can learn from complex language structures, and substantially improved accuracy (0.93) relative to a baseline NLP model (0.84). Further, we show that deep learning NLP can successfully identify dementia patients without dementia-related ICD codes or medications.
Submitted 12 November, 2021;
originally announced November 2021.
-
Sleep Apnea and Respiratory Anomaly Detection from a Wearable Band and Oxygen Saturation
Authors:
Wolfgang Ganglberger,
Abigail A. Bucklin,
Ryan A. Tesh,
Madalena Da Silva Cardoso,
Haoqi Sun,
Michael J. Leone,
Luis Paixao,
Ezhil Panneerselvam,
Elissa M. Ye,
B. Taylor Thompson,
Oluwaseun Akeju,
David Kuller,
Robert J. Thomas,
M. Brandon Westover
Abstract:
Objective: Sleep-related respiratory abnormalities are typically detected using polysomnography. There is a need in general medicine and critical care for a more convenient method to automatically detect sleep apnea from a simple, easy-to-wear device. The objective is to automatically detect abnormal respiration and estimate the Apnea-Hypopnea-Index (AHI) with a wearable respiratory device, compared to an SpO2 signal or polysomnography using a large (n = 412) dataset serving as ground truth. Methods: Simultaneously recorded polysomnographic (PSG) and wearable respiratory effort data were used to train and evaluate models in a cross-validation fashion. Time domain and complexity features were extracted, important features were identified, and a random forest model was employed to detect events and predict AHI. Four models were trained: one each using the respiratory features only, a feature from the SpO2 (%)-signal only, and two additional models that use the respiratory features and the SpO2 (%)-feature, one allowing a time lag of 30 seconds between the two signals. Results: Event-based classification resulted in areas under the receiver operating characteristic curves of 0.94, 0.86, 0.82, and areas under the precision-recall curves of 0.48, 0.32, 0.51 for the models using respiration and SpO2, respiration-only, and SpO2-only respectively. Correlation between expert-labelled and predicted AHI was 0.96, 0.78, and 0.93, respectively. Conclusions: A wearable respiratory effort signal with or without SpO2 predicted AHI accurately. Given the large dataset and rigorous testing design, we expect our models are generalizable to evaluating respiration in a variety of environments, such as at home and in critical care.
Submitted 23 February, 2021;
originally announced February 2021.
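The two quantities reported in this abstract, the AHI and its correlation with expert labels, can be sketched as follows. The AHI is conventionally the number of apnea and hypopnea events per hour of sleep. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here. The function names and the example values are hypothetical.

```python
import math

def ahi(n_events, sleep_seconds):
    """Apnea-Hypopnea Index: respiratory events per hour of sleep."""
    if sleep_seconds <= 0:
        raise ValueError("sleep duration must be positive")
    return n_events / (sleep_seconds / 3600.0)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences (assumed metric)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Made-up per-recording AHI values, for illustration only.
expert_ahi = [2.0, 10.0, 25.0, 40.0]
predicted_ahi = [3.0, 8.0, 27.0, 38.0]
r = pearson_r(expert_ahi, predicted_ahi)
```

For example, 30 detected events over 6 hours of sleep give `ahi(30, 6 * 3600)`, that is, 5 events per hour.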
-
Natural Language Processing to Detect Cognitive Concerns in Electronic Health Records Using Deep Learning
Authors:
Zhuoqiao Hong,
Colin G. Magdamo,
Yi-han Sheu,
Prathamesh Mohite,
Ayush Noori,
Elissa M. Ye,
Wendong Ge,
Haoqi Sun,
Laura Brenner,
Gregory Robbins,
Shibani Mukerji,
Sahar Zafar,
Nicole Benson,
Lidia Moura,
John Hsu,
Bradley T. Hyman,
Michael B. Westover,
Deborah Blacker,
Sudeshna Das
Abstract:
Dementia is under-recognized in the community, under-diagnosed by healthcare professionals, and under-coded in claims data. Information on cognitive dysfunction, however, is often found in unstructured clinician notes within medical records, but manual review by experts is time-consuming and often prone to errors. Automated mining of these notes presents a potential opportunity to label patients with cognitive concerns who could benefit from an evaluation or be referred to specialist care. In order to identify patients with cognitive concerns in electronic medical records, we applied natural language processing (NLP) algorithms and compared model performance to a baseline model that used structured diagnosis codes and medication data only. An attention-based deep learning model outperformed the baseline model and other simpler models.
Submitted 12 November, 2020;
originally announced November 2020.