Construction of an Assisted Model Based on Natural Language Processing for Automatic Early Diagnosis of Autoimmune Encephalitis

Neurol Ther. 2022 Sep;11(3):1117-1134. doi: 10.1007/s40120-022-00355-7. Epub 2022 May 11.

Abstract

Introduction: Early diagnosis and etiological treatment can effectively improve the prognosis of patients with autoimmune encephalitis (AE). However, anti-neuronal antibody tests which provide the definitive diagnosis require time and are not always abnormal. By using natural language processing (NLP) technology, our study proposes an assisted diagnostic method for early clinical diagnosis of AE and compares its sensitivity with that of previously established criteria.

Methods: Our model is based on the text classification model trained by the history of present illness (HPI) in electronic medical records (EMRs) that present a definite pathological diagnosis of AE or infectious encephalitis (IE). The definitive diagnosis of IE was based on the results of traditional etiological examinations. The definitive diagnosis of AE was based on the results of neuronal antibodies, and the diagnostic criteria of definite autoimmune limbic encephalitis proposed by Graus et al. used as the reference standard for antibody-negative AE. First, we automatically recognized and extracted symptoms for all HPI texts in EMRs by training a dataset of 552 cases. Second, four text classification models trained by a dataset of 199 cases were established for differential diagnosis of AE and IE based on a post-structuring text dataset of every HPI, which was completed using symptoms in English language after the process of normalization of synonyms. The optimal model was identified by evaluating and comparing the performance of the four models. Finally, combined with three typical symptoms and the results of standard paraclinical tests such as cerebrospinal fluid (CSF), magnetic resonance imaging (MRI), or electroencephalogram (EEG) proposed from Graus criteria, an assisted early diagnostic model for AE was established on the basis of the text classification model with the best performance.

Results: The comparison results for the four models applied to the independent testing dataset showed the naïve Bayesian classifier with bag of words achieved the best performance, with an area under the receiver operating characteristic curve of 0.85, accuracy of 84.5% (95% confidence interval [CI] 74.0-92.0%), sensitivity of 86.7% (95% CI 69.3-96.2%), and specificity of 82.9% (95% CI 67.9-92.8%), respectively. Compared with the diagnostic criteria proposed previously, the early diagnostic sensitivity for possible AE using the assisted diagnostic model based on the independent testing dataset was improved from 73.3% (95% CI 54.1-87.7%) to 86.7% (95% CI 69.3-96.2%).

Conclusions: The assisted diagnostic model could effectively increase the early diagnostic sensitivity for AE compared to previous diagnostic criteria, assist physicians in establishing the diagnosis of AE automatically after inputting the HPI and the results of standard paraclinical tests according to their narrative habits for describing symptoms, avoiding misdiagnosis and allowing for prompt initiation of specific treatment.

Keywords: Autoimmune encephalitis; Computer-assisted diagnosis; Early diagnosis; Electronic medical records; Natural language processing.