Objective: Local reactions are the most common vaccine-related adverse event, yet there is no specific diagnosis code for a local reaction due to vaccination. Previous vaccine safety studies used non-specific diagnosis codes to identify potential local reaction cases and confirmed the cases through manual chart review. In this study, a natural language processing (NLP) algorithm was developed to identify local reactions associated with the tetanus-diphtheria-acellular pertussis (Tdap) vaccine in the Vaccine Safety Datalink.
Methods: Presumptive cases of local reactions were identified among members aged ≥ 11 years using ICD-9-CM codes recorded in any care setting 1-6 days after a Tdap vaccination given between 2012 and 2014. The clinical notes were searched for signs and symptoms consistent with a local reaction. Information on the timing and location of each sign or symptom was also extracted to help determine whether it was vaccine related. Reactions triggered by causes other than Tdap vaccination were excluded. The NLP algorithm was developed at the lead study site and validated on a stratified random sample of 500 patients from five institutions.
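The note-search step described in the Methods can be illustrated with a minimal Python sketch. This is not the study's algorithm: the term lists, the negation cues, and the function names `find_mentions` and `is_candidate_local_reaction` are hypothetical, and the timing check is simplified to the 1-6 day post-vaccination window stated above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lexicons; the abstract does not list the terms the study used.
SYMPTOM_TERMS = ["redness", "swelling", "pain", "erythema", "induration", "tenderness"]
LOCATION_TERMS = ["injection site", "deltoid", "upper arm", "arm", "shoulder"]
NEGATION_CUES = ["no", "denies", "without"]


@dataclass
class Mention:
    symptom: str
    location: Optional[str]  # None when no injection-area term is in the sentence
    negated: bool
    sentence: str


def find_mentions(note: str) -> List[Mention]:
    """Scan a clinical note sentence by sentence for symptom terms."""
    mentions = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        lowered = sentence.lower()
        for symptom in SYMPTOM_TERMS:
            match = re.search(rf"\b{re.escape(symptom)}\b", lowered)
            if match is None:
                continue
            # First location term found in the same sentence, if any.
            location = next(
                (loc for loc in LOCATION_TERMS
                 if re.search(rf"\b{re.escape(loc)}\b", lowered)),
                None,
            )
            # Crude negation check: any cue word before the symptom term.
            prefix = lowered[:match.start()]
            negated = any(re.search(rf"\b{cue}\b", prefix) for cue in NEGATION_CUES)
            mentions.append(Mention(symptom, location, negated, sentence))
    return mentions


def is_candidate_local_reaction(note: str, days_since_vaccination: int) -> bool:
    """Flag a note written 1-6 days after vaccination that contains an
    affirmed symptom tied to an injection-area location."""
    if not 1 <= days_since_vaccination <= 6:
        return False
    return any(m.location is not None and not m.negated for m in find_mentions(note))
```

For example, `is_candidate_local_reaction("Patient reports redness and swelling at the injection site.", 2)` flags the note, while a note stating "No redness or swelling of the arm." does not. A production system would also need to handle abbreviations, misspellings, laterality, and reactions attributed to other causes, which this sketch ignores.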
Results: The NLP algorithm achieved an overall weighted sensitivity of 87.9%, specificity of 92.8%, positive predictive value of 82.7%, and negative predictive value of 95.1%. In addition, when applied to data from one site, the NLP algorithm identified 3326 potential Tdap-related local reactions that had not been identified through diagnosis codes.
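The abstract does not say how the weighted metrics were computed. One common approach for a stratified sample, sketched below in Python as an assumption, scales each stratum's confusion-matrix counts by its inverse sampling fraction before pooling. The `Stratum` class, the `weighted_metrics` function, and the counts in the usage note are illustrative only and are not study data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stratum:
    weight: float  # assumed: stratum population size / stratum sample size
    tp: int        # NLP positive, chart review positive
    fp: int        # NLP positive, chart review negative
    fn: int        # NLP negative, chart review positive
    tn: int        # NLP negative, chart review negative


def weighted_metrics(strata: List[Stratum]) -> Dict[str, float]:
    """Pool weight-scaled counts across strata, then compute the four metrics."""
    tp = sum(s.weight * s.tp for s in strata)
    fp = sum(s.weight * s.fp for s in strata)
    fn = sum(s.weight * s.fn for s in strata)
    tn = sum(s.weight * s.tn for s in strata)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

With two made-up strata, `weighted_metrics([Stratum(2.0, 8, 2, 1, 9), Stratum(1.0, 4, 1, 1, 14)])` pools the scaled counts to 20 true positives, 5 false positives, 3 false negatives, and 32 true negatives, so the first stratum counts twice as heavily as the second.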
Conclusion: The NLP algorithm achieved high accuracy and demonstrated the potential of NLP to reduce the manual chart review effort required in vaccine safety studies.
Keywords: Clinical notes; Electronic health record; Natural language processing; Vaccine adverse event; Vaccine safety.
Copyright © 2019 Elsevier B.V. All rights reserved.