Background: Transfusion-related adverse events can be unrecognized and unreported. As part of the US Food and Drug Administration's Center for Biologics Evaluation and Research Biologics Effectiveness and Safety initiative, we explored whether machine learning methods, such as natural language processing (NLP), can identify and report transfusion allergic reactions (ARs) from electronic health records (EHRs).
Study design and methods: Over a 4-year period, all 146 reported transfusion ARs were pulled from a database of 86,764 transfusions in an academic health system, along with a random sample of 605 transfusions without reported ARs. Structured and unstructured EHR data were retrieved, including demographics, new symptoms, medications, and lab results. In the unstructured data, evidence of transfusion ARs in clinicians' notes, test results, and prescription fields was used to extract NLP features. Clinician reviews of selected validation cases assessed and confirmed model performance.
Results: Clinician reviews of selected validation cases yielded a sensitivity of 67.9% and a specificity of 97.5% at a threshold of 0.9, with a positive predictive value (PPV) of 84% in the validation set; extrapolated to the transfusion AR incidence in the full transfusion dataset, the estimated PPV was 4.5%. A higher threshold achieved a sensitivity of 43% with a specificity and PPV of 100% in our validation set. Essential features predicting ARs were recognized transfusion reactions, administration of antihistamines or glucocorticoids, and skin symptoms (e.g., hives and itching). Removal of NLP features decreased model performance.
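The drop from a validation-set PPV of 84% to an extrapolated PPV of about 4.5% follows from the low incidence of reported ARs (146 in 86,764 transfusions). The sketch below illustrates this with a standard Bayes' rule calculation from the reported sensitivity and specificity. It is an assumption that the extrapolation was done this way, since the abstract does not state the method, and the function name `ppv_at_prevalence` is hypothetical.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule.

    Illustrative helper (not from the paper): expected true positives
    divided by all expected positives at a given prevalence.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures reported in the abstract at a threshold of 0.9.
sensitivity = 0.679
specificity = 0.975
# Assumption: incidence is taken as reported ARs over all transfusions.
prevalence = 146 / 86_764

ppv = ppv_at_prevalence(sensitivity, specificity, prevalence)
print(f"incidence = {prevalence:.3%}, extrapolated PPV = {ppv:.1%}")
```

Under these assumptions the result is roughly 4.4%, close to the reported 4.5%; the small difference may reflect rounding of the published figures or a different extrapolation method. The same function returns a PPV of 100% whenever specificity is 100%, consistent with the higher-threshold result.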
Discussion: NLP algorithms can identify transfusion reactions from the EHR with a reasonable level of precision for subsequent clinician review and confirmation.
Keywords: adverse events; allergic transfusion reactions; machine learning; natural language processing.
© 2022 AABB. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.