Automated Identification of Clinical Procedures in Free-Text Electronic Clinical Records with a Low-Code Named Entity Recognition Workflow

Methods Inf Med. 2022 Sep;61(3-04):84-89. doi: 10.1055/s-0042-1749358. Epub 2022 Sep 12.

Abstract

Background: Clinical procedures are often performed in outpatient clinics without prior scheduling at the administrative level, and the procedure is often documented solely in free-text electronic clinical notes. Natural language processing (NLP), particularly named entity recognition (NER), may provide a way to extract procedure data from these notes.

Methods: Free-text notes from outpatient ophthalmology visits were collected from the electronic clinical records at a single institution over 3 months. The Prodigy low-code annotation tool was used to create an annotation dataset and train a custom NER model for clinical procedures. Clinical procedures were extracted from the entire set of clinical notes.

Results: A total of 5,098 clinic notes were extracted for the study period; 1,923 clinic notes were used to build the NER model, which included a total of 231 manual annotations. The NER model achieved an F-score of 0.767, a precision of 0.810, and a recall of 0.729. The most common procedures performed included intravitreal injections of therapeutic substances, removal of corneal foreign bodies, and epithelial debridement of corneal ulcers.
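
The reported F-score is consistent with the harmonic mean of the reported precision and recall. The short Python check below illustrates that relationship. It is a sketch that assumes the abstract's F-score is the balanced F1 measure (beta = 1), which the abstract does not state explicitly; the function name `f_score` is illustrative and not taken from the study's code.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean (F1)."""
    if precision == 0.0 and recall == 0.0:
        # Avoid division by zero when the model finds nothing correct.
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the Results paragraph.
reported_precision = 0.810
reported_recall = 0.729

# Assumption: the reported F-score is F1 (beta = 1).
f1 = f_score(reported_precision, reported_recall)
print(round(f1, 3))  # prints 0.767
```

Under that assumption the computed value rounds to 0.767, matching the figure reported above.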

Conclusion: The use of a low-code annotation software tool allows the rapid creation of a custom annotation dataset to train an NER model to identify clinical procedures stored in free-text electronic clinical notes. This enables clinicians to rapidly gather previously unidentified procedural data for quality improvement and auditing purposes. Low-code annotation tools may reduce time and coding barriers to clinician participation in NLP research.

MeSH terms

  • Documentation*
  • Electronic Health Records
  • Electronics
  • Natural Language Processing*
  • Software
  • Workflow