Automated extraction of ejection fraction for quality measurement using regular expressions in Unstructured Information Management Architecture (UIMA) for heart failure

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):859-66. doi: 10.1136/amiajnl-2011-000535. Epub 2012 Mar 21.

Abstract

Objectives: Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veterans Affairs (VA). Our goals were to build a natural language processing system to extract the EF from free-text echocardiogram reports to automate measurement reporting and to validate the accuracy of the system using a comparison reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics.

Materials and methods: We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements.
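The abstract does not reproduce the regular expressions or rules themselves. As a rough illustration of the general technique only, the following Python sketch pairs an EF mention with a nearby percentage value or range. The pattern, the 40-character gap, the names `EF_PATTERN`, `extract_ef` and `has_low_ef`, and the "any lower bound below 40" document rule are all assumptions made for this example; they are not the published rule set, and the sketch does not use UIMA.

```python
import re

# Hypothetical pattern (NOT the authors' expressions): an EF mention followed,
# within a short window of non-numeric text, by a percentage value or range.
EF_PATTERN = re.compile(
    r"""
    \b(?:(?:LV\s*)?EF|ejection\s+fraction)\b        # concept mention
    [^\d%\n]{0,40}?                                 # short gap, no digits or %
    (?P<low>\d{1,2}(?:\.\d+)?)                      # single value or range start
    (?:\s*(?:-|to)\s*(?P<high>\d{1,2}(?:\.\d+)?))?  # optional range end
    \s*%                                            # percent sign
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_ef(text):
    """Return a list of (low, high) EF values found in the text.

    A single value such as "55%" is returned as (55.0, 55.0).
    """
    results = []
    for match in EF_PATTERN.finditer(text):
        low = float(match.group("low"))
        high = float(match.group("high")) if match.group("high") else low
        results.append((low, high))
    return results


def has_low_ef(text, threshold=40.0):
    """Assumed document-level rule: flag if any lower bound is below threshold."""
    return any(low < threshold for low, _ in extract_ef(text))


if __name__ == "__main__":
    report = "Left ventricular ejection fraction is estimated at 35-40%."
    print(extract_ef(report))
    print(has_low_ef(report))
```

A production system would also need to handle negation, qualitative descriptions without a number, and values reported for prior studies, none of which this sketch attempts.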

Results: System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%).
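The reported measures follow the standard definitions based on true/false positive and negative counts. The abstract does not give the underlying counts, so the sketch below defines the formulas and then recomputes the F measure only from the sensitivity and positive predictive value stated above. The function names are chosen for this example.

```python
def sensitivity(tp, fn):
    """Recall: share of true cases that the system found."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of non-cases that the system correctly left unflagged."""
    return tn / (tn + fp)


def positive_predictive_value(tp, fp):
    """Precision: share of flagged cases that were true cases."""
    return tp / (tp + fp)


def f_measure(precision, recall):
    """Balanced F measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Inputs are the rounded percentages reported in the Results paragraph.
    document_level = f_measure(precision=1.0, recall=0.9841)
    concept_level = f_measure(precision=0.95, recall=0.889)
    print(f"document level F: {document_level:.1%}")
    print(f"concept level F:  {concept_level:.1%}")
```

With these rounded inputs the document-level value comes out at 99.2%, matching the reported figure. The concept-level value comes out at about 91.8%, slightly under the reported 91.9%, presumably because the published sensitivity and positive predictive value are themselves rounded.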

Discussion: An EF value of <40% can be accurately identified in VA echocardiogram reports.

Conclusions: An automated information extraction system can be used to accurately extract EF for quality measurement.

Publication types

  • Multicenter Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Data Mining / methods*
  • Echocardiography
  • Heart Failure* / diagnostic imaging
  • Heart Failure* / therapy
  • Humans
  • Medical Records Systems, Computerized*
  • Natural Language Processing*
  • Quality Indicators, Health Care*
  • Reference Standards
  • Software Validation
  • Stroke Volume*
  • United States
  • United States Department of Veterans Affairs