A study of abbreviations in clinical notes

AMIA Annu Symp Proc. 2007 Oct 11;2007:821-5.

Abstract

Various natural language processing (NLP) systems have been developed to unlock patient information from narrative clinical notes in order to support knowledge-based applications such as error detection, surveillance, and decision support. In many clinical notes, abbreviations are widely used without mention of their definitions, which is very different from the use of abbreviations in the biomedical literature. Thus, it is critical, but more challenging, for NLP systems to correctly interpret abbreviations in these notes. In this paper we describe a study of a two-step model for building a clinical abbreviation database: first, abbreviations in a text corpus were detected, and then a sense inventory was built for those that were found. Four detection methods were developed and evaluated. Results showed that the best detection method had a precision of 91.4% and a recall of 80.3%. A simple method was used to build sense inventories from two different knowledge sources: the Unified Medical Language System (UMLS) and a MEDLINE abbreviation database (ADAM). Evaluation showed that the inventory from the UMLS appeared to be the more appropriate of the two for defining the senses of abbreviations, but it was not ideal: it covered 35% of the senses and had an ambiguity rate of 40% for those that were covered. Annotation by domain experts therefore appears necessary, both for uncovered abbreviations and to determine the correct senses.
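
The four figures quoted above (precision, recall, coverage, ambiguity rate) can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the study, so the definitions below are assumptions: detection is scored against a gold set of abbreviations, a sense counts as covered when the gold sense appears in the inventory entry for that abbreviation, and a covered abbreviation counts as ambiguous when its inventory entry lists more than one sense. All function names and the toy data are hypothetical and are not taken from the paper.

```python
def precision_recall(detected, gold):
    """Score detected abbreviations against a gold-standard set."""
    detected, gold = set(detected), set(gold)
    true_positives = len(detected & gold)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def coverage_and_ambiguity(gold_senses, inventory):
    """Assumed definitions (not confirmed by the abstract):
    coverage  = share of gold senses found in the inventory;
    ambiguity = share of covered abbreviations with more than one listed sense.
    """
    covered = [
        abbr for abbr, sense in gold_senses.items()
        if sense in inventory.get(abbr, set())
    ]
    ambiguous = [abbr for abbr in covered if len(inventory[abbr]) > 1]
    coverage = len(covered) / len(gold_senses) if gold_senses else 0.0
    ambiguity = len(ambiguous) / len(covered) if covered else 0.0
    return coverage, ambiguity


# Toy data, invented purely for illustration.
gold_senses = {
    "pt": "patient",
    "hx": "history",
    "sob": "shortness of breath",
    "bp": "blood pressure",
}
detected = {"pt", "hx", "bp", "the"}  # "the" is a false positive
inventory = {
    "pt": {"patient", "physical therapy"},  # covered and ambiguous
    "bp": {"blood pressure"},               # covered, unambiguous
    "hx": {"hexyl"},                        # listed, but gold sense missing
}

precision, recall = precision_recall(detected, gold_senses.keys())
coverage, ambiguity = coverage_and_ambiguity(gold_senses, inventory)
print(f"precision={precision:.2f} recall={recall:.2f}")
print(f"coverage={coverage:.2f} ambiguity={ambiguity:.2f}")
```

In this toy setup an abbreviation such as "hx" is present in the inventory but still counts as uncovered because the correct clinical sense is absent, which is the kind of gap the abstract suggests expert annotation would need to fill.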

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Abbreviations as Topic*
  • Decision Trees
  • Humans
  • MEDLINE
  • Natural Language Processing*
  • Unified Medical Language System