Two-phase chief complaint mapping to the UMLS Metathesaurus in Korean electronic medical records

IEEE Trans Inf Technol Biomed. 2009 Jan;13(1):78-86. doi: 10.1109/TITB.2008.2007103.

Abstract

Automatically determining the concepts referred to in chief complaint (CC) data from electronic medical records (EMRs) is an essential component of many EMR applications aimed at biosurveillance for disease outbreaks. Previous approaches to this concept mapping have relied mainly on term-level matching, in which the medical terms in the raw text and their synonyms are matched with concepts in a terminology database. These approaches have shortcomings that limit their efficacy in CC concept mapping, because the concepts in CC data are often expressed by associative terms rather than by synonyms. We therefore propose a concept mapping scheme based on a two-phase matching approach, intended especially for Korean CCs, which uses term-level complete matching in the first phase and concept-level matching based on concept learning in the second phase. The proposed concept-level matching learns all the terms (associative terms as well as synonyms) that represent a concept and predicts the most probable concept for a CC based on the learned terms. Experiments on 1204 CCs extracted from 15,618 discharge summaries of Korean EMRs showed that the proposed method gave significantly improved F-measure values compared to the baseline system, with improvements of up to 73.57%.
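
The two-phase scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the learning model, so the second phase below assumes a multinomial naive Bayes classifier with add-one smoothing (the "Bayes Theorem" indexing term only hints at a probabilistic method). The dictionary, the training examples, the concept identifiers (placeholders, not real UMLS CUIs), and all function names are hypothetical, and English tokens stand in for Korean CC text.

```python
import math
from collections import Counter, defaultdict

# Hypothetical term-to-concept dictionary for phase 1 (placeholder IDs, not real CUIs).
TERM_DICT = {"fever": "CONCEPT_FEVER", "cough": "CONCEPT_COUGH"}

# Hypothetical labeled CCs from which phase 2 learns associative terms per concept.
TRAINING = [
    ("high temperature chills", "CONCEPT_FEVER"),
    ("chills sweating", "CONCEPT_FEVER"),
    ("sputum throat irritation", "CONCEPT_COUGH"),
    ("sputum wheezing", "CONCEPT_COUGH"),
]

def train(examples):
    """Count how often each term appears with each concept."""
    concept_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, concept in examples:
        concept_counts[concept] += 1
        for tok in text.split():
            term_counts[concept][tok] += 1
            vocab.add(tok)
    return concept_counts, term_counts, vocab

def predict(text, model):
    """Return the concept with the highest smoothed naive Bayes log score."""
    concept_counts, term_counts, vocab = model
    total = sum(concept_counts.values())
    best, best_score = None, -math.inf
    for concept in sorted(concept_counts):
        score = math.log(concept_counts[concept] / total)
        denom = sum(term_counts[concept].values()) + len(vocab)
        for tok in text.split():
            if tok in vocab:  # terms never seen in training are ignored
                score += math.log((term_counts[concept][tok] + 1) / denom)
        if score > best_score:
            best, best_score = concept, score
    return best

def map_cc(text, model):
    """Phase 1: complete (exact) term match; phase 2: learned concept-level match."""
    key = text.strip().lower()
    if key in TERM_DICT:
        return TERM_DICT[key], "phase1"
    return predict(key, model), "phase2"

MODEL = train(TRAINING)
```

With this toy data, `map_cc("Fever", MODEL)` is resolved by the exact dictionary lookup, whereas `map_cc("chills", MODEL)` has no dictionary entry and falls through to the learned model, which associates "chills" with the fever concept even though it is not a synonym of "fever".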

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing
  • Algorithms
  • Artificial Intelligence
  • Bayes Theorem
  • Humans
  • Korea
  • Medical Informatics / methods*
  • Medical Records Systems, Computerized*
  • Natural Language Processing*
  • Terminology as Topic
  • Unified Medical Language System*