Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing

AMIA Annu Symp Proc. 2013 Nov 16:2013:1007-16. eCollection 2013.

Abstract

Automated Word Sense Disambiguation in clinical documents is a prerequisite to accurate extraction of medical information. Emerging methods utilizing hyperdimensional computing present new approaches to this problem. In this paper, we evaluate one such approach, the Binary Spatter Code Word Sense Disambiguation algorithm, on 50 ambiguous abbreviation sets derived from clinical notes. This algorithm uses reversible vector transformations to encode ambiguous terms and their context-specific senses into vectors representing surrounding terms. The sense for a new context is then inferred from vectors representing the terms it contains. One-to-one BSC-WSD achieves average accuracy of 94.55% when considering the orientation and distance of neighboring terms relative to the target abbreviation, outperforming Support Vector Machine and Naïve Bayes classifiers. Furthermore, it is practical to deal with all 50 abbreviations in an identical manner using a single one-to-many BSC-WSD model with average accuracy of 93.91%, which is not possible with common machine learning algorithms.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Abbreviations as Topic*
  • Algorithms*
  • Bayes Theorem
  • Natural Language Processing
  • Support Vector Machine