Automated misspelling detection and correction in clinical free-text records

J Biomed Inform. 2015 Jun:55:188-95. doi: 10.1016/j.jbi.2015.04.008. Epub 2015 Apr 24.

Abstract

Accurate electronic health records are important for clinical care, research, and patient safety. Misspelled words must be corrected so that medical records are interpreted correctly. This paper describes the development of a spelling correction system for medical text. Our spell checker is based on Shannon's noisy channel model and uses an extensive dictionary compiled from many sources. We also use named entity recognition so that names are not mistaken for misspellings and wrongly corrected. We apply our spell checker to three types of free-text data (clinical notes, allergy entries, and medication orders) and evaluate its performance on both misspelling detection and correction. Our spell checker achieves detection performance of up to 94.4% and correction accuracy of up to 88.2%. We show that high-performance spelling correction is possible on a variety of clinical documents.

Keywords: Electronic health record; Named entity recognition; Natural language processing; Spelling correction.
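
The noisy channel idea named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy lexicon, the word counts, the per-edit probabilities in EDIT_PROB, and the PROTECTED_NAMES set (a crude stand-in for named entity recognition) are all assumptions made for illustration. An out-of-dictionary token w is flagged as a misspelling, and each in-dictionary candidate c within one edit is scored by P(c) * P(w | c); the highest-scoring candidate is returned.

```python
from collections import Counter

# Hypothetical toy lexicon with made-up counts (the paper's dictionary is far larger).
WORD_COUNTS = Counter({
    "penicillin": 50, "patient": 120, "allergy": 40, "aspirin": 60, "daily": 80,
})
TOTAL = sum(WORD_COUNTS.values())

# Stand-in for named entity recognition: tokens listed here are never corrected.
PROTECTED_NAMES = {"smith"}

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Assumed channel model P(w | c): probability mass given to each kind of
# single edit that turns the observed token into the candidate.
EDIT_PROB = {"delete": 0.3, "transpose": 0.3, "replace": 0.2, "insert": 0.2}


def edits1(word):
    """Yield (candidate, edit_kind) pairs one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    for a, b in splits:
        if b:
            yield a + b[1:], "delete"
        if len(b) > 1:
            yield a + b[1] + b[0] + b[2:], "transpose"
        for ch in ALPHABET:
            if b:
                yield a + ch + b[1:], "replace"
            yield a + ch + b, "insert"


def is_misspelled(token):
    """Detection step: unknown to the lexicon and not a protected name."""
    word = token.lower()
    return word not in WORD_COUNTS and word not in PROTECTED_NAMES


def correct(token):
    """Correction step: return argmax over candidates of P(c) * P(w | c)."""
    if not is_misspelled(token):
        return token
    scores = {}
    for cand, kind in edits1(token.lower()):
        if cand in WORD_COUNTS:
            prior = WORD_COUNTS[cand] / TOTAL       # language model P(c)
            score = prior * EDIT_PROB[kind]         # times channel model P(w | c)
            scores[cand] = max(score, scores.get(cand, 0.0))
    if not scores:
        return token  # no plausible candidate; leave the token unchanged
    return max(scores, key=scores.get)
```

For example, under these assumptions correct("pateint") yields "patient", while correct("Smith") leaves the name untouched because it is in the protected set. A realistic system would estimate both probabilities from data and consider candidates more than one edit away.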

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Data Accuracy*
  • Electronic Health Records / organization & administration*
  • Machine Learning
  • Meaningful Use / organization & administration
  • Natural Language Processing*
  • Quality Assurance, Health Care / methods*
  • Vocabulary, Controlled*
  • Word Processing / methods*
  • Word Processing / standards