Using Neural Multi-task Learning to Extract Substance Abuse Information from Clinical Notes

AMIA Annu Symp Proc. 2018 Dec 5;2018:1395-1404. eCollection 2018.

Abstract

Substance abuse carries many negative health consequences. Detailed information about patients' substance abuse history is usually captured in free-text clinical notes. Automatic extraction of substance abuse information is vital to assess patients' risk for developing certain diseases and adverse outcomes. We introduce a novel neural architecture to automatically extract substance abuse information. The model, which uses multi-task learning, outperformed previous work and several baselines created using discrete models. The classifier obtained 0.88-0.95 F1 for detecting substance abuse status (current, none, past, unknown) on a withheld test set. Other substance abuse entities (amount, frequency, exposure history, quit history, and type) were also extracted with high performance. Our results demonstrate the feasibility of extracting substance abuse information with little annotated data. Additionally, we used the neural multi-task model to automatically annotate 59.7K notes from a different source. Manual review of a subset of these notes yielded 0.84-0.89 precision for substance abuse status.
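For readers unfamiliar with the reported metrics, the following is a minimal sketch of how per-class precision, recall, and F1 could be computed for the four status labels named in the abstract (current, none, past, unknown). The function name per_class_f1, the constant STATUS_LABELS, and the toy gold and predicted labels are illustrative assumptions; they are not the authors' evaluation code or data, and the abstract does not state how scores were aggregated.

```python
from collections import Counter

# Status labels as listed in the abstract; the identifier is hypothetical.
STATUS_LABELS = ["current", "none", "past", "unknown"]


def per_class_f1(gold, predicted, labels=STATUS_LABELS):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true label but was missed

    scores = {}
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy labels invented for illustration only (not data from the paper).
gold = ["current", "current", "none", "past"]
predicted = ["current", "none", "none", "past"]
scores = per_class_f1(gold, predicted)
```

Precision alone, as reported for the manually reviewed subset, corresponds to the "precision" entry here: it needs only a judgment of whether each predicted label is correct, which is presumably why it suits a review of automatically annotated notes.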

MeSH terms

  • Algorithms
  • Electronic Health Records*
  • Female
  • Humans
  • Information Storage and Retrieval / methods*
  • Machine Learning*
  • Male
  • Neural Networks, Computer*
  • Substance Abuse Detection / methods*
  • Substance-Related Disorders / diagnosis*