A semi-supervised approach for extracting TCM clinical terms based on feature words

BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):118. doi: 10.1186/s12911-020-1108-1.

Abstract

Background: A semi-supervised model is proposed for extracting Traditional Chinese Medicine (TCM) clinical terms using feature words.

Methods: The extraction model is based on BiLSTM-CRF and is combined with semi-supervised learning and a feature word set, which reduces the cost of manual annotation and improves extraction results.
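
The abstract does not specify how the semi-supervised learning and the feature word set interact. As a rough illustration only, the sketch below shows one common semi-supervised scheme, self-training, in which a feature word list filters the pseudo-labeled sentences. The function name `self_train`, the `train` callback, the confidence threshold, and the filtering rule are all assumptions made for this example and are not taken from the paper; in practice `train` would fit a BiLSTM-CRF tagger.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = List[str]                      # one token (or character) per item
Tagged = Tuple[Sentence, List[str]]       # tokens paired with BIO tags
Predictor = Callable[[Sentence], Tuple[List[str], float]]  # tags, confidence


def self_train(
    labeled: Sequence[Tagged],
    unlabeled: Sequence[Sentence],
    train: Callable[[List[Tagged]], Predictor],  # hypothetical trainer callback
    feature_words: Sequence[str],
    rounds: int = 3,
    threshold: float = 0.9,               # assumed confidence cut-off
) -> Tuple[Predictor, List[Tagged]]:
    """Generic self-training loop with a feature-word filter (illustrative)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        predict = train(labeled)
        accepted, remaining = [], []
        for sent in pool:
            tags, confidence = predict(sent)
            text = "".join(sent)
            has_feature = any(word in text for word in feature_words)
            # Keep a pseudo-labeled sentence only if the tagger is confident
            # AND the sentence contains at least one feature word.
            if confidence >= threshold and has_feature:
                accepted.append((sent, tags))
            else:
                remaining.append(sent)
        if not accepted:                  # nothing new to learn from; stop early
            break
        labeled.extend(accepted)
        pool = remaining
    return train(labeled), labeled
```

Under these assumptions, only sentences that the current model tags confidently and that contain a feature word are added to the training set, so the manually annotated seed set can stay small.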

Results: Experimental results show that the proposed model improves the extraction of five types of TCM clinical terms: traditional Chinese medicine, symptoms, patterns, diseases and formulas. The best F1 score reaches 78.70% on the test dataset.
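
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it at the entity level with exact span-and-type matching, which is a common convention in NER evaluation; whether the paper uses exactly this matching rule is an assumption. The function name `entity_f1` and the `(start, end, type)` tuple layout are hypothetical.

```python
from typing import Iterable, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, entity type)


def entity_f1(gold: Iterable[Entity], pred: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-type matching."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, not from the paper: one of two predictions matches one of two gold entities.
gold = [(0, 2, "symptom"), (3, 5, "formula")]
pred = [(0, 2, "symptom"), (6, 8, "disease")]
print(entity_f1(gold, pred))  # (0.5, 0.5, 0.5)
```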

Conclusions: This method can reduce the cost of manual labeling and improve named entity recognition (NER) results for TCM clinical terms.

Keywords: Clinical terms; Deep learning; NER; Semi-supervised; TCM.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Medicine, Chinese Traditional*
  • Supervised Machine Learning*