Concept encoding, which maps text spans to concepts in standard terminologies, is a critical component of clinical natural language processing (NLP) systems because it enables semantic interoperability with other clinical applications. Most clinical NLP systems adopt dictionary- or lexicon-based approaches, and concept encoding performance is often evaluated against a human-created gold standard built with reference to the most up-to-date standard terminologies available at the time of its creation. As medical science advances, standard terminologies and dictionaries evolve. However, it remains unknown whether dictionary updates affect the performance of concept encoding. In this study, we evaluated the annotation performance of two clinical NLP systems, cTAKES and MedXN, using updated dictionaries. Specifically, we compared their automatic annotation results with previously created manual gold standards. Our results show that dictionary updates change the annotations produced by clinical NLP systems and that gold standards require temporal management, which in turn calls for terminology management tools that support backward version compatibility so that gold standards can be updated.
Keywords: concept encoding; dictionary update; gold standards; natural language processing.