Using Deep Learning to Improve Phenotyping from Clinical Reports

Stud Health Technol Inform. 2022 Jun 6:290:282-286. doi: 10.3233/SHTI220079.

Abstract

With the development of clinical databases and the ubiquity of electronic health records (EHRs), physicians and researchers alike have access to an unprecedented amount of data. The complexity of the available data has also increased, since clinical reports are included and require frameworks with natural language processing capabilities to process them and extract information not found in other types of documents. In this work, we implement a data processing pipeline that performs phenotyping, disambiguation, negation and subject prediction on such reports. We compare it to an existing solution routinely used in a children's hospital with a special focus on genetic diseases. We show that by replacing components based on rules and pattern matching with components leveraging deep learning models and fine-tuned word embeddings, we obtain performance improvements of 7%, 10% and 27% in terms of F1 measure for each task. The solution we devised will help build more reliable decision support systems.
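The results above are reported as F1 measure, the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed for one report by comparing extracted phenotype mentions against a reference annotation. The function name `f1_score` and the toy phenotype labels are assumptions made for illustration only; they are not taken from the paper, and the authors' actual evaluation protocol may differ.

```python
def f1_score(gold: set, predicted: set) -> float:
    """Return the F1 measure of a predicted label set against a gold set."""
    # True positives are labels present in both the gold and predicted sets.
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case, avoiding division by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): phenotypes annotated by an expert vs. extracted by a pipeline.
gold = {"seizure", "microcephaly", "hypotonia"}
predicted = {"seizure", "hypotonia", "ataxia", "scoliosis"}

score = f1_score(gold, predicted)
print(f"F1 = {score:.3f}")
```

In this toy case two of the four predictions are correct (precision 0.5) and two of the three reference phenotypes are found (recall 2/3), so a single score summarizes both kinds of error.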

Keywords: Data Warehousing; Deep Learning; Natural Language Processing; Phenotype.

MeSH terms

  • Child
  • Databases, Factual
  • Deep Learning*
  • Humans
  • Natural Language Processing