Information on adverse drug reactions: Proof of principle for a structured database that allows customization of drug information

Int J Med Inform. 2020 Jan:133:103970. doi: 10.1016/j.ijmedinf.2019.103970. Epub 2019 Sep 16.

Abstract

Background: The drug information that patients most commonly request concerns potential adverse drug reactions (ADRs) of their drugs. Such information should be customizable to individual information needs. While approaches that automatically aggregate ADRs by text mining and store them in corresponding databases are well known, further efforts to map additional ADR information are sparse, yet crucial for customization. In a proof-of-principle (PoP) study, we developed a database format demonstrating that natural language processing can further structure ADR information in a way that facilitates customization.

Methods: We developed the database in a 3-step process: (1) initial ADR extraction, (2) mapping of additional ADR information, and (3) review process. ADRs of 10 frequently prescribed active ingredients were initially extracted from their Summary of Product Characteristics (SmPC) by text-mining processes and mapped to Medical Dictionary for Regulatory Activities (MedDRA) terms. To further structure ADR information, we mapped 7 additional ADR characteristics (i.e. frequency, organ class, seriousness, lay perceptibility, onset, duration, and management strategies) to individual ADRs. In a PoP study, the process steps were assessed and tested. Initial ADR extraction was assessed by measuring precision, recall, and F1-scores (i.e. harmonic mean of precision and recall). Mapping of additional ADR information was assessed considering pre-defined parameters (i.e. correctness, errors, and misses) regarding the mapped ADR characteristics.
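As an illustration of the structure described in the Methods, the following is a minimal sketch of how one ADR entry with the 7 additional characteristics might be represented and filtered for customization. The class name `AdrRecord`, the function `select_adrs`, all field names, and the placeholder entries are assumptions made for this example; they are not taken from the study's database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AdrRecord:
    """One ADR mapped to a MedDRA term plus the 7 additional characteristics.

    Field names are hypothetical; the abstract does not specify a schema.
    """
    meddra_term: str
    frequency: str = "frequency not known"  # default used when the SmPC states none
    organ_class: Optional[str] = None
    serious: Optional[bool] = None
    lay_perceptible: Optional[bool] = None
    onset: Optional[str] = None
    duration: Optional[str] = None
    management: Optional[str] = None


def select_adrs(
    records: List[AdrRecord],
    *,
    serious: Optional[bool] = None,
    lay_perceptible: Optional[bool] = None,
) -> List[AdrRecord]:
    """Return the records matching every filter that is not None."""
    selected = []
    for record in records:
        if serious is not None and record.serious != serious:
            continue
        if lay_perceptible is not None and record.lay_perceptible != lay_perceptible:
            continue
        selected.append(record)
    return selected


# Placeholder entries for illustration only, not real SmPC content.
records = [
    AdrRecord("placeholder reaction A", frequency="common",
              serious=False, lay_perceptible=True),
    AdrRecord("placeholder reaction B", serious=True, lay_perceptible=False),
]

# Example customization: show only ADRs a lay person could notice themselves.
perceptible = select_adrs(records, lay_perceptible=True)
```

Keeping each characteristic as its own field is what makes this kind of per-patient filtering possible; free-text ADR listings would not support it directly.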

Results: Overall, the SmPCs listed 393 ADRs, with an average of 39.3 ± 18.1 ADRs per SmPC. For initial ADR extraction, precision was 97.9% and recall was 93.2%, leading to an F1-score of 95.5%. Regarding mapping of additional ADR information, the frequency information of 28.6 ± 18.4 ADRs for each SmPC was correctly mapped (72.8%). Overall, 77 (20.6%) of the correctly extracted ADRs did not have a concise frequency stated in the SmPC and were consequently mapped with 'frequency not known'. Mapping of the remaining ADR characteristics did not result in noteworthy errors or misses.
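The reported F1-score follows directly from the stated precision and recall. A short sketch of the arithmetic, using only the figures given above (the function name `f1_score` is chosen for this example):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the initial ADR extraction.
f1 = f1_score(0.979, 0.932)
print(f"F1-score: {f1:.1%}")  # prints "F1-score: 95.5%"

# Average number of ADRs per SmPC: 393 ADRs across 10 SmPCs.
mean_adrs = 393 / 10
print(f"Mean ADRs per SmPC: {mean_adrs:.1f}")  # prints "Mean ADRs per SmPC: 39.3"
```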

Conclusion: ADR information can be automatically extracted and mapped to corresponding MedDRA terms. Additionally, ADR information can be further structured considering additional ADR characteristics to facilitate customization to individual patient needs.

Keywords: Adverse drug reactions; MedDRA; Natural language processing; Patient empowerment; Structured drug information; Summary of product characteristics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adverse Drug Reaction Reporting Systems*
  • Data Collection
  • Data Mining
  • Databases, Factual
  • Natural Language Processing