BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance

D'Oosterlinck, Karel; Remy, François; Deleu, Johannes; Demeester, Thomas; Develder, Chris; Zaporojets, Klim; Ghodsi, Aneiss; Ellershaw, Simon; Collins, Jack; Potts, Christopher

arXiv:2305.13395 (cs)

[Submitted on 22 May 2023 (v1), last revised 20 Oct 2023 (this version, v2)]

Abstract: Timely and accurate extraction of Adverse Drug Events (ADE) from biomedical literature is paramount for public safety, but involves slow and costly manual labor. We set out to improve drug safety monitoring (pharmacovigilance, PV) through the use of Natural Language Processing (NLP). We introduce BioDEX, a large-scale resource for Biomedical adverse Drug Event Extraction, rooted in the historical output of drug safety reporting in the U.S. BioDEX consists of 65k abstracts and 19k full-text biomedical papers with 256k associated document-level safety reports created by medical experts. The core features of these reports include the reported weight, age, and biological sex of a patient, a set of drugs taken by the patient, the drug dosages, the reactions experienced, and whether the reaction was life-threatening. In this work, we consider the task of predicting the core information of the report given its originating paper. We estimate human performance to be 72.0% F1, whereas our best model achieves 62.3% F1, indicating significant headroom on this task. We also begin to explore ways in which these models could help professional PV reviewers. Our code and data are available: this https URL.
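To make the task concrete, the following is a minimal sketch of how a document-level safety report and an F1 comparison between a predicted and a reference report might be represented. It is an illustration only, not the authors' implementation or their exact metric: the `SafetyReport` class, its field names, the `as_items` flattening, and the set-overlap `f1` function are all assumptions introduced here, and drug dosages are omitted for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class SafetyReport:
    """Hypothetical container for the core report fields named in the abstract."""
    weight: Optional[str] = None            # reported patient weight
    age: Optional[str] = None               # reported patient age
    sex: Optional[str] = None               # reported biological sex
    drugs: FrozenSet[str] = frozenset()     # drugs taken by the patient
    reactions: FrozenSet[str] = frozenset() # reactions experienced
    life_threatening: Optional[bool] = None # whether the reaction was life-threatening

    def as_items(self) -> Set[Tuple[str, str]]:
        """Flatten the report into (field, value) pairs, lowercased for matching."""
        items: Set[Tuple[str, str]] = set()
        for name in ("weight", "age", "sex", "life_threatening"):
            value = getattr(self, name)
            if value is not None:
                items.add((name, str(value).lower()))
        items |= {("drug", d.lower()) for d in self.drugs}
        items |= {("reaction", r.lower()) for r in self.reactions}
        return items


def f1(predicted: SafetyReport, gold: SafetyReport) -> float:
    """Set-overlap F1 between two reports (an assumed, simplified scoring rule)."""
    pred_items, gold_items = predicted.as_items(), gold.as_items()
    if not pred_items and not gold_items:
        return 1.0  # two empty reports agree trivially
    overlap = len(pred_items & gold_items)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_items)
    recall = overlap / len(gold_items)
    return 2 * precision * recall / (precision + recall)


# Illustrative, made-up values; not taken from the BioDEX data.
gold = SafetyReport(sex="female",
                    drugs=frozenset({"Ibuprofen"}),
                    reactions=frozenset({"Nausea", "Rash"}))
pred = SafetyReport(sex="female",
                    drugs=frozenset({"ibuprofen"}),
                    reactions=frozenset({"nausea"}))
score = f1(pred, gold)
```

In this sketch a prediction that misses one of two reactions keeps perfect precision but loses recall, which is the kind of gap an F1 score between model output and expert-written reports is meant to surface.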

Comments:	28 pages. EMNLP Findings 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.13395 [cs.CL]
	(or arXiv:2305.13395v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.13395

Submission history

From: Karel D'Oosterlinck
[v1] Mon, 22 May 2023 18:15:57 UTC (2,439 KB)
[v2] Fri, 20 Oct 2023 15:51:45 UTC (2,443 KB)