Biomedical relation extraction method based on ensemble learning and attention mechanism

BMC Bioinformatics. 2024 Oct 18;25(1):333. doi: 10.1186/s12859-024-05951-y.

Abstract

Background: Relation extraction (RE) is essential in biomedical research for uncovering complex semantic relationships between entities in text. Given the importance of RE in biomedical informatics and the growing volume of literature, there is an urgent need for computational models that can extract these relationships accurately and efficiently at scale.

Results: This paper proposes SARE, a novel approach that combines Stacking ensemble learning with attention mechanisms to improve biomedical relation extraction. By leveraging multiple pre-trained models, SARE demonstrates improved adaptability and robustness across diverse domains. The attention mechanisms enable the model to capture and use key information in the text more accurately. Compared with the original BERT variant and the domain-specific PubMedBERT model, SARE achieved performance improvements of 4.8, 8.7, and 0.8 percentage points on the PPI, DDI, and ChemProt datasets, respectively.
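
The general idea of combining several base models through attention can be illustrated with a small sketch. This is not the SARE architecture, which the abstract does not specify in detail; it only shows one plausible way to weight the class probabilities of several base models per example before taking a final decision. The function names `softmax` and `attention_stack`, the `query` vector, and the toy probabilities are all hypothetical. In a real Stacking setup, the `query` parameter would be learned by a meta-learner on held-out base-model predictions rather than fixed by hand.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def attention_stack(base_probs, query):
    """Combine base-model outputs with per-example attention weights.

    base_probs: shape (K, N, C), class probabilities from K base models
                for N examples and C relation classes.
    query:      shape (C,), hypothetical meta-learner parameter
                (assumed here; it would normally be trained).
    """
    scores = base_probs @ query            # (K, N): one score per model and example
    weights = softmax(scores, axis=0)      # (K, N): sums to 1 across the K models
    combined = (weights[:, :, None] * base_probs).sum(axis=0)  # (N, C)
    return combined, weights

# Toy probabilities standing in for three pre-trained base models (made up).
base_probs = np.array([
    [[0.70, 0.20, 0.10], [0.30, 0.40, 0.30]],
    [[0.60, 0.30, 0.10], [0.10, 0.80, 0.10]],
    [[0.50, 0.25, 0.25], [0.20, 0.50, 0.30]],
])
query = np.array([1.0, 2.0, 0.5])  # fixed by hand for illustration only

combined, weights = attention_stack(base_probs, query)
predicted = combined.argmax(axis=1)  # final relation class per example
```

Because the attention weights form a convex combination over the base models, each row of `combined` remains a valid probability distribution, and a zero `query` reduces the scheme to plain averaging.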

Conclusions: SARE offers a promising solution for improving the accuracy and efficiency of relation extraction tasks in biomedical research, facilitating advancements in biomedical informatics. The results suggest that combining ensemble learning with attention mechanisms is effective for extracting complex relationships from biomedical texts. Our code and data are publicly available at https://github.com/GS233/Biomedical.

Keywords: Attention mechanism; BERT; Biomedical relation extraction; Deep learning; Stacking.

MeSH terms

  • Algorithms
  • Biomedical Research / methods
  • Computational Biology / methods
  • Data Mining* / methods
  • Machine Learning*
  • Natural Language Processing
  • Semantics