Objective: We introduce Medical evidence Dependency (MD)-informed attention, a novel neuro-symbolic model for understanding free-text clinical trial publications that is both generalizable and interpretable.
Materials and methods: We trained one head in the multi-head self-attention model to attend to the Medical evidence Dependency (MD) and to pass linguistic and domain knowledge on to later layers (MD-informed). This MD-informed attention model was integrated into BioBERT and tested on 2 public machine reading comprehension benchmarks for clinical trial publications: Evidence Inference 2.0 and PubMedQA. We also curated a small set of recently published articles reporting randomized controlled trials on COVID-19 (coronavirus disease 2019), following the Evidence Inference 2.0 guidelines, to evaluate the model's robustness to unseen data.
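As an illustration of what supervising a single attention head can look like, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the MD annotation is given as a binary token-by-token matrix and that the head is trained with a cross-entropy loss between its attention rows and the normalized annotation rows; the loss form, the function names (`head_attention`, `md_attention_loss`), and the toy annotation are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def head_attention(q, k):
    """Scaled dot-product attention weights for one head (tokens x tokens)."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d), axis=-1)


def md_attention_loss(attn, md_target, eps=1e-9):
    """Hypothetical supervision loss for one head.

    Cross-entropy between each attention row and the row-normalized MD
    target. Rows with no annotated dependency are skipped.
    """
    row_sums = md_target.sum(axis=-1, keepdims=True)
    has_dep = row_sums.squeeze(-1) > 0
    if not has_dep.any():
        return 0.0
    target = md_target[has_dep] / row_sums[has_dep]
    ce = -(target * np.log(attn[has_dep] + eps)).sum(axis=-1)
    return float(ce.mean())


# Toy input: 5 tokens, head dimension 8 (random stand-ins for real projections).
rng = np.random.default_rng(0)
n_tokens, d_head = 5, 8
q = rng.normal(size=(n_tokens, d_head))
k = rng.normal(size=(n_tokens, d_head))
attn = head_attention(q, k)

# Hypothetical annotation: token 0 (e.g. an outcome mention) depends on
# tokens 2 and 3 (e.g. an intervention and a comparator mention).
md_target = np.zeros((n_tokens, n_tokens))
md_target[0, [2, 3]] = 1.0

loss = md_attention_loss(attn, md_target)
```

In a sketch like this, the loss would be added to the task loss for the one designated head only, leaving the remaining heads unconstrained; how the original model combines the two objectives is not stated in this abstract.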
Results: Integrating the MD-informed attention head improves BioBERT substantially on both benchmark tasks, with an increase of up to +30% in the F1 score, and achieves new state-of-the-art performance on Evidence Inference 2.0. On the unseen COVID-19 data, the model achieves 84% overall accuracy and an F1 score of 82%.
Conclusions: MD-informed attention empowers neural reading comprehension models with interpretability and generalizability via reusable domain knowledge. Its compositionality can benefit any transformer-based architecture for machine reading comprehension of free-text medical evidence.
Keywords: machine reading comprehension; medical evidence computing; natural language understanding; transformer.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: [email protected].