DACG: Dual Attention and Context Guidance model for radiology report generation

Wangyu Lang; Zhi Liu; Yijia Zhang

doi:10.1016/j.media.2024.103377

Med Image Anal. 2024 Oct 23:99:103377. doi: 10.1016/j.media.2024.103377. Online ahead of print.

Authors

Wangyu Lang¹, Zhi Liu¹, Yijia Zhang²

Affiliations

¹ School of Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China.
² School of Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China.

PMID: 39481215
DOI: 10.1016/j.media.2024.103377

Abstract

Medical images are an essential basis for radiologists writing radiology reports and greatly help subsequent clinical treatment. Automatic radiology report generation aims to reduce the report-writing burden on clinicians and has received increasing attention in recent years, becoming an important research topic. However, the task suffers from severe visual and textual data bias and from the difficulty of generating long text. First, abnormal regions account for only a small portion of a radiological image, and most radiology reports describe only normal findings. Second, generating longer and more accurate descriptive text remains a significant challenge. In this paper, we propose a new Dual Attention and Context Guidance (DACG) model to alleviate visual and textual data bias and to promote the generation of long text. We use a Dual Attention Module, comprising a Position Attention Block and a Channel Attention Block, to extract finer position and channel features from medical images, enhancing the image feature extraction ability of the encoder. We use a Context Guidance Module to integrate contextual information into the decoder and to supervise the generation of long text. Experimental results show that the proposed model achieves state-of-the-art performance on the most commonly used IU X-ray and MIMIC-CXR datasets. Further analysis also indicates that the model improves reports through more accurate anomaly detection and more detailed descriptions. The source code is available at https://github.com/LangWY/DACG.
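To make the idea of position and channel attention concrete, the following is a minimal, dependency-free sketch of generic self-attention applied along each of the two axes of a feature map. It is not the authors' implementation (see the repository above for that). The abstract does not specify the block internals, so everything here is an assumption: the function names are hypothetical, the learned projection layers and scaling factors that a real model would use are omitted, and the two branches are fused by a simple element-wise sum.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def position_attention(feats):
    """feats is C x N: C channels, N flattened spatial positions.

    Every position is re-expressed as a weighted sum of all positions,
    with weights given by pairwise position similarity (assumed design).
    """
    by_pos = transpose(feats)                    # N x C
    energy = matmul(by_pos, feats)               # N x N position similarities
    attn = [softmax(r) for r in energy]          # each row sums to 1
    mixed = transpose(matmul(attn, by_pos))      # back to C x N
    return add(mixed, feats)                     # residual connection

def channel_attention(feats):
    """Same idea along the channel axis: a C x C attention map."""
    energy = matmul(feats, transpose(feats))     # C x C channel similarities
    attn = [softmax(r) for r in energy]
    mixed = matmul(attn, feats)                  # C x N
    return add(mixed, feats)                     # residual connection

def dual_attention(feats):
    """Fuse both branches by element-wise sum (an assumed fusion rule)."""
    return add(position_attention(feats), channel_attention(feats))

# Toy feature map: 2 channels, 3 spatial positions.
features = [[1.0, 0.0, 2.0],
            [0.5, 1.0, 0.0]]
refined = dual_attention(features)
print(len(refined), len(refined[0]))  # same C x N shape as the input
```

The output keeps the shape of the input feature map, so a block of this kind can sit between an image encoder and a report decoder without changing the interface between them.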

Keywords: Context guidance; Dual attention; Radiology report generation.