Accurate segmentation of polyp regions in gastrointestinal endoscopic images is pivotal for diagnosis and treatment. Despite recent advances, challenges persist, such as accurately segmenting small polyps and maintaining accuracy when polyps closely resemble surrounding tissue. Recent studies show the effectiveness of the pyramid vision transformer (PVT) in capturing global context, yet it may lack fine-grained detail. Conversely, U-Net excels at semantic feature extraction. Hence, we propose the bilateral fusion enhanced network (BFE-Net) to address these challenges. Our model integrates U-Net and PVT features via a deep feature enhancement fusion module (FEF) and an attention decoder module (AD). Experimental results demonstrate significant improvements, validating the model's effectiveness across multiple datasets and modalities and promising advances in gastrointestinal polyp diagnosis and treatment.
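The abstract does not describe the internals of the FEF module, so the following is only a toy, pure-Python illustration of the general idea of bilateral feature fusion: gating each channel between a global (transformer-branch) feature and a local (CNN-branch) feature. The function names, the gating scheme, and the parameters are all hypothetical and not taken from the paper.

```python
import math

def sigmoid(x):
    # Standard logistic function, used here as a per-channel gate.
    return 1.0 / (1.0 + math.exp(-x))

def fuse_features(global_feats, local_feats, gate_params):
    """Hypothetical bilateral fusion sketch (not the paper's FEF):
    each output channel is a convex combination of the global-context
    feature and the local-detail feature, weighted by a learned gate."""
    fused = []
    for g, l, w in zip(global_feats, local_feats, gate_params):
        a = sigmoid(w)  # gate in (0, 1): 1 favors the global branch
        fused.append(a * g + (1.0 - a) * l)
    return fused

# Toy usage with three channels: a neutral gate (w=0) averages the two
# branches, while large |w| selects one branch almost exclusively.
print(fuse_features([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0.0, 10.0, -10.0]))
```

In a real network the gate parameters would be produced by learned layers conditioned on the features themselves; this sketch fixes them as scalars purely to show the combination rule.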
© 2024 Optica Publishing Group.