Multi-resolution visual Mamba with multi-directional selective mechanism for retinal disease detection

Qiankun Zuo; Zhengkun Shi; Bo Liu; Na Ping; Jiangtao Wang; Xi Cheng; Kexin Zhang; Jia Guo; Yixian Wu; Jin Hong

doi:10.3389/fcell.2024.1484880

Front Cell Dev Biol. 2024 Oct 11:12:1484880. doi: 10.3389/fcell.2024.1484880. eCollection 2024.

Authors

Qiankun Zuo¹ ² ³, Zhengkun Shi², Bo Liu⁴, Na Ping², Jiangtao Wang², Xi Cheng², Kexin Zhang², Jia Guo¹ ² ³, Yixian Wu⁵, Jin Hong⁶

Affiliations

¹ Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics, Wuhan, China.
² School of Information Engineering, Hubei University of Economics, Wuhan, China.
³ Hubei Internet Finance Information Engineering Technology Research Center, Hubei University of Economics, Wuhan, China.
⁴ School of Mathematics and Computer Science, Nanchang University, Nanchang, China.
⁵ School of Mechanical Engineering, Beijing Institute of Petrochemical Technology, Beijing, China.
⁶ School of Information Engineering, Nanchang University, Nanchang, China.

Abstract

Introduction: Retinal diseases significantly impact patients' quality of life and increase social medical costs. Optical coherence tomography (OCT) offers high-resolution imaging for precise detection and monitoring of these conditions. While deep learning techniques have been employed to extract features from OCT images for classification, convolutional neural networks (CNNs) often fail to capture global context due to their focus on local receptive fields. Transformer-based methods, on the other hand, suffer from quadratic complexity when handling long-range dependencies.

Methods: To overcome these limitations, we introduce the Multi-Resolution Visual Mamba (MRVM) model, which addresses long-range dependencies with linear computational complexity for OCT image classification. The MRVM model initially employs convolution to extract local features and subsequently utilizes the retinal Mamba to capture global dependencies. By integrating multi-scale global features, the MRVM enhances classification accuracy and overall performance. Additionally, the multi-directional selection mechanism (MSM) within the retinal Mamba improves feature extraction by concentrating on various directions, thereby better capturing complex, orientation-specific retinal patterns.
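The idea of scanning a feature map along several directions with a linear-time recurrence can be illustrated with a minimal sketch. This is not the authors' implementation: the four scan orders, the fixed `decay` parameter, the averaging step, and all function names are assumptions chosen for illustration. A real selective state-space model makes its recurrence parameters depend on the input, which this toy version does not do.

```python
import numpy as np


def directional_sequences(feature_map):
    """Flatten an (H, W, C) feature map into four 1-D token sequences.

    The four scan orders are an illustrative assumption, not taken
    from the paper.
    """
    h, w, c = feature_map.shape
    row_major = feature_map.reshape(h * w, c)
    col_major = feature_map.transpose(1, 0, 2).reshape(h * w, c)
    return {
        "left_to_right": row_major,
        "right_to_left": row_major[::-1],
        "top_to_bottom": col_major,
        "bottom_to_top": col_major[::-1],
    }


def linear_scan(tokens, decay=0.9):
    """Toy recurrence h_t = decay * h_{t-1} + x_t.

    One pass over the sequence, so the cost grows linearly with its
    length. The fixed decay is a simplification (assumption).
    """
    state = np.zeros(tokens.shape[1])
    outputs = np.empty_like(tokens, dtype=float)
    for t, x in enumerate(tokens):
        state = decay * state + x
        outputs[t] = state
    return outputs


def multi_directional_scan(feature_map, decay=0.9):
    """Scan in four directions, map each result back to (H, W, C), average."""
    h, w, c = feature_map.shape
    seqs = directional_sequences(feature_map)
    merged = np.zeros((h, w, c))

    # Row-major scans: undo the reversal (if any), then reshape.
    merged += linear_scan(seqs["left_to_right"], decay).reshape(h, w, c)
    merged += linear_scan(seqs["right_to_left"], decay)[::-1].reshape(h, w, c)

    # Column-major scans: reshape to (W, H, C), then swap axes back.
    out = linear_scan(seqs["top_to_bottom"], decay)
    merged += out.reshape(w, h, c).transpose(1, 0, 2)
    out = linear_scan(seqs["bottom_to_top"], decay)[::-1]
    merged += out.reshape(w, h, c).transpose(1, 0, 2)

    return merged / 4.0
```

Each position in the output aggregates context from tokens that precede it in each of the four scan orders, which is one simple way a model could pick up orientation-specific structure without the pairwise comparisons of self-attention.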

Results: Experimental results demonstrate that the MRVM model excels in differentiating retinal images with various lesions, achieving superior detection accuracy compared to traditional methods, with overall accuracies of 98.98% and 96.21% on two public datasets, respectively.

Discussion: This approach offers a novel perspective for accurately identifying retinal diseases and could contribute to the development of more robust artificial intelligence algorithms and recognition systems for medical image-assisted diagnosis.

Keywords: global–local feature; multi-directional selective learning; multi-scale fusion; retinal disease detection; state-space model.