MultiPhys: Heterogeneous Fusion of Mamba and Transformer for Video-Based Multi-Task Physiological Measurement

Sensors (Basel). 2024 Dec 27;25(1):100. doi: 10.3390/s25010100.

Abstract

Because it requires no physical contact, remote photoplethysmography (rPPG) has attracted widespread attention in recent years and has been widely applied to remote physiological measurement. However, most existing rPPG models cannot estimate multiple physiological signals simultaneously, and the few available multi-task models are limited by their single-model architectures. To address these problems, this study proposes MultiPhys, which is built on a heterogeneous network fusion approach. Specifically, a Convolutional Neural Network (CNN) quickly extracts local features in the early stage, a transformer captures global context and long-distance dependencies, and Mamba compensates for the transformer's deficiencies, reducing the computational complexity and improving the accuracy of the model. Additionally, a gate performs feature selection, separating the features relevant to each physiological indicator. Finally, the selected features are passed to task-specific heads, which estimate the physiological indicators. Experiments on three datasets show that MultiPhys has superior performance in handling multiple tasks. The results of cross-dataset and hyper-parameter sensitivity tests also verify its generalization ability and robustness, respectively. MultiPhys can be considered an effective solution for remote physiological estimation, thus promoting the development of this field.
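The gating and task-head stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the gate's form, so the sketch assumes a per-task sigmoid gate applied element-wise to a shared fused feature vector, followed by one linear head per task. The class name `GatedMultiTaskHead`, the task labels `"hr"` and `"rr"`, the feature dimension, and the random weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_linear(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


class GatedMultiTaskHead:
    """Hypothetical gate-plus-heads stage: one sigmoid gate and one linear head per task."""

    def __init__(self, feature_dim, tasks, seed=0):
        rng = random.Random(seed)
        self.tasks = list(tasks)
        # Assumption: each task has its own gate over the shared features.
        self.gates = {t: init_linear(rng, feature_dim, feature_dim) for t in self.tasks}
        # Assumption: each task head is a single linear unit producing one scalar.
        self.heads = {t: init_linear(rng, feature_dim, 1) for t in self.tasks}

    def gate_values(self, task, features):
        """Per-feature gate activations in (0, 1) for the given task."""
        w, b = self.gates[task]
        return [sigmoid(z) for z in linear(w, b, features)]

    def forward(self, features):
        """Gate the shared features for each task, then apply that task's head."""
        outputs = {}
        for t in self.tasks:
            g = self.gate_values(t, features)
            selected = [gi * fi for gi, fi in zip(g, features)]  # soft feature selection
            w, b = self.heads[t]
            outputs[t] = linear(w, b, selected)[0]
        return outputs


# Stand-in for the fused CNN/transformer/Mamba features of one video clip.
fused = [0.2, -1.0, 0.5, 0.3]
model = GatedMultiTaskHead(feature_dim=4, tasks=["hr", "rr"])
preds = model.forward(fused)
print(preds)
```

Under these assumptions, each task sees its own soft-masked view of the same shared features, so a feature that helps one indicator can be suppressed for another without duplicating the backbone.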

Keywords: deep learning; multi-task physiological measurement; network fusion; remote photoplethysmography.

MeSH terms

  • Algorithms
  • Humans
  • Neural Networks, Computer*
  • Photoplethysmography* / methods
  • Signal Processing, Computer-Assisted
  • Video Recording / methods

Grants and funding

This research received no external funding.