Benchmarking Micro-action Recognition: Dataset, Methods, and Applications

Guo, Dan; Li, Kun; Hu, Bin; Zhang, Yan; Wang, Meng

doi:10.1109/TCSVT.2024.3358415

Abstract: Micro-actions are imperceptible non-verbal behaviours characterised by low-intensity movement. They offer insights into the feelings and intentions of individuals and are important for human-oriented applications such as emotion recognition and psychological assessment. However, identifying, differentiating, and understanding micro-actions is challenging because these subtle behaviours are hard to perceive and hard to access in everyday life. In this study, we collect a new micro-action dataset, Micro-action-52 (MA-52), and propose a benchmark named micro-action network (MANet) for the micro-action recognition (MAR) task. Uniquely, MA-52 provides a whole-body perspective that includes gestures and upper- and lower-limb movements, aiming to reveal comprehensive micro-action cues. In detail, MA-52 contains 52 micro-action categories along with seven body part labels, and encompasses a full array of realistic and natural micro-actions, comprising 205 participants and 22,422 video instances collected from psychological interviews. Based on the proposed dataset, we assess MANet and nine other prevalent action recognition methods. MANet incorporates squeeze-and-excitation (SE) and the temporal shift module (TSM) into the ResNet architecture to model the spatiotemporal characteristics of micro-actions. A joint-embedding loss is then designed for semantic matching between videos and action labels; the loss is used to better distinguish between visually similar yet distinct micro-action categories. An extended application to emotion recognition demonstrates one important use of the proposed dataset and method. In future work, human behaviour, emotion, and psychological assessment will be explored in more depth. The dataset and source code are released at this https URL.
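As a rough illustration of the two building blocks the abstract names, the sketch below shows a temporal shift over per-frame channel features and a squeeze-and-excitation gate in plain Python. It is not the released MANet code: the function names, the `fold_div` parameter, the weight matrices `w1` and `w2`, and the simplification of each frame to a flat list of channel values (with pooling over time standing in for spatial pooling) are all assumptions made for illustration.

```python
import math


def temporal_shift(frames, fold_div=8):
    """Shift a fraction of the channels along the time axis, zero-padded.

    frames: list of T frames, each a list of C per-channel values.
    The first C // fold_div channels read from the next frame (t + 1),
    the next C // fold_div channels read from the previous frame (t - 1),
    and the remaining channels are left unchanged.
    """
    t_len, c_len = len(frames), len(frames[0])
    fold = c_len // fold_div
    out = [[0.0] * c_len for _ in range(t_len)]
    for t in range(t_len):
        for c in range(c_len):
            if c < fold:
                src = t + 1      # value comes from the following frame
            elif c < 2 * fold:
                src = t - 1      # value comes from the preceding frame
            else:
                src = t          # channel is not shifted
            if 0 <= src < t_len:
                out[t][c] = frames[src][c]
    return out


def squeeze_excite(frames, w1, w2):
    """Rescale each channel by a gate computed from its pooled value.

    w1: hidden x C weight matrix, w2: C x hidden weight matrix
    (hypothetical, bias-free weights used only for this sketch).
    """
    t_len, c_len = len(frames), len(frames[0])
    # Squeeze: average every channel over all frames.
    squeezed = [sum(f[c] for f in frames) / t_len for c in range(c_len)]
    # Excitation: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gate = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]
    # Scale: multiply every channel by its gate value.
    return [[f[c] * gate[c] for c in range(c_len)] for f in frames]


frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
shifted = temporal_shift(frames, fold_div=4)
gated = squeeze_excite(frames, w1=[[0.0] * 4], w2=[[0.0]] * 4)
```

With `fold_div=4` and four channels, one channel is shifted in each direction and two stay in place. With all-zero weights the gate is 0.5 for every channel, so `gated` simply halves the input; trained weights would instead emphasise some channels over others.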

Comments:	Accepted by IEEE Transactions on Circuits and Systems for Video Technology
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.05234 [cs.CV]
	(or arXiv:2403.05234v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.05234
Related DOI:	https://doi.org/10.1109/TCSVT.2024.3358415
