Multigranular visual-semantic embedding for cloth-changing person re-identification

Z Gao, H Wei, W Guan, W Nie, M Liu… - Proceedings of the 30th …, 2022 - dl.acm.org
To date, only a few works have focused on the cloth-changing person re-identification (ReID) task, and because it is very difficult to extract generalized and robust features for representing people in different clothes, their performance needs improvement. Moreover, visual-semantic information is also often ignored. To solve these issues, in this work a novel multigranular visual-semantic embedding algorithm (MVSE) is proposed for cloth-changing person ReID, in which visual-semantic information and human attributes are embedded into the network so that generalized features of human appearance can be learned to effectively address the cloth-changing problem. Specifically, to fully represent a person whose clothing changes, a multigranular feature representation scheme (MGR) is employed to adaptively extract multilevel, multigranular feature information, and a cloth desensitization network (CDN) is then designed to improve feature robustness across clothing changes by fully exploiting high-level human attributes. Moreover, to further handle pose changes and occlusion under different camera perspectives, a partially semantically aligned network (PSA) is proposed to obtain visual-semantic information used to align the human attributes. Most importantly, these three modules are jointly optimized in a unified framework. Extensive experimental results on four cloth-changing person ReID datasets demonstrate that MVSE extracts highly robust feature representations of cloth-changing persons and outperforms state-of-the-art cloth-changing person ReID approaches.
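The abstract only names the MGR module without detailing its partitioning; a common convention in part-based ReID is to pool a spatial feature map globally and over progressively finer horizontal stripes. The sketch below illustrates that idea under this assumption — the function name, the granularity choices `(1, 2, 4)`, and the toy scalar feature map are all illustrative, not taken from the paper.

```python
# Hedged sketch of multigranular pooling for part-based ReID features.
# The stripe-based horizontal partition is an assumed convention; the
# paper's actual MGR design is not specified in the abstract.

def _avg(values):
    """Mean of a flat list of numbers."""
    return sum(values) / len(values)

def multigranular_pool(feature_map, granularities=(1, 2, 4)):
    """feature_map: an H x W grid of scalar activations (list of lists).
    For each granularity g, split the map into g horizontal stripes and
    average-pool each stripe, yielding 1 global + 2 half-body + 4
    quarter-body part descriptors with the default granularities."""
    H = len(feature_map)
    descriptors = []
    for g in granularities:
        stripe_h = H // g  # assumes H is divisible by each granularity
        for p in range(g):
            rows = feature_map[p * stripe_h:(p + 1) * stripe_h]
            descriptors.append(_avg([v for row in rows for v in row]))
    return descriptors

# Toy 8x3 map whose rows hold their own row index as the activation.
fmap = [[float(r)] * 3 for r in range(8)]
desc = multigranular_pool(fmap)  # 1 + 2 + 4 = 7 part descriptors
```

Coarse parts capture holistic appearance while fine parts isolate local, clothing-insensitive cues (head, shoes), which is the usual motivation for combining granularities.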