University of Southampton Institutional Repository

TriGait: hybrid fusion strategy for multimodal alignment and integration in gait recognition

Sun, Yan
0e11df25-8ee0-4a0f-af6a-ae630267487a
Feng, Xueling
640a3b36-0b13-4fd3-bcd8-fd9122a69077
Liu, Xiaolei
3a0deecb-e3dc-4edc-a65a-5bfa6ccb7c80
Ma, Liyan
d1529371-bcdf-4c01-a237-f2a0282814ae
Hu, Long
21b0a54b-72cc-4853-91e2-59e172178494
Nixon, Mark
2b5b9804-5a81-462a-82e6-92ee5fa74e12

Sun, Yan, Feng, Xueling, Liu, Xiaolei, Ma, Liyan, Hu, Long and Nixon, Mark (2024) TriGait: hybrid fusion strategy for multimodal alignment and integration in gait recognition. IEEE Transactions on Biometrics, Behavior, and Identity Science, [TBIOM-2024-02-0015]. (doi:10.1109/TBIOM.2024.3435046).

Record type: Article

Abstract

Due to the inherent limitations of single modalities, multimodal fusion has become increasingly popular in many computer vision fields, as it leverages the complementary advantages of unimodal methods. As an emerging biometric technology with great application potential, gait recognition faces similar challenges. The prevailing silhouette-based and skeleton-based gait recognition methods each have limitations: the former captures appearance while neglecting structural details, and the latter the opposite. Multimodal gait recognition, which combines silhouette and skeleton, promises more robust predictions. However, it is essential yet difficult to explore the implicit interaction between dense pixels and discrete coordinate points. Most existing multimodal gait recognition methods simply concatenate features from the silhouette and skeleton and do not fully exploit the complementarity between them. This paper presents a hybrid fusion strategy called TriGait, a three-branch model that thoroughly explores the interaction and complementarity of the two modalities. To address data heterogeneity and exploit the mutual information between the two modalities, we propose a cross-modal token generator (CMTG) within a fusion branch that aligns and fuses their low-level features. TriGait also has two further branches that extract high-level semantic information from the silhouette and skeleton. By combining low-level correlation information with high-level semantic information, TriGait provides a comprehensive and discriminative representation of a subject's gait. Extensive experiments on CASIA-B, Gait3D and OUMVLP demonstrate the effectiveness of TriGait. Remarkably, TriGait achieves rank-1 mean accuracies of 96.6%, 61.4% and 91.1% on CASIA-B, Gait3D and OUMVLP respectively, outperforming state-of-the-art methods. The source code will be available at: https://github.com/YanSun-github/TriGait/.
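The three-branch design described in the abstract can be sketched schematically: two unimodal branches produce high-level embeddings, while a fusion branch mixes the low-level features of both modalities into shared tokens before everything is concatenated. The following NumPy toy is an illustrative sketch only, not the authors' implementation; all shapes, names, and the random-weight "attention" standing in for the learned CMTG are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes (not the paper's): T frames, with pooled per-frame
# features from each modality's low-level encoder.
T, D = 8, 16
sil_feat = rng.standard_normal((T, D))  # low-level silhouette features
ske_feat = rng.standard_normal((T, D))  # low-level skeleton features

def cross_modal_tokens(a, b, n_tokens=4):
    """Toy stand-in for the CMTG: pool both modalities into one aligned
    token set and summarise it with softmax attention over query vectors.
    Queries are random here; in the real model they would be learned."""
    d = a.shape[1]
    queries = rng.standard_normal((n_tokens, d))
    joint = np.concatenate([a, b], axis=0)        # (2T, d) shared token pool
    scores = queries @ joint.T / np.sqrt(d)       # (n_tokens, 2T)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)       # rows are softmax weights
    return attn @ joint                           # (n_tokens, d) fused tokens

# Three branches: two unimodal high-level embeddings plus the fusion branch.
sil_embed = sil_feat.mean(axis=0)                 # high-level silhouette code
ske_embed = ske_feat.mean(axis=0)                 # high-level skeleton code
fused = cross_modal_tokens(sil_feat, ske_feat).reshape(-1)

# Final gait representation combines low-level correlation information
# (fused tokens) with high-level semantic information from each branch.
gait_repr = np.concatenate([sil_embed, ske_embed, fused])
print(gait_repr.shape)  # (D + D + n_tokens*D,) = (96,)
```

The point of the sketch is only the data flow: heterogeneous inputs are aligned into a common token space before fusion, rather than being concatenated at the feature level as in earlier multimodal methods.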

Text
Final_TriGait__Hybrid_Fusion_Strategy_for_Multimodal_Alignment_and_Integration_in_Gait_Recognition - Accepted Manuscript
Download (2MB)

More information

Accepted/In Press date: 20 July 2024
e-pub ahead of print date: 29 July 2024

Identifiers

Local EPrints ID: 492858
URI: http://eprints.soton.ac.uk/id/eprint/492858
ISSN: 2637-6407
PURE UUID: 0fca8104-568b-4e5a-87f9-ee5ce7a0b15f
ORCID for Mark Nixon: orcid.org/0000-0002-9174-5934

Catalogue record

Date deposited: 16 Aug 2024 16:37
Last modified: 17 Aug 2024 01:32


Contributors

Author: Yan Sun
Author: Xueling Feng
Author: Xiaolei Liu
Author: Liyan Ma
Author: Long Hu
Author: Mark Nixon


