BigGait: Learning Gait Representation You Want by Large Vision Models

22 Mar 2024 | Dingqiang Ye, Chao Fan, Jingze Ma, Xiaoming Liu, and Shiqi Yu
This paper proposes BigGait, a novel gait recognition framework that leverages large vision models (LVMs) to learn gait representations without relying on task-specific supervision. Traditional gait recognition methods depend on supervised learning to generate explicit gait representations such as silhouette sequences, which are costly to annotate and prone to error accumulation. In contrast, BigGait uses task-agnostic LVMs to generate implicit gait representations, offering a more efficient and practical approach.

BigGait consists of three main components: an upstream model for feature extraction, a central Gait Representation Extractor (GRE) that transforms those features into gait representations, and a downstream model for gait metric learning. The GRE comprises three branches: a mask branch for background removal, an appearance branch for feature transformation, and a denoising branch for noise reduction. Together, these branches extract gait representations that are robust to gait-irrelevant noise.

Experiments on the CCPG, CASIA-B*, and SUSTech1K datasets show that BigGait significantly outperforms previous methods in both within-domain and cross-domain tasks, demonstrating its adaptability to different gait conditions. The paper also discusses challenges and future directions for LVM-based gait recognition, including the interpretability and purity of gait representations. By leveraging the generalizability of LVMs and avoiding the need for task-specific supervision, BigGait offers a promising route toward learning next-generation gait representations.

The source code is available at https://github.com/ShiqiYu/OpenGait.
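To make the GRE's branch structure concrete, the following is a minimal NumPy sketch of how a mask branch and an appearance branch could operate on upstream features. All names, dimensions, and the 1x1 linear projections here are illustrative assumptions, not the paper's actual implementation, and the denoising branch is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical upstream LVM features for one frame: C channels on an H x W grid.
C, H, W = 16, 8, 4
feats = rng.normal(size=(C, H, W))

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Mask branch (illustrative): a 1x1 "convolution" (linear map over channels)
# scoring background vs. foreground at each location, followed by a softmax.
W_mask = rng.normal(size=(2, C))
mask_logits = np.einsum('kc,chw->khw', W_mask, feats)
fg_mask = softmax(mask_logits, axis=0)[1]           # (H, W), values in (0, 1)

# Appearance branch (illustrative): project the mask-weighted features into a
# lower-dimensional per-location gait representation.
D = 8
W_app = rng.normal(size=(D, C))
masked = feats * fg_mask                            # background suppressed
gait_repr = np.einsum('dc,chw->dhw', W_app, masked) # (D, H, W)

print(fg_mask.shape, gait_repr.shape)
```

The point of the sketch is the data flow: the mask branch gates out background responses before the appearance branch projects what remains, so the downstream metric-learning model sees a representation dominated by the walking subject.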