Understanding BigGait: Learning Gait Representation You Want by Large Vision Models

**BigGait: Learning Gait Representation You Want by Large Vision Models**

This paper addresses gait recognition, a critical remote identification technology, by proposing a framework called BigGait. Traditional gait recognition methods rely heavily on task-specific supervised learning, which introduces high annotation costs and potential errors. To overcome these limitations, BigGait leverages the all-purpose knowledge produced by Large Vision Models (LVMs) to extract implicit gait representations without requiring explicit supervision.

**Key Contributions:**

1. **BigGait Framework:** A gait recognition framework that transforms all-purpose knowledge from LVMs into effective gait representations.
2. **Gait Representation Extractor (GRE):** A module with three branches (Mask, Appearance, and Denoising) that remove background noise, transform features, and refine representations, respectively.
3. **Performance:** BigGait outperforms existing methods in both within-domain and cross-domain tasks on CCPG, CASIA-B*, and SUSTech1K.
4. **Challenges and Future Directions:** The paper discusses open challenges in the interpretability and purity of the learned representations and suggests future research directions.

**Methodology:**

- **Upstream Model:** DINOv2, a self-supervised LVM, extracts all-purpose features.
- **Downstream Model:** An adjusted GaitBase performs gait metric learning.
- **GRE Module:** Three branches handle background removal, feature transformation, and feature refinement.
- **Loss Functions:** Recognition losses are combined with a mask reconstruction loss, a smoothness loss, and a diversity loss to optimize the representation.

**Experiments:**

- **Datasets:** CCPG, CASIA-B*, and SUSTech1K.
- **Results:** BigGait outperforms both video-based ReID methods and silhouette-based methods.
- **Ablation Study:** Evaluates the effectiveness of each branch and of the Pad-and-Resize strategy.

**Conclusion:** BigGait provides a practical paradigm for learning next-generation gait representations, leveraging LVMs to reduce annotation costs and improve robustness to gait-irrelevant noise. The work highlights the potential of LVMs in gait recognition and opens new avenues for future research.
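The summary names a Pad-and-Resize strategy but does not describe it. The sketch below shows one plausible reading, assuming the idea is to pad a person crop to the target aspect ratio before resizing so the body is not stretched. The function names, the zero fill value, and the nearest-neighbour resize are illustrative assumptions, not details taken from the paper.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def pad_to_aspect(img: Image, target_h: int, target_w: int, fill: float = 0.0) -> Image:
    """Pad symmetrically until height:width matches target_h:target_w (assumed behaviour)."""
    h, w = len(img), len(img[0])
    if h * target_w > w * target_h:
        # Crop is too narrow for the target ratio: add columns left and right.
        new_w = -(-h * target_w // target_h)  # ceiling division
        left = (new_w - w) // 2
        right = new_w - w - left
        return [[fill] * left + list(row) + [fill] * right for row in img]
    # Crop is too wide (or already matches): add rows on top and bottom.
    new_h = -(-w * target_h // target_w)  # ceiling division
    top = (new_h - h) // 2
    bottom = new_h - h - top
    pad_rows = lambda n: [[fill] * w for _ in range(n)]
    return pad_rows(top) + [list(row) for row in img] + pad_rows(bottom)


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resize; a real pipeline would likely use bilinear or bicubic."""
    h, w = len(img), len(img[0])
    return [[img[(i * h) // out_h][(j * w) // out_w] for j in range(out_w)]
            for i in range(out_h)]


def pad_and_resize(img: Image, out_h: int, out_w: int) -> Image:
    """Pad to the output aspect ratio first, then resize to the output size."""
    return resize_nearest(pad_to_aspect(img, out_h, out_w), out_h, out_w)


if __name__ == "__main__":
    square = [[1.0] * 4 for _ in range(4)]  # toy 4x4 "person crop"
    for row in pad_and_resize(square, 4, 2):
        print(row)
```

Padding before resizing keeps the subject's proportions intact, whereas a direct resize of the 4x4 crop to 4x2 would squeeze it horizontally.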

BigGait: Learning Gait Representation You Want by Large Vision Models

22 Mar 2024 | Dingqiang Ye1,2*, Chao Fan1,2*, Jingzhe Ma1,2, Xiaoming Liu3, and Shiqi Yu1,2†
