Transformer for Object Re-Identification: A Survey

13 Jan 2024 | Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du
This survey reviews and analyzes Transformer-based object re-identification (Re-ID). It groups existing work into four main areas: image/video-based Re-ID, Re-ID with limited data or annotations, cross-modal Re-ID, and special Re-ID scenarios. The paper highlights the advantages of Transformers in addressing the challenges that arise in these areas, including complex backgrounds, occlusions, and diverse perspectives. It also proposes a new Transformer baseline for unsupervised Re-ID, UntransReID, which achieves state-of-the-art performance on both single-modal and cross-modal tasks.
The survey also covers animal Re-ID, for which the authors develop a standardized benchmark to evaluate how well Transformers apply. It compares Transformers with CNNs in terms of network architecture, modeling capabilities, scalability, flexibility, and special properties, and it explores the use of Transformers in cross-modal Re-ID, including visible-infrared, text-image, and sketch-image scenarios. The paper discusses open issues in the era of large foundation models and emphasizes the potential of Transformers for future research. It concludes that Transformers have shown significant potential in Re-ID tasks, outperforming CNN-based methods in many aspects.