2024 | Zhiqi Pang, Chunyu Wang, Lingling Zhao, Yang Liu, and Gaurav Sharma
This paper addresses the challenging task of visible-infrared (VI) person re-identification (ReID), which involves matching images of the same person captured in different modalities. Unlike previous methods, which rely on intra-modality initialization and cross-modality instance selection, the proposed Cross-Modality Hierarchical Clustering and Refinement (CHCR) method is fully unsupervised and requires neither manual identity annotations nor intra-modality initialization. CHCR consists of three main components: a cross-modality clustering baseline, cross-modality hierarchical clustering (CHC), and inter-channel pseudo-label refinement (IPR). The baseline reduces the modality gap by converting visible images to grayscale and applying linear and gamma transformations. CHC encourages inter-modality positive samples to be grouped into the same cluster, while IPR refines pseudo-labels by checking the consistency of clustering results across the three RGB channels of visible images. Extensive experiments on two standard benchmarks, SYSU-MM01 and RegDB, show that CHCR outperforms existing unsupervised methods and achieves performance competitive with many supervised methods. By narrowing the modality gap and improving the reliability of pseudo-labels, CHCR is a promising solution for real-world VI-ReID applications.
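The grayscale-plus-transform baseline is concrete enough to sketch. Below is a minimal illustration in Python (NumPy and Pillow) of what such a conversion might look like; the function name and the parameter ranges are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np
from PIL import Image

def to_gray_augmented(img: Image.Image,
                      scale_range=(0.8, 1.2),   # assumed range, not from the paper
                      shift_range=(-20, 20),    # assumed range, not from the paper
                      gamma_range=(0.5, 2.0)):  # assumed range, not from the paper
    """Convert a visible image to grayscale, then apply a random linear
    (scale + shift) and gamma transformation, in the spirit of CHCR's
    modality-gap-reduction baseline."""
    gray = np.asarray(img.convert("L"), dtype=np.float32) / 255.0

    # Random linear transform: y = a * x + b (shift expressed in [0, 1] units).
    a = np.random.uniform(*scale_range)
    b = np.random.uniform(*shift_range) / 255.0
    gray = np.clip(a * gray + b, 0.0, 1.0)

    # Random gamma transform: y = x ** gamma.
    gamma = np.random.uniform(*gamma_range)
    gray = gray ** gamma

    return Image.fromarray((gray * 255.0).astype(np.uint8))
```

The intuition behind this design is that random linear and gamma perturbations of the grayscale image roughly mimic the intensity statistics of infrared images, so visible and infrared samples look more alike to the feature extractor during clustering.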
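The inter-channel refinement idea can also be sketched. Assuming each RGB channel has been clustered independently (e.g., with DBSCAN, where -1 marks unclustered samples), one hedged way to realize "checking the clustering results across the three channels" is to keep a sample's pseudo-label only when its cluster neighborhoods largely agree across channels; the overlap measure and threshold below are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def reliable_mask(labels_r, labels_g, labels_b, min_overlap=0.5):
    """Flag samples whose cluster neighborhoods agree across three
    channel-wise clusterings (a proxy for IPR's consistency check).

    labels_*: 1-D integer cluster assignments from clustering the
    features of each RGB channel independently (-1 = unclustered).
    Returns a boolean mask: True where the pseudo-label is kept.
    """
    channel_labels = [np.asarray(labels_r), np.asarray(labels_g), np.asarray(labels_b)]
    n = len(channel_labels[0])
    mask = np.zeros(n, dtype=bool)
    for i in range(n):
        if any(lab[i] == -1 for lab in channel_labels):
            continue  # discarded by at least one channel's clustering
        # Cluster-mates of sample i under each channel's clustering.
        mates = [set(np.flatnonzero(lab == lab[i])) for lab in channel_labels]
        # Jaccard overlap between every pair of channels.
        overlaps = []
        for a in range(3):
            for b in range(a + 1, 3):
                inter = len(mates[a] & mates[b])
                union = len(mates[a] | mates[b])
                overlaps.append(inter / union if union else 0.0)
        mask[i] = min(overlaps) >= min_overlap
    return mask
```

Here a pseudo-label survives only if all three pairwise channel agreements exceed the threshold, which captures the abstract's point that labels consistent across channels are more trustworthy; the paper's actual refinement rule may differ in detail.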