Cross-Modality Hierarchical Clustering and Refinement for Unsupervised Visible-Infrared Person Re-Identification


April 2024 | Zhiqi Pang, Chunyu Wang, Lingling Zhao, Yang Liu, and Gaurav Sharma
This paper proposes a fully unsupervised visible-infrared person re-identification (VI-ReID) method called cross-modality hierarchical clustering and refinement (CHCR). Unlike existing methods that rely on intra-modality initialization and cross-modality instance selection, CHCR focuses on cross-modality clustering to learn modality-invariant features and improve pseudo-label reliability. The method first reduces the modality gap by converting visible images to grayscale and applying gamma transformation. It then uses a cross-modality clustering baseline with DBSCAN to generate pseudo-labels. To enhance clustering, CHCR introduces cross-modality hierarchical clustering (CHC), which clusters inter-modality positive samples into the same cluster, and inter-channel pseudo-label refinement (IPR), which improves pseudo-label reliability by checking clustering consistency across RGB channels. The method also incorporates a modality contrastive loss to align feature distributions between modalities. Extensive experiments on the SYSU-MM01 and RegDB datasets show that CHCR outperforms state-of-the-art unsupervised methods and achieves performance competitive with many supervised methods. The method is fully unsupervised, requiring no labeled source domain or intra-modality initialization, and demonstrates significant improvements in performance and efficiency compared to existing approaches.
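To make the two preprocessing and pseudo-labeling steps mentioned above concrete, the following is a minimal Python sketch, not the authors' implementation: it converts a visible image to grayscale with a gamma transformation, and assigns cluster-based pseudo-labels by running DBSCAN over pooled features from both modalities. The function names, gamma value, and DBSCAN parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): grayscale + gamma preprocessing for
# visible images, and DBSCAN-based pseudo-label generation over features pooled
# from both modalities. Names and hyperparameters are assumptions.

import numpy as np
from PIL import Image
from sklearn.cluster import DBSCAN


def visible_to_gray_gamma(image_path: str, gamma: float = 0.7) -> np.ndarray:
    """Convert a visible (RGB) image to grayscale and apply a gamma transformation."""
    gray = np.asarray(Image.open(image_path).convert("L"), dtype=np.float32) / 255.0
    # Gamma transformation on the normalized intensities: out = in ** gamma.
    return (gray ** gamma * 255.0).astype(np.uint8)


def cluster_pseudo_labels(features: np.ndarray, eps: float = 0.6, min_samples: int = 4) -> np.ndarray:
    """Assign cluster-based pseudo-labels with DBSCAN; -1 marks noise samples."""
    # `features` stacks L2-normalized embeddings from visible and infrared images
    # so that inter-modality positives can fall into the same cluster.
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    return DBSCAN(eps=eps, min_samples=min_samples, metric="euclidean").fit_predict(features)


if __name__ == "__main__":
    # Toy usage: 200 random 128-D "embeddings" standing in for pooled features.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 128)).astype(np.float32)
    labels = cluster_pseudo_labels(feats)
    print("clusters found:", len(set(labels) - {-1}))
```

In the actual method, the features would come from a backbone trained with the cross-modality clustering baseline, and the resulting pseudo-labels would then be merged by CHC and filtered by IPR.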