Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation


22 Mar 2024 | Xu Zheng, Pengyuan Zhou, Athanasios V. Vasilakos, Lin Wang
This paper presents a novel source-free unsupervised domain adaptation (SFUDA) method for panoramic semantic segmentation, addressing the challenges of semantic mismatches, style discrepancies, and distortion between pinhole and panoramic images. The proposed method, named 360SFUDA, leverages Tangent Projection (TP) and Fixed FoV Projection (FFP) to extract knowledge from a pinhole-trained source model and adapt it to panoramic images. TP reduces distortion and mimics pinhole imagery, while FFP divides equirectangular projection (ERP) images into patches whose fixed FoV is similar to that of pinhole images. A Panoramic Prototype Adaptation Module (PPAM) integrates panoramic prototypes derived from the source model for adaptation, and a Cross-Dual Attention Module (CDAM) aligns spatial and channel characteristics across domains and projections, further enhancing performance. Extensive experiments on synthetic and real-world benchmarks demonstrate that the proposed method significantly outperforms existing SFUDA methods in both outdoor and indoor scenarios. The method effectively handles the unique challenges of panoramic segmentation, including distortion and semantic mismatch, and shows strong adaptability across different projection formats. The results indicate that the proposed approach is better suited to panoramic semantic segmentation than existing methods.
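The FFP idea described above, slicing the 360° ERP image into patches whose horizontal field of view matches that of a pinhole camera, can be sketched in plain Python. The 90° FoV and stride below are illustrative assumptions, not the paper's actual settings, and only the horizontal tiling is shown.

```python
# Hedged sketch of Fixed FoV Projection (FFP) patch layout: an ERP image
# spans 360 degrees horizontally, so a patch covering `fov_deg` degrees
# occupies fov_deg / 360 of the image width. The FoV and stride values
# here are assumptions for illustration, not the paper's configuration.

def fixed_fov_patches(erp_width, fov_deg=90, stride_deg=90):
    """Return (x_start, x_end) pixel-column spans for each fixed-FoV patch."""
    patch_w = int(erp_width * fov_deg / 360)
    stride_px = int(erp_width * stride_deg / 360)
    patches = []
    x = 0
    while x < erp_width:
        # Wrap patch ends around the 360-degree seam so every patch
        # covers a full fov_deg of horizontal FoV.
        end = (x + patch_w) % erp_width or erp_width
        patches.append((x, end))
        x += stride_px
    return patches

print(fixed_fov_patches(2048))  # four 512-pixel-wide patches tiling 360 degrees
```

With a 2048-pixel-wide ERP image and a 90° FoV, each patch is 512 pixels wide, roughly the aspect a pinhole-trained source model expects; a smaller stride would yield overlapping patches instead.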