Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

16 Jul 2024 | Yuqian Fu, Yu Wang, Yixuan Pan, Lian Huai, Xingyu Qiu, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc Van Gool, Xingqun Jiang
This paper addresses cross-domain few-shot object detection (CD-FSOD): building an accurate object detector for a novel domain from only a few labeled examples. The proposed method, CD-ViTO, builds on the DE-ViT open-set detector and adds three modules to handle cross-domain shift. The paper also establishes a CD-FSOD benchmark by reorganizing existing object detection datasets and by introducing three measures of domain difference: style, inter-class variance (ICV), and indefinable boundaries (IB). The benchmark uses COCO as the source dataset and six other datasets as novel targets, which vary substantially in style, ICV, and IB. Evaluating a range of object detectors on this benchmark shows that even DE-ViT, the state-of-the-art FSOD detector, degrades in the CD-FSOD setting. To address this, the authors propose learnable instance features, instance reweighting, and a domain prompter. Learnable instance features align the initially fixed instances with the target categories, making features more distinctive. Instance reweighting assigns higher importance to high-quality instances with slight IB. The domain prompter encourages features that are resilient to style changes by synthesizing imaginary domains without altering semantic content. Together, these modules form CD-ViTO, which significantly improves on the base DE-ViT. The paper also studies the effectiveness of finetuning in CD-FSOD and shows that CD-ViTO substantially boosts DE-ViT's performance on all target datasets. CD-ViTO also outperforms other methods on most datasets, which supports the effectiveness of the proposed modules.
The paper concludes that CD-ViTO is a promising solution for CD-FSOD, addressing the challenges posed by domain gaps and improving the performance of open-set detectors in cross-domain scenarios.
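
To make the instance reweighting idea more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each support instance already has a feature vector and a scalar quality score; the summary does not say how CD-ViTO obtains such scores, so `quality_scores` is a hypothetical input here, and the softmax weighting and cosine matching are illustrative choices. The helper names `weighted_prototype` and `classify` are likewise invented for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_prototype(instance_features, quality_scores):
    """Build a class prototype as a weighted sum of support instance features.

    quality_scores is a hypothetical per-instance score (higher = cleaner
    instance); instances with higher scores contribute more to the prototype.
    """
    if not instance_features or len(instance_features) != len(quality_scores):
        raise ValueError("need one quality score per instance feature")
    weights = softmax(quality_scores)
    prototype = [0.0] * len(instance_features[0])
    for w, feature in zip(weights, instance_features):
        for i, x in enumerate(feature):
            prototype[i] += w * x
    return prototype


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(query_feature, prototypes):
    """Return the class name whose prototype is most similar to the query."""
    return max(
        prototypes,
        key=lambda name: cosine_similarity(query_feature, prototypes[name]),
    )


if __name__ == "__main__":
    # Two support instances for one class; the first is scored as cleaner.
    features = [[1.0, 0.0], [0.0, 1.0]]
    print(weighted_prototype(features, [0.0, 0.0]))   # equal weights: plain mean
    print(weighted_prototype(features, [10.0, 0.0]))  # first instance dominates
```

With equal scores the prototype reduces to the plain mean of the support features, which is the unweighted baseline; raising one instance's score pulls the prototype toward that instance.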