DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

11 Jul 2022 | Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum
DINO (DETR with Improved DeNoising Anchor Boxes) is a state-of-the-art end-to-end object detector that improves on previous DETR-like models in both performance and efficiency. It introduces several novel techniques, including contrastive denoising training, mixed query selection, and a look-forward-twice scheme for box prediction. With a ResNet-50 backbone and multi-scale features, DINO achieves 49.4 AP in 12 epochs and 51.3 AP in 24 epochs on COCO, improvements of +6.0 AP and +2.7 AP, respectively, over DN-DETR, the previous best DETR-like model. DINO also scales well: after pre-training on the Objects365 dataset with a SwinL backbone, it achieves the best results on both the COCO val2017 (63.2 AP) and test-dev (63.3 AP) benchmarks. The paper provides extensive ablation studies and experimental results to validate the effectiveness of these techniques.
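The reported gains can be turned back into the DN-DETR reference values they imply. The short Python sketch below does only that arithmetic, using the numbers quoted in the summary. The variable names are illustrative, and the summary does not state which DN-DETR training schedules the gains were measured against, so the keys describe the DINO settings only.

```python
# Reported DINO results (ResNet-50, multi-scale features) and the stated
# gains over DN-DETR, copied from the summary above.
dino_ap = {"12 epochs": 49.4, "24 epochs": 51.3}
gain_over_dn_detr = {"12 epochs": 6.0, "24 epochs": 2.7}

# Implied DN-DETR reference AP = DINO AP - reported gain.
# Rounded to one decimal place to avoid floating-point noise.
implied_dn_detr_ap = {
    setting: round(dino_ap[setting] - gain_over_dn_detr[setting], 1)
    for setting in dino_ap
}

for setting, ap in implied_dn_detr_ap.items():
    print(f"DINO at {setting}: {dino_ap[setting]} AP, implied DN-DETR reference: {ap} AP")
```

Under these figures, the implied DN-DETR reference values are 43.4 AP and 48.6 AP.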