8 Jan 2024 | Chuyang Zhao, Yifan Sun, Wenhao Wang, Qiang Chen, Errui Ding, Yi Yang, Jingdong Wang
MS-DETR is an efficient training method for DETR that mixes one-to-one and one-to-many supervision. It applies the one-to-many supervision directly to the object queries of the primary decoder, which improves candidate generation without adding extra decoder branches or extra queries. Because it needs no additional decoder structures, MS-DETR is more computation- and memory-efficient than alternatives that do. It outperforms existing DETR variants such as DN-DETR, Hybrid DETR, and Group DETR, and it yields further gains when combined with them. Experiments show consistent mAP improvements over other DETR variants across different training schedules. The approach is also complementary to IoU-aware loss methods, and when applied to instance segmentation it improves mask mAP.
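The mixed supervision described above can be illustrated with a minimal sketch of how two kinds of target assignment might be computed over the same set of object queries. Everything here is an assumption made for illustration and is not taken from the summary: the function names, the matching score (a weighted sum of class probability and IoU), the `alpha`, `k`, and `tau` values, and the brute-force search that stands in for the Hungarian matching normally used in DETR.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_scores(pred_boxes, pred_probs, gt_boxes, gt_labels, alpha=0.3):
    """scores[q][g]: assumed score mixing class probability and box IoU."""
    return [
        [
            alpha * pred_probs[q][gt_labels[g]]
            + (1 - alpha) * iou(pred_boxes[q], gt_boxes[g])
            for g in range(len(gt_boxes))
        ]
        for q in range(len(pred_boxes))
    ]


def one_to_one(scores, num_gt):
    """Each ground truth gets exactly one distinct query.

    Brute force over permutations; fine for a toy example, and it assumes
    at least as many queries as ground-truth objects.
    """
    best = max(
        permutations(range(len(scores)), num_gt),
        key=lambda qs: sum(scores[q][g] for g, q in enumerate(qs)),
    )
    return {q: g for g, q in enumerate(best)}


def one_to_many(scores, num_gt, k=2, tau=0.4):
    """Each ground truth may claim up to k queries whose score is >= tau.

    A query claimed by several ground truths keeps the best-scoring one.
    """
    assigned = {}
    for g in range(num_gt):
        ranked = sorted(range(len(scores)), key=lambda q: scores[q][g], reverse=True)
        for q in ranked[:k]:
            better = q not in assigned or scores[q][g] > scores[q][assigned[q]]
            if scores[q][g] >= tau and better:
                assigned[q] = g
    return assigned


# Toy data: four queries, two ground-truth objects, two classes.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
gt_labels = [0, 1]
pred_boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
pred_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]

scores = match_scores(pred_boxes, pred_probs, gt_boxes, gt_labels)
o2o = one_to_one(scores, len(gt_boxes))   # query index -> ground-truth index
o2m = one_to_many(scores, len(gt_boxes))  # same queries, extra positives allowed
print(o2o, o2m)
```

In this toy case the one-to-one assignment keeps a single query per object, while the one-to-many assignment also marks the second, well-overlapping query as a positive for the first object. Both assignments are computed over the same queries, so a training loss built on them would need no extra decoder branch or query set, which is the efficiency argument the summary makes.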