14 Jun 2018 | Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei
This paper proposes an object relation module for object detection. The module processes a set of objects simultaneously, using both their appearance features and their geometry to model the relations between them. It is lightweight and in-place (its output has the same dimension as its input), requires no additional supervision, and is therefore easy to integrate into existing networks. The module is based on an attention mechanism, adapted for object detection by adding a novel geometric weight that captures the spatial relationships between objects and is translation invariant. Applied to several state-of-the-art object detection architectures, it gives consistent improvement in two stages of the pipeline: it strengthens the instance recognition step, and it replaces hand-crafted duplicate removal with a learned step, which the paper compares against NMS and SoftNMS and reports as performing better. Because both steps are trained end-to-end, the result is described as the first fully end-to-end object detector. Experiments are conducted on the COCO dataset, where the paper reports state-of-the-art results. The module is general and could also be applied to other vision tasks such as instance segmentation, action recognition, and captioning. A Python implementation is available at https://github.com/msracver/Relation-Networks-for-Object-Detection.
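The attention-with-geometry idea summarized above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the single attention head, the random weights, and the direct linear projection of the 4-d relative geometry feature (no higher-dimensional embedding) are all simplifying assumptions made here for illustration. Boxes are assumed to be given as (x_center, y_center, width, height).

```python
import numpy as np

def relative_geometry(boxes, eps=1e-3):
    """Log-space relative box features, shape (N, N, 4).

    Entry [n, m] describes box m relative to box n. Only coordinate
    differences and size ratios are used, so shifting every box by the
    same offset leaves the result unchanged.
    """
    x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    dx = np.log(np.maximum(np.abs(x[None, :] - x[:, None]) / w[:, None], eps))
    dy = np.log(np.maximum(np.abs(y[None, :] - y[:, None]) / h[:, None], eps))
    dw = np.log(w[None, :] / w[:, None])
    dh = np.log(h[None, :] / h[:, None])
    return np.stack([dx, dy, dw, dh], axis=-1)

def relation_module(feats, boxes, W_q, W_k, W_v, w_g):
    """Single-head sketch: attention weights gated by a geometric weight.

    feats: (N, d) appearance features; returns an (N, d) array, so the
    module can be dropped in without changing feature dimensions.
    """
    d_k = W_q.shape[1]
    q, k, v = feats @ W_q, feats @ W_k, feats @ W_v
    app = (q @ k.T) / np.sqrt(d_k)                          # appearance logits
    geo = np.maximum(relative_geometry(boxes) @ w_g, 0.0)   # ReLU-gated geometry
    app = app - app.max(axis=1, keepdims=True)              # numerical stability
    unnorm = geo * np.exp(app)
    weights = unnorm / (unnorm.sum(axis=1, keepdims=True) + 1e-12)
    return feats + weights @ v                              # residual add

# Hypothetical toy inputs: 3 objects with 8-d features.
rng = np.random.default_rng(0)
N, d = 3, 8
feats = rng.normal(size=(N, d))
boxes = np.array([[10., 10., 4., 6.], [14., 12., 5., 5.], [30., 8., 8., 4.]])
W_q, W_k, W_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
w_g = rng.normal(size=4)

out = relation_module(feats, boxes, W_q, W_k, W_v, w_g)
shifted = boxes + np.array([100., -50., 0., 0.])            # translate all boxes
out_shifted = relation_module(feats, shifted, W_q, W_k, W_v, w_g)
```

In this sketch the output has the same shape as the input features, and translating all boxes by a common offset does not change the result, which mirrors the in-place and translation-invariant properties mentioned in the summary.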
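For context on the duplicate removal comparison, the following is a minimal sketch of conventional greedy NMS, the hand-crafted baseline that a learned duplicate removal step is meant to replace. It is the standard textbook procedure, not code from the paper; the function names, the (x1, y1, x2, y2) box format, and the 0.5 threshold are assumptions chosen for illustration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(-scores)          # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
```

The suppression decision here depends only on a fixed IoU threshold and the score ordering. SoftNMS softens this by decaying the scores of overlapping boxes instead of discarding them, whereas the approach summarized above learns the duplicate removal step from data.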