Libra R-CNN: Towards Balanced Learning for Object Detection

4 Apr 2019 | Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin
This paper presents Libra R-CNN, an object detection framework that addresses imbalance in the training process, which limits detection performance. Training is typically imbalanced at three levels: the sample level, the feature level, and the objective level. To mitigate these issues, Libra R-CNN integrates three novel components: IoU-balanced sampling, a balanced feature pyramid, and a balanced L1 loss. IoU-balanced sampling makes hard samples more likely to be selected, based on their overlap (IoU) with the ground truth, which balances the sample level. The balanced feature pyramid strengthens the multi-level features using balanced semantic features, which balances the feature level. The balanced L1 loss promotes crucial gradients, which rebalances the objective level.

On the MS COCO dataset, Libra R-CNN achieves 2.5 and 2.0 points higher Average Precision (AP) than FPN Faster R-CNN and RetinaNet, respectively. With the 1× schedule, it achieves 38.7 AP with ResNet-50 and 43.0 AP with ResNeXt-101-64x4d. Extensive experiments show that it consistently improves on state-of-the-art detectors and generalizes well to various backbones for both two-stage and single-stage detectors.
The results show that the overall balanced design of Libra R-CNN significantly improves the detection performance.
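To make two of the three components more concrete, the following is a minimal sketch of IoU-balanced sampling and the balanced L1 loss in plain Python. It is not the authors' implementation. The function names, the bin count `num_bins=3`, the negative IoU range `[0, 0.5)`, the random top-up step, and the values `alpha=0.5` and `gamma=1.5` are assumptions chosen for illustration; the loss follows the piecewise form given in the original paper, which this summary does not spell out.

```python
import math
import random


def iou_balanced_sample(ious, num_samples, num_bins=3, max_iou=0.5, rng=None):
    """Pick negative-candidate indices spread evenly over IoU bins.

    ious[i] is the largest IoU between candidate i and any ground-truth box,
    assumed to lie in [0, max_iou). All names and defaults are illustrative.
    """
    rng = rng or random.Random()
    width = max_iou / num_bins
    bins = [[] for _ in range(num_bins)]
    for idx, iou in enumerate(ious):
        k = min(int(iou / width), num_bins - 1)  # clamp the top edge
        bins[k].append(idx)

    # Draw an equal share from every IoU bin (fewer if a bin is short).
    per_bin = num_samples // num_bins
    chosen = []
    for members in bins:
        chosen.extend(rng.sample(members, min(per_bin, len(members))))

    # Assumed fallback: fill any shortfall at random from the leftovers.
    if len(chosen) < num_samples:
        taken = set(chosen)
        rest = [i for i in range(len(ious)) if i not in taken]
        chosen.extend(rng.sample(rest, min(num_samples - len(chosen), len(rest))))
    return chosen


def balanced_l1_loss(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for a single regression error x.

    Assumed piecewise form:
      |x| <  1: alpha/b * (b|x| + 1) * ln(b|x| + 1) - alpha * |x|
      |x| >= 1: gamma * |x| + C
    where b satisfies alpha * ln(b + 1) = gamma.
    """
    d = abs(x)
    b = math.exp(gamma / alpha) - 1.0
    if d < 1.0:
        return alpha / b * (b * d + 1.0) * math.log(b * d + 1.0) - alpha * d
    # C = gamma / b - alpha makes the two branches meet at |x| = 1.
    return gamma * d + gamma / b - alpha


if __name__ == "__main__":
    # 90 easy negatives (IoU 0.01) and 10 harder ones (IoU 0.2 and 0.4).
    ious = [0.01] * 90 + [0.2] * 5 + [0.4] * 5
    picked = iou_balanced_sample(ious, num_samples=9, rng=random.Random(0))
    print("hard negatives picked:", sum(1 for i in picked if i >= 90), "of", len(picked))
    print("loss at 0.5:", balanced_l1_loss(0.5), "loss at 2.0:", balanced_l1_loss(2.0))
```

In this toy setup the harder candidates make up 10% of the pool but fill two of the three IoU bins, so binning gives them far more weight than uniform random sampling would. For the loss, the gradient on small errors grows as `alpha * ln(b|x| + 1)` and is capped at `gamma` for errors of 1 or more, which is one way to read "promotes crucial gradients" without letting outliers dominate.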