6 Jan 2016 | Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun
The paper introduces the Region Proposal Network (RPN), a fully convolutional network that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals. The RPN simultaneously predicts object bounds and objectness scores at each position, and is trained end-to-end to generate high-quality region proposals, which Fast R-CNN then uses for detection. By merging the RPN and Fast R-CNN into a single network, the authors obtain a unified system with shared convolutional features. With the very deep VGG-16 model, this system runs at 5 fps on a GPU (including all steps) while achieving state-of-the-art accuracy on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets. The method is also used in commercial systems and served as the foundation of first-place entries in several tracks of the ILSVRC and COCO 2015 competitions. The paper discusses the design of anchors for multi-scale predictions and the training of the RPN, and provides experimental results to validate the effectiveness of the proposed method.
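
To make the anchor idea concrete, the following is a minimal sketch of how a set of reference boxes could be laid out over a feature map. It is an illustration, not the authors' implementation. The function name `make_anchors`, the cell-centre convention, and the default values are assumptions: this summary does not state them, although three scales (128, 256, 512 pixels), three aspect ratios (1:2, 1:1, 2:1), and a feature stride of 16 match the configuration commonly associated with the paper.

```python
import math


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return a list of (x1, y1, x2, y2) anchor boxes in image coordinates.

    Hypothetical helper: one anchor per (feature-map cell, scale, ratio).
    `ratios` is height / width; each anchor has area scale ** 2.
    """
    # Anchor shapes (width, height) shared by every feature-map position.
    shapes = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            shapes.append((width, height))

    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Assumed convention: anchors are centred on the cell centre.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for width, height in shapes:
                anchors.append((cx - width / 2, cy - height / 2,
                                cx + width / 2, cy + height / 2))
    return anchors


if __name__ == "__main__":
    boxes = make_anchors(feat_h=2, feat_w=3)
    print(len(boxes))  # 2 * 3 positions, 9 anchors each
```

Because the same set of shapes is reused at every position, multi-scale coverage comes from the anchors themselves rather than from rescaling the image or the filters. In this sketch the RPN would emit one objectness score and one set of box offsets per anchor; anchors that extend past the image boundary are left as-is here, and how to treat them is a separate training decision.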