Arbitrary-Oriented Scene Text Detection via Rotation Proposals

15 Mar 2018 | Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, Xiangyang Xue
This paper introduces a rotation-based framework for detecting arbitrary-oriented text in natural scene images. The framework, called Rotation Region Proposal Networks (RRPN), generates inclined proposals that carry text orientation angle information. This angle information is then used in bounding box regression so that the proposals fit text regions more accurately. A Rotation Region-of-Interest (RRoI) pooling layer projects the arbitrary-oriented proposals onto a feature map, and a two-layer network classifies each region as text or background. Because the framework is built on a region-proposal-based architecture, it remains computationally efficient compared with previous text detection systems.

The paper addresses the challenge of detecting text in natural scene images, which is complicated by uneven lighting, blurring, perspective distortion, and varying orientation. Previous methods often rely on horizontal or nearly horizontal annotations, which are insufficient for real-world applications where text regions are not horizontal. By incorporating rotation information, the proposed framework generates proposals for arbitrary orientations and improves detection accuracy.

Experiments on three real-world scene text detection datasets (MSRA-TD500, ICDAR2013, ICDAR2015) demonstrate the framework's effectiveness and efficiency relative to existing approaches. An ablation study on the MSRA-TD500 dataset shows that incorporating rotation information improves detection performance.
Additional experiments on the ICDAR2015 and ICDAR2013 datasets demonstrate the framework's effectiveness in detecting text with various orientations. The framework is compared with state-of-the-art approaches, showing superior performance in terms of precision, recall, and F-measure. The results indicate that the rotation-based framework is more robust and efficient for arbitrary-oriented text detection.
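To make the idea of an inclined proposal concrete, the sketch below converts a rotated box into its four corner points. It assumes a five-value parameterization (center x, center y, height, width, angle in degrees), which this summary does not spell out, and the function names `rotated_box_corners` and `polygon_area` are hypothetical helpers for illustration, not part of any published RRPN code.

```python
import math


def rotated_box_corners(cx, cy, h, w, theta_deg):
    """Return the four corners of a rectangle centered at (cx, cy).

    Assumed convention (not taken from the summary): w is the side
    length along the direction given by theta_deg, h is the side
    length perpendicular to it, and the angle is measured from the
    positive x-axis toward the positive y-axis.
    """
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0

    # Corner offsets in the box's own (unrotated) frame.
    offsets = [
        (-half_w, -half_h),
        (half_w, -half_h),
        (half_w, half_h),
        (-half_w, half_h),
    ]

    # Rotate each offset by theta, then translate to the box center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    corners = rotated_box_corners(100.0, 50.0, 20.0, 80.0, 30.0)
    for x, y in corners:
        print(f"({x:.2f}, {y:.2f})")
    # Rotation preserves area, so this should stay close to h * w.
    print(f"area = {polygon_area(corners):.2f}")
```

An axis-aligned box is the special case where the angle is zero, which is one way to see why a horizontal-only representation cannot fit slanted text tightly: the extra angle parameter is what lets the rectangle follow the text line.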