2016 | Zhi Tian, Weilin Huang*, Tong He, Pan He, and Yu Qiao
This paper proposes the Connectionist Text Proposal Network (CTPN) for accurate text line localization in natural images. The CTPN detects text lines directly in convolutional feature maps by generating a sequence of fine-scale text proposals. A vertical anchor mechanism jointly predicts the location and text/non-text score of each proposal, which significantly improves localization accuracy. The sequential proposals are then connected by a recurrent neural network (RNN) that is seamlessly integrated into the convolutional network, so the whole model is trainable end-to-end. This lets the CTPN exploit rich contextual information along a text line, making it effective on ambiguous text.

For each proposal the model outputs three predictions: a text/non-text score, vertical coordinates, and side-refinement offsets; the side-refinement step further improves localization at the horizontal ends of a text line. Training is end-to-end with multi-task learning, combining three loss terms for classification, vertical coordinate regression, and side-refinement.

The CTPN works reliably on multi-scale and multi-language text without further post-processing, unlike previous methods that require multiple post-filtering steps. It achieves F-measures of 0.88 and 0.61 on the ICDAR 2013 and ICDAR 2015 benchmarks respectively, outperforming recent results, and it is computationally efficient, running at 0.14 seconds per image with the VGG16 backbone.
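The vertical anchor mechanism fixes each proposal's width (16 pixels in the paper) and regresses only two vertical values per anchor: a center offset v_c and a height scale v_h, relative to a reference anchor. Below is a minimal sketch of decoding those predictions back to absolute coordinates, assuming the paper's parameterization v_c = (c_y − c_y^a)/h^a and v_h = log(h/h^a); the exact anchor-height list and the function name are illustrative assumptions, not the released implementation.

```python
import numpy as np

# Illustrative anchor heights: the paper uses k = 10 anchors of fixed
# 16-px width with heights spanning roughly 11 to 273 px; this exact
# list is an assumption for the sketch.
ANCHOR_HEIGHTS = [11, 16, 23, 33, 48, 68, 97, 139, 198, 273]

def decode_vertical(v_c, v_h, anchor_cy, anchor_h):
    """Invert the relative vertical parameterization:
    v_c = (cy - anchor_cy) / anchor_h  and  v_h = log(h / anchor_h)."""
    cy = v_c * anchor_h + anchor_cy   # absolute center-y of the proposal
    h = np.exp(v_h) * anchor_h        # absolute proposal height
    return cy, h

# Example: a prediction relative to an anchor centered at y=100 with height 23.
cy, h = decode_vertical(v_c=0.1, v_h=0.2, anchor_cy=100.0, anchor_h=23.0)
```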
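To connect the sequential proposals, the paper sweeps a recurrent layer along the width of the last convolutional feature map, so each fine-scale proposal sees context from its neighbors. A minimal PyTorch sketch of that idea follows, assuming a bidirectional LSTM applied to each feature-map row; the module name and channel sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class RecurrentConnector(nn.Module):
    """Run a bidirectional LSTM over the width dimension of a conv
    feature map, treating each row as a left-to-right sequence."""
    def __init__(self, in_channels=512, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(in_channels, hidden,
                           bidirectional=True, batch_first=True)

    def forward(self, feat):                      # feat: (N, C, H, W)
        n, c, h, w = feat.shape
        seq = feat.permute(0, 2, 3, 1).reshape(n * h, w, c)   # one sequence per row
        out, _ = self.rnn(seq)                                # (N*H, W, 2*hidden)
        return out.reshape(n, h, w, -1).permute(0, 3, 1, 2)   # (N, 2*hidden, H, W)
```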
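The multi-task objective combines the three predictions above. Here is a hedged sketch, assuming softmax classification and smooth-L1 regression terms in the style of Faster R-CNN detectors; the function name, masks, and the loss weights lambda1/lambda2 are illustrative placeholders rather than values confirmed from the paper's code.

```python
import torch.nn.functional as F

def ctpn_loss(cls_logits, cls_targets,
              v_pred, v_targets, v_mask,
              o_pred, o_targets, o_mask,
              lambda1=1.0, lambda2=1.0):
    # Text/non-text classification over all sampled anchors.
    l_cls = F.cross_entropy(cls_logits, cls_targets)
    # Vertical coordinate regression, computed only on anchors
    # matched to ground-truth text (v_mask is a boolean mask).
    l_v = F.smooth_l1_loss(v_pred[v_mask], v_targets[v_mask])
    # Side-refinement offsets, computed only on anchors near the
    # left/right ends of a text line (o_mask is a boolean mask).
    l_o = F.smooth_l1_loss(o_pred[o_mask], o_targets[o_mask])
    return l_cls + lambda1 * l_v + lambda2 * l_o
```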