May 9, 2008 | Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, Jürgen Schmidhuber
This paper presents a novel approach to unconstrained handwriting recognition using a recurrent neural network (RNN) with a connectionist temporal classification (CTC) output layer. The RNN, designed for sequence labeling tasks with long-range bidirectional interdependencies, achieves word recognition accuracies of 79.7% on online data and 74.1% on offline data, significantly outperforming state-of-the-art hidden Markov model (HMM)-based systems. The network's robustness to lexicon size and the individual influence of its hidden layers are also demonstrated. The paper discusses the differences between the RNN and HMMs, suggesting reasons for the RNN's superior performance: unlike HMMs, the network can exploit long-range context on both sides of each timestep, and its LSTM architecture avoids the vanishing gradient problem that limits standard RNNs. The experiments are conducted on two large handwriting databases, the IAM-OnDB (online) and the IAM-DB (offline), and the results show that the RNN system consistently outperforms the HMM system across varying dictionary sizes and hidden layer configurations.
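To make the architecture concrete, below is a minimal sketch of a bidirectional LSTM feeding a CTC output layer, written against PyTorch. The layer sizes, feature dimension, and label count are illustrative placeholders, not the configuration reported in the paper.

```python
# A minimal sketch of a BLSTM + CTC recognizer, assuming PyTorch.
# All dimensions below (features, hidden units, label count) are
# illustrative placeholders, not the paper's actual configuration.
import torch
import torch.nn as nn

class BLSTMCTC(nn.Module):
    def __init__(self, n_features: int, n_hidden: int, n_labels: int):
        super().__init__()
        # Bidirectional LSTM gives each timestep access to long-range
        # context from both past and future parts of the sequence.
        self.blstm = nn.LSTM(n_features, n_hidden,
                             bidirectional=True, batch_first=True)
        # Project concatenated forward/backward states to per-timestep
        # scores over the labels plus one extra CTC "blank" class (index 0).
        self.proj = nn.Linear(2 * n_hidden, n_labels + 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.blstm(x)   # (batch, time, 2 * n_hidden)
        return self.proj(h)    # (batch, time, n_labels + 1)

# Toy training step: CTC aligns an unsegmented label sequence to the
# per-timestep outputs, so the input never needs to be presegmented.
model = BLSTMCTC(n_features=3, n_hidden=100, n_labels=80)
x = torch.randn(4, 150, 3)                        # batch of input sequences
targets = torch.randint(1, 81, (4, 20))           # labels 1..80 (0 = blank)
input_lens = torch.full((4,), 150, dtype=torch.long)
target_lens = torch.full((4,), 20, dtype=torch.long)

log_probs = model(x).log_softmax(-1).transpose(0, 1)  # (time, batch, classes)
loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lens, target_lens)
loss.backward()
```

The extra "blank" class is what lets CTC emit label sequences shorter than the input: the network can output blank at timesteps between characters, and the loss marginalizes over all alignments, which is why no presegmentation of the handwriting is required.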