1998 | Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
The paper "Gradient-Based Learning Applied to Document Recognition" by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner reviews and compares methods for handwritten character recognition on a standard handwritten digit recognition task. The authors highlight the effectiveness of multilayer neural networks (NNs) trained with back-propagation, and in particular convolutional neural networks (CNNs), which are specifically designed to handle the variability of two-dimensional shapes. On this task, CNNs are shown to outperform the other techniques compared.
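One reason convolutional layers cope well with 2D shape variability is that a single small kernel is shared across every image position, so a shifted stroke produces a correspondingly shifted response instead of an unrelated one. The sketch below illustrates that property in plain Python. It is not the paper's architecture: the 6x6 image, the 3x3 vertical-edge kernel, and the helper names `conv2d_valid` and `shift_right` are all assumptions made for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image, keeping only fully covered positions."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def shift_right(image, k):
    """Shift every row k pixels to the right, padding the left edge with zeros."""
    return [[0] * k + row[:len(row) - k] for row in image]


# Hypothetical 6x6 input: a vertical stroke in column 2.
image = [[1 if c == 2 else 0 for c in range(6)] for _ in range(6)]

# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = [[-1, 0, 1] for _ in range(3)]

fmap = conv2d_valid(image, kernel)
fmap_shifted = conv2d_valid(shift_right(image, 1), kernel)

# Shifting the input by one pixel shifts the feature map by one pixel.
shift_is_preserved = all(
    fmap_shifted[r][c + 1] == fmap[r][c]
    for r in range(len(fmap))
    for c in range(len(fmap[0]) - 1)
)
print(fmap[0], fmap_shifted[0], shift_is_preserved)
```

Because the same nine weights are reused at every position, the detector does not need to be relearned for each location of the stroke. In a full CNN, subsampling layers then reduce sensitivity to the exact position of each response.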
The paper also discusses the challenges of real-life document recognition systems, which involve multiple modules such as field extraction, segmentation, recognition, and language modeling. It introduces a new learning paradigm called graph transformer networks (GTNs), which allow these multimodule systems to be trained globally using gradient-based methods to minimize an overall performance measure.
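The idea of training several modules globally can be sketched with a toy cascade: the loss is measured only at the output of the last module, and the chain rule carries its gradient back through each module to every parameter. The example below is a minimal scalar sketch under stated assumptions, not a GTN (real GTN modules exchange graphs, not single numbers). The two modules, the squared-error loss, the learning rate, and all variable names are hypothetical.

```python
import math


def forward(x, w1, w2):
    """Run the two-module cascade: module 1 is tanh(w1 * x), module 2 is w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(y, target):
    """Global performance measure, evaluated only at the final output."""
    return 0.5 * (y - target) ** 2


def backward(x, target, w1, w2):
    """Back-propagate the global loss gradient through both modules."""
    h, y = forward(x, w1, w2)
    dE_dy = y - target                    # gradient at the system output
    dE_dw2 = dE_dy * h                    # module 2 parameter gradient
    dE_dh = dE_dy * w2                    # gradient passed back to module 1
    dE_dw1 = dE_dh * (1.0 - h * h) * x    # module 1 parameter gradient
    return dE_dw1, dE_dw2


# Hypothetical training example and initial parameters.
x, target = 0.5, 0.8
w1, w2 = 0.3, 0.2
initial_loss = loss(forward(x, w1, w2)[1], target)

# Sanity check: compare back-propagated gradients with central differences.
eps = 1e-6
grad_w1, grad_w2 = backward(x, target, w1, w2)
numeric_w1 = (loss(forward(x, w1 + eps, w2)[1], target)
              - loss(forward(x, w1 - eps, w2)[1], target)) / (2 * eps)
numeric_w2 = (loss(forward(x, w1, w2 + eps)[1], target)
              - loss(forward(x, w1, w2 - eps)[1], target)) / (2 * eps)

# Plain gradient descent on both modules at once.
learning_rate = 0.5
for _ in range(200):
    g1, g2 = backward(x, target, w1, w2)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2

final_loss = loss(forward(x, w1, w2)[1], target)
print(initial_loss, final_loss)
```

Neither module has its own hand-specified target here. Module 1 is adjusted only through the gradient that module 2 passes back, which is the sense in which the whole system is optimized against one overall criterion.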
Two systems for online handwriting recognition are described, demonstrating the advantages of global training and the flexibility of GTNs. Additionally, a GTN-based system for reading bank checks is presented, which uses convolutional NN character recognizers combined with global training techniques to achieve high accuracy. This system is commercially deployed and reads millions of checks daily.
The authors emphasize that better pattern recognition systems can be built by relying more on automatic learning and less on hand-designed heuristics, leveraging recent progress in machine learning and computer technology. They provide detailed explanations of gradient-based learning, including the back-propagation algorithm and its application to complex machine learning tasks. The paper also explores the use of GTNs in training multiple modules to optimize a global performance criterion, making it a comprehensive review of gradient-based learning techniques in document recognition.