Evaluation of Pooling Operations in Deep Architectures for Object Recognition

2010 | Dominik Scherer, Andreas Müller, and Sven Behnke
This paper evaluates pooling operations in convolutional architectures for object recognition. The authors compare several aggregation functions to determine which is most effective for vision tasks and find that max pooling significantly outperforms subsampling, while overlapping pooling windows offer no significant improvement over non-overlapping ones. Applying these findings, they achieve state-of-the-art error rates on two NORB datasets.

The paper motivates its study with the mammalian visual cortex, which has inspired many object recognition models. The visual area V1 consists of simple cells, which extract local features, and complex cells, which combine features from a small spatial neighborhood. This spatial pooling is crucial for obtaining translation-invariant features. Supervised models built on these findings include the Neocognitron and convolutional neural networks (CNNs), and many state-of-the-art feature extractors, including HOG, SIFT, Gist, and HMAX, use similar aggregation steps. Beyond comparing aggregation functions, the paper investigates whether signal-processing concepts such as overlapping receptive fields can improve recognition performance. The results confirm that max pooling yields the better performance on both NORB datasets and underline the importance of choosing the right aggregation function for vision tasks.
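To make the comparison concrete, here is a minimal NumPy sketch (not code from the paper) of the two aggregation functions it evaluates: max pooling keeps the strongest activation in each window, while subsampling averages the window. The `pool2d` helper and the toy feature map are illustrative assumptions, using non-overlapping 2×2 windows as in the paper's baseline setting.

```python
import numpy as np

def pool2d(x, size, op):
    """Aggregate non-overlapping size x size windows of a 2D feature map."""
    h, w = x.shape
    # Trim so both dimensions divide evenly, then put each window on its own axes.
    x = x[:h - h % size, :w - w % size]
    windows = x.reshape(x.shape[0] // size, size, x.shape[1] // size, size)
    return op(windows, axis=(1, 3))

# Toy 4x4 feature map (hypothetical activations).
feature_map = np.array([
    [0.1, 0.9, 0.0, 0.2],
    [0.3, 0.2, 0.8, 0.1],
    [0.0, 0.4, 0.1, 0.0],
    [0.7, 0.1, 0.2, 0.6],
])

max_pooled = pool2d(feature_map, 2, np.max)   # max pooling: strongest response wins
avg_pooled = pool2d(feature_map, 2, np.mean)  # subsampling-style averaging
```

Both operations reduce the 4×4 map to 2×2, but max pooling propagates only the dominant feature response in each neighborhood, which is what makes the resulting features more robust to small translations.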