February 2012 | Dan Cireşan, Ueli Meier, Jürgen Schmidhuber
This paper presents a multi-column deep neural network (MCDNN) for image classification that reaches near-human performance on several benchmarks. An MCDNN combines multiple deep neural networks (DNN columns), each trained on a differently preprocessed version of the input, and averages their predictions to reduce error. Each column is a deep stack of convolutional layers with small receptive fields and winner-take-all (max-pooling) neurons, loosely inspired by the human visual system, and is trained entirely on GPUs for fast, efficient computation. The method is fully supervised and requires no additional unlabeled data.

The MCDNN improves the state of the art on a range of benchmarks, including MNIST, NIST SD 19, handwritten Chinese characters, traffic signs, CIFAR10, and NORB. On MNIST it achieves a 0.23% error rate, improving the previous best result by at least 34%. On the traffic sign benchmark it achieves a 0.54% error rate, roughly half the human error rate. On handwritten Chinese character recognition it reaches a 6.5% error rate, better than previous methods. Training times are reduced substantially by the GPU implementation, and the same architecture proves robust and effective across this wide range of image classification tasks.
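The combination step is simple enough to sketch. Below is a minimal, hedged illustration (not the authors' GPU implementation) of how an MCDNN merges its columns at test time: each column sees its own preprocessed copy of the image, and the resulting class posteriors are averaged before taking the arg-max. The `columns`, `preprocessors`, and `net.forward` names are hypothetical stand-ins for the paper's trained convolutional columns and their normalization pipelines.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mcdnn_predict(columns, preprocessors, image):
    """Average the class posteriors of several DNN columns.

    `columns` is a list of trained networks (each assumed to expose a
    `forward(image) -> logits` method) and `preprocessors` is the matching
    list of preprocessing functions (e.g. different normalizations).
    Both are illustrative placeholders, not the paper's actual API.
    """
    posteriors = [softmax(net.forward(prep(image)))
                  for net, prep in zip(columns, preprocessors)]
    avg = np.mean(posteriors, axis=0)   # democratic averaging across columns
    return int(np.argmax(avg))          # predicted class label
```

Averaging rather than learning a weighting reflects the paper's design: the columns are trained independently on differently preprocessed data, so the ensemble requires no joint training and scales simply with the number of columns.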