20 Sep 2013 | Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio
The paper introduces maxout, a new activation function designed both to facilitate optimization by dropout and to improve the accuracy of dropout's fast approximate model averaging. A maxout unit outputs the maximum of a set of learned affine functions of its input, so it computes a piecewise linear approximation to an arbitrary convex function, making maxout networks highly flexible and well suited to model averaging. The authors demonstrate that maxout networks achieve state-of-the-art performance on several benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, and SVHN. They also provide theoretical and empirical evidence that maxout is well suited to dropout: because the network is locally linear over the inputs that different dropout masks produce, dropout's approximate model averaging comes close to an exact average. Additionally, the paper compares maxout to rectified linear units (ReLU) and shows that maxout networks are easier to optimize under dropout and generalize better. The authors conclude by highlighting the potential of maxout networks for further advances in deep learning.
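To make the maxout unit concrete, here is a minimal NumPy sketch (not the authors' implementation; the function name, shapes, and toy data are illustrative). Each of the m output units takes the maximum over k affine pre-activations, which is why a single unit can fit any convex function arbitrarily well as k grows:

```python
import numpy as np

def maxout(x, W, b):
    """Maxout activation: max over k affine feature maps.

    x: input vector of shape (d,)
    W: weight tensor of shape (d, m, k) -- m output units, k linear pieces each
    b: bias matrix of shape (m, k)
    Returns a vector of shape (m,): h_i = max_j (x @ W[:, i, j] + b[i, j]).
    """
    z = np.einsum("d,dmk->mk", x, W) + b  # affine pre-activations, shape (m, k)
    return z.max(axis=1)                  # elementwise max over the k pieces

# Toy usage: 5 inputs, 3 maxout units, k = 4 linear pieces per unit.
rng = np.random.default_rng(0)
x = rng.normal(size=5)
W = rng.normal(size=(5, 3, 4))
b = rng.normal(size=(3, 4))
print(maxout(x, W, b))  # -> array of 3 activations
```

Note that, unlike ReLU, the max is taken across learned affine maps rather than against a fixed zero threshold, so the nonlinearity itself is learned and the unit is linear almost everywhere, which is the local-linearity property the dropout argument relies on.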