27 Apr 2018 | Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz
The paper "mixup: Beyond Empirical Risk Minimization" by Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz introduces a novel data augmentation technique called *mixup*. *mixup* trains neural networks on convex combinations of pairs of examples and their labels, promoting linear behavior between training examples. This approach helps to reduce memorization of corrupt labels, increase robustness to adversarial examples, and stabilize the training of generative adversarial networks (GANs). The authors demonstrate that *mixup* improves the generalization of state-of-the-art neural network architectures on various datasets, including ImageNet-2012, CIFAR-10, CIFAR-100, Google commands, and UCI datasets. They also conduct ablation studies to understand the effectiveness of different design choices in *mixup*. The paper concludes by discussing the connections to prior work and suggesting future directions for exploration.The paper "mixup: Beyond Empirical Risk Minimization" by Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz introduces a novel data augmentation technique called *mixup*. *mixup* trains neural networks on convex combinations of pairs of examples and their labels, promoting linear behavior between training examples. This approach helps to reduce memorization of corrupt labels, increase robustness to adversarial examples, and stabilize the training of generative adversarial networks (GANs). The authors demonstrate that *mixup* improves the generalization of state-of-the-art neural network architectures on various datasets, including ImageNet-2012, CIFAR-10, CIFAR-100, Google commands, and UCI datasets. They also conduct ablation studies to understand the effectiveness of different design choices in *mixup*. The paper concludes by discussing the connections to prior work and suggesting future directions for exploration.