12 Jan 2020 | Eric Wong*, Leslie Rice*, J. Zico Kolter
This paper revisits adversarial training, which is typically assumed to be far more computationally expensive than standard training because adversarial examples must be constructed with iterative methods such as projected gradient descent (PGD). The authors find that a much weaker and cheaper adversary, the Fast Gradient Sign Method (FGSM), can be just as effective as PGD-based training when combined with random initialization of the perturbation.

They further show that FGSM adversarial training can be accelerated with standard techniques for efficient deep network training, such as cyclic learning rates and mixed-precision arithmetic. The resulting pipeline trains robust classifiers remarkably quickly: 45% robust accuracy on CIFAR10 (ε = 8/255) in 6 minutes, and 43% robust accuracy on ImageNet (ε = 2/255) in 12 hours, compared to roughly 80 and 50 hours for previous methods.

The paper also identifies a failure mode called "catastrophic overfitting," which may explain why previous attempts at FGSM adversarial training failed. The authors provide detailed experimental results and discuss the implications of their findings for the field of adversarial robustness.
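The core trick is that the single FGSM step starts from a random point inside the ε-ball rather than from the clean image. Below is a minimal PyTorch-style sketch of that step, assuming inputs scaled to [0, 1]; the function name is mine, and the step size α slightly larger than ε (e.g. α = 10/255 for ε = 8/255) follows the paper's recommendation.

```python
import torch
import torch.nn.functional as F

def fgsm_rs_step(model, X, y, epsilon, alpha):
    """One FGSM step with random initialization (sketch, not the authors' exact code)."""
    # Random initialization: start uniformly inside the eps-ball, not at zero.
    delta = torch.empty_like(X).uniform_(-epsilon, epsilon)
    delta.requires_grad_(True)

    # A single gradient step on the perturbation -- the "fast" part.
    loss = F.cross_entropy(model(X + delta), y)
    loss.backward()
    delta = (delta + alpha * delta.grad.sign()).clamp(-epsilon, epsilon)

    # Project back to the valid image range (assumption: inputs in [0, 1]).
    return (X + delta.detach()).clamp(0, 1)
```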
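The speedups from cyclic learning rates and mixed precision plug into an otherwise ordinary training loop. The sketch below is one plausible instantiation of the techniques the paper names, not its exact code, using PyTorch's built-in OneCycleLR scheduler and torch.cuda.amp utilities; `model`, `train_loader`, and `epochs` are assumed to exist.

```python
import torch
import torch.nn.functional as F

opt = torch.optim.SGD(model.parameters(), lr=0.0, momentum=0.9, weight_decay=5e-4)
# Cyclic schedule: the learning rate ramps up then back down over training.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    opt, max_lr=0.2, total_steps=epochs * len(train_loader))
scaler = torch.cuda.amp.GradScaler()  # loss scaling for mixed precision

for X, y in train_loader:
    # Build adversarial examples with the cheap FGSM + random-init step above.
    X_adv = fgsm_rs_step(model, X, y, epsilon=8/255, alpha=10/255)
    opt.zero_grad()
    with torch.cuda.amp.autocast():  # forward pass in reduced precision
        loss = F.cross_entropy(model(X_adv), y)
    scaler.scale(loss).backward()
    scaler.step(opt)
    scaler.update()
    scheduler.step()  # advance the cyclic learning rate once per batch
```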