20 Nov 2019 | Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein
Adversarial training, which trains a neural network directly on adversarial examples, is a key defense against adversarial attacks. However, generating strong adversarial examples is computationally expensive, which makes standard adversarial training impractical for large-scale tasks like ImageNet. The authors propose a "free" adversarial training algorithm that eliminates the overhead of generating adversarial examples by reusing the gradient information already computed for the model parameter updates. The method achieves robustness comparable to PGD adversarial training on CIFAR-10 and CIFAR-100 at minimal additional cost and is significantly faster than other strong adversarial training methods. Using a single workstation with 4 P100 GPUs and 2 days of runtime, the authors train a robust ImageNet classifier that maintains 40% accuracy against PGD attacks.
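To make the cost argument concrete, here is a minimal PyTorch sketch of standard K-step PGD adversarial training; the hyperparameters (ε = 8/255, step size 2/255, K = 7) and function names are illustrative assumptions, not values taken from this summary. The point is that every minibatch requires K extra forward/backward passes just to construct the adversarial examples before the usual parameter update.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, epsilon=8/255, step=2/255, k=7):
    """Build K-step PGD adversarial examples; each step is a full forward/backward pass."""
    delta = torch.zeros_like(images).uniform_(-epsilon, epsilon)
    for _ in range(k):                                       # K gradient computations per batch
        delta.requires_grad_(True)
        loss = F.cross_entropy(model((images + delta).clamp(0, 1)), labels)
        grad, = torch.autograd.grad(loss, delta)             # gradient w.r.t. the perturbation only
        delta = (delta + step * grad.sign()).clamp(-epsilon, epsilon).detach()
    return (images + delta).clamp(0, 1)

def pgd_adv_train_epoch(model, loader, optimizer):
    """One epoch of standard PGD adversarial training (illustrative sketch)."""
    model.train()
    device = next(model.parameters()).device
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv = pgd_attack(model, images, labels)              # the expensive inner loop
        optimizer.zero_grad()
        F.cross_entropy(model(adv), labels).backward()       # plus one more pass for the update
        optimizer.step()
```

With K = 7, roughly eight gradient computations are spent on each minibatch where natural training spends one; this inner-loop overhead is precisely what the "free" method removes.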
The "free" adversarial training algorithm updates both model parameters and image perturbations using a single backward pass, rather than separate gradient computations for each update step. This approach has the same computational cost as conventional natural training and is 3-30 times faster than previous adversarial training methods. The algorithm is applied to the ImageNet classification task, achieving 40% accuracy against non-targeted PGD attacks. The method is the first to successfully train a robust model for ImageNet based on the non-targeted formulation and achieves results competitive with previous methods.
The algorithm is tested on CIFAR-10 and CIFAR-100, achieving robustness comparable to PGD adversarial training with minimal computational overhead. The method is also effective on ImageNet, achieving 43% robustness against PGD attacks with ε=2. The algorithm is efficient and can be combined with other defenses to produce robust models without a slowdown. The authors conclude that adversarial training remains a trusted defense, but its high computational cost limits its use. The "free" adversarial training method offers a cost-effective alternative that achieves strong robustness.
The "free" adversarial training algorithm updates both model parameters and image perturbations using a single backward pass, rather than separate gradient computations for each update step. This approach has the same computational cost as conventional natural training and is 3-30 times faster than previous adversarial training methods. The algorithm is applied to the ImageNet classification task, achieving 40% accuracy against non-targeted PGD attacks. The method is the first to successfully train a robust model for ImageNet based on the non-targeted formulation and achieves results competitive with previous methods.