Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope


8 Jun 2018 | Eric Wong, J. Zico Kolter
This paper presents a method to train deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data. The approach constructs a convex outer bound on the set of activations reachable under such perturbations (the "adversarial polytope") and develops a robust optimization procedure that minimizes the worst-case loss over this outer region via a linear program. The dual of this linear program can itself be expressed as a deep network, so the bound can be computed efficiently with a single backward pass through the network. The resulting classifiers carry provable guarantees against any norm-bounded adversarial attack. Evaluated on MNIST, Fashion-MNIST, and human activity recognition tasks, the method achieves substantially lower certified robust error than standard training and can provably detect adversarial examples. Because the approach scales to larger networks while retaining its guarantees, it represents a substantial step forward in adversarial defense.
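To make the "convex outer bound on reachable activations" idea concrete, here is a minimal sketch in Python. It uses simple interval arithmetic, which is a looser outer bound than the paper's LP relaxation and its dual-network formulation, but illustrates the same certification logic: propagate bounds on what each activation can be under an l-infinity perturbation of radius eps, then check whether any wrong logit could overtake the true one. All layer shapes, weights, and function names here are hypothetical, not from the paper's code.

```python
# Sketch only: interval bound propagation as a coarse convex outer bound on
# the activations reachable under an l_inf perturbation of radius eps.
# The paper's method solves (the dual of) an LP for a tighter bound.
import numpy as np

def interval_bounds(W, b, lo, hi):
    """Propagate elementwise bounds [lo, hi] through the affine map x -> Wx + b."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    new_lo = W_pos @ lo + W_neg @ hi + b  # minimize each output coordinate
    new_hi = W_pos @ hi + W_neg @ lo + b  # maximize each output coordinate
    return new_lo, new_hi

def certify(layers, x, eps, true_class):
    """Return True if, under this outer bound, no l_inf perturbation of
    radius eps can make any wrong logit exceed the true logit."""
    lo, hi = x - eps, x + eps
    for W, b in layers[:-1]:
        lo, hi = interval_bounds(W, b, lo, hi)
        lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)  # ReLU is monotone
    W, b = layers[-1]
    logit_lo, logit_hi = interval_bounds(W, b, lo, hi)
    others = np.delete(logit_hi, true_class)
    return bool(logit_lo[true_class] > others.max())

# Hypothetical two-layer network: 4 inputs -> 8 hidden units -> 3 classes.
rng = np.random.default_rng(0)
layers = [(0.3 * rng.normal(size=(8, 4)), np.zeros(8)),
          (0.3 * rng.normal(size=(3, 8)), np.zeros(3))]
x = rng.normal(size=4)
print(certify(layers, x, eps=0.01, true_class=0))
```

If certify returns True, the point is certifiably robust at that radius; a False only means the outer bound is inconclusive, which is exactly why the paper's tighter LP relaxation, and training against its dual bound, matters.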