24 Jun 2019 | Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan
This paper presents a theoretical analysis of the trade-off between robustness and accuracy in adversarial defense. The authors decompose the robust error into the sum of the natural error and a boundary error, and derive a differentiable upper bound on it using a classification-calibrated surrogate loss; this bound is shown to be the tightest possible upper bound holding uniformly over all probability distributions and measurable predictors. Guided by this analysis, they propose a new defense method, TRADES (TRadeoff-inspired Adversarial DEfense via Surrogate-loss minimization), which explicitly trades adversarial robustness off against accuracy. The algorithm performs well experimentally on real-world datasets and won first place in the NeurIPS 2018 Adversarial Vision Challenge, surpassing the runner-up approach by 11.41% in mean ℓ₂ perturbation distance.
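In the paper's notation, the decomposition and the resulting surrogate objective can be sketched as follows, where φ is a classification-calibrated surrogate loss, 𝔹(X, ε) is the ε-ball around X, and λ > 0 is the trade-off parameter (binary-classification form; written here from the paper's stated results, so treat it as a sketch rather than a verbatim transcription):

```latex
% Exact decomposition of the robust error into natural and boundary error:
\mathcal{R}_{\mathrm{rob}}(f) \;=\; \mathcal{R}_{\mathrm{nat}}(f) \;+\; \mathcal{R}_{\mathrm{bdy}}(f)

% TRADES surrogate objective: a natural-error term plus a boundary term
% that penalizes predictions that flip inside the epsilon-ball.
\min_{f}\; \mathbb{E}\Big\{\, \phi\big(f(X)\,Y\big)
  \;+\; \max_{X' \in \mathbb{B}(X,\varepsilon)} \phi\big(f(X)\,f(X')/\lambda\big) \Big\}
```

Small λ pushes the optimizer toward robustness, while large λ prioritizes natural accuracy, making the trade-off an explicit tuning knob.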
The paper also provides a theoretical guarantee for this trade-off, showing that the robust error can be tightly bounded by two terms: one corresponding to the natural error measured by a surrogate loss function, and the other to how likely the input features are to lie close to the ε-extension of the decision boundary. Minimizing this differentiable upper bound yields a new formulation of adversarial defense that scales to large datasets and carries theoretical guarantees. The methodology formed the basis of the authors' entry to the NeurIPS 2018 Adversarial Vision Challenge, where they won first place out of roughly 2,000 submissions.
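For multi-class problems the paper instantiates the boundary term with a KL divergence between the model's predictions on clean and perturbed inputs, approximating the inner maximization with projected gradient ascent. Below is a minimal, illustrative PyTorch sketch of such a TRADES-style loss; the function name `trades_loss`, the hyperparameter values, and the assumption that inputs are images scaled to [0, 1] are ours, not the authors' official implementation:

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, epsilon=0.031, step_size=0.007,
                num_steps=10, beta=6.0):
    """Cross-entropy on clean inputs plus beta * max_{x'} KL(f(x) || f(x')).

    beta plays the role of 1/lambda and controls the robustness-accuracy
    trade-off. All defaults are illustrative, not prescribed by the paper.
    """
    model.eval()
    # Reference (clean) prediction, held fixed during the inner maximization.
    p_natural = F.softmax(model(x), dim=1).detach()
    # Start the adversarial search from a small random perturbation.
    x_adv = x.detach() + 0.001 * torch.randn_like(x)
    for _ in range(num_steps):
        x_adv.requires_grad_()
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                      p_natural, reduction='batchmean')
        grad = torch.autograd.grad(kl, x_adv)[0]
        # Projected gradient ascent within the l_inf ball of radius epsilon.
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)  # assumes inputs in [0, 1]
    model.train()
    logits = model(x)
    # Natural (accuracy) term + robustness regularizer from the upper bound.
    loss_natural = F.cross_entropy(logits, y)
    loss_robust = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                           F.softmax(logits, dim=1), reduction='batchmean')
    return loss_natural + beta * loss_robust
```

Note that, unlike standard adversarial training, the perturbation is found by maximizing the KL term rather than the classification loss, which directly targets the boundary-error component of the bound.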