1 May 2020 | Aman Sinha*1 Hongseok Namkoong*2 Riccardo Volpi3 John Duchi1,4
The paper addresses the issue of adversarial robustness in neural networks by adopting a distributionally robust optimization (DRO) approach. The authors propose a training procedure that augments model parameter updates with worst-case perturbations of training data, ensuring performance under adversarial input perturbations. For smooth losses, the procedure achieves moderate levels of robustness with minimal computational or statistical cost compared to empirical risk minimization. The method provides statistical guarantees for certifying robustness against the population loss, even for imperceptible perturbations, and matches or outperforms heuristic approaches. The paper also discusses the theoretical and empirical foundations of the approach, including the efficiency of the optimization algorithms and the generalization properties of the trained models.
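The training procedure described above can be sketched as follows: for each example, an inner gradient-ascent step finds an approximate worst-case perturbation of the input under a quadratic penalty (the Lagrangian relaxation of the DRO problem), and the outer step runs SGD on the model parameters at the perturbed point. This is a minimal illustration, not the paper's exact algorithm or experimental setup; the logistic-loss model, step sizes, penalty weight `gamma`, and iteration counts are all illustrative assumptions.

```python
import numpy as np

def logistic_loss_grad(theta, x, y):
    """Logistic loss and its gradients w.r.t. theta and x (label y in {-1, +1})."""
    margin = y * (x @ theta)
    loss = np.log1p(np.exp(-margin))
    s = -y / (1.0 + np.exp(margin))   # d(loss)/d(margin)
    return loss, s * x, s * theta     # loss, grad wrt theta, grad wrt x

def inner_maximize(theta, x, y, gamma, steps=15, lr=0.1):
    """Gradient ascent on delta for the penalized adversary:
    approximately solves  max_delta  loss(theta; x + delta) - gamma * ||delta||^2.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        _, _, gx = logistic_loss_grad(theta, x + delta, y)
        delta += lr * (gx - 2.0 * gamma * delta)
    return delta

def robust_train(X, Y, gamma=2.0, epochs=30, lr=0.1, seed=0):
    """Outer SGD loop: parameter updates use the adversarially perturbed inputs."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            delta = inner_maximize(theta, X[i], Y[i], gamma)
            _, g_theta, _ = logistic_loss_grad(theta, X[i] + delta, Y[i])
            theta -= lr * g_theta
    return theta

# Tiny linearly separable toy data (illustrative, not from the paper).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-2.0, -1.0], [-1.0, -2.0]])
Y = np.array([1.0, 1.0, -1.0, -1.0])
theta = robust_train(X, Y)
accuracy = np.mean(np.sign(X @ theta) == Y)
```

A larger penalty `gamma` shrinks the admissible perturbations (cheaper, weaker adversary), while a smaller `gamma` allows larger input shifts; the paper's guarantees apply in the regime where the penalty keeps the inner problem well-behaved for smooth losses.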