[slides and audio] MagNet: A Two-Pronged Defense against Adversarial Examples

MagNet is a defense framework designed to protect neural network classifiers against adversarial examples. It consists of two main components: a detector and a reformer. The detector learns to differentiate between normal and adversarial examples by approximating the manifold of normal examples, while the reformer moves adversarial examples towards this manifold. MagNet does not modify the target classifier and requires no knowledge of the process for generating adversarial examples. It is effective against various attacks, including blackbox and graybox attacks, without significantly increasing the false positive rate on normal examples. The authors also propose using diversity to enhance the defense against graybox attacks, inspired by cryptographic techniques. Empirical results show that MagNet achieves high accuracy on adversarial examples generated by state-of-the-art attacks on datasets like MNIST and CIFAR-10.
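The detector and reformer roles described above can be illustrated with a small toy sketch. It is not the authors' implementation: a straight line fitted to 2-D points stands in for the learned manifold of normal examples, the distance to that line stands in for the detector's score, and projection onto the line stands in for the reformer. All names here (`fit_manifold`, `reform`, `pick_threshold`, `defend`) and the choice of picking the threshold from a target false positive rate on held-out normal data are assumptions made for illustration.

```python
import math
import random


def fit_manifold(points, iters=100):
    """Fit a 1-D linear stand-in for the manifold: mean plus principal direction."""
    dim, n = len(points[0]), len(points)
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in points]
    direction = [1.0] * dim
    for _ in range(iters):
        # Power iteration: multiply the unnormalised covariance by the direction.
        new = [0.0] * dim
        for c in centered:
            dot = sum(ci * di for ci, di in zip(c, direction))
            for i in range(dim):
                new[i] += dot * c[i]
        norm = math.sqrt(sum(v * v for v in new))
        direction = [v / norm for v in new]
    return mean, direction


def reform(x, manifold):
    """Reformer stand-in: move x onto the fitted line (its nearest point)."""
    mean, direction = manifold
    t = sum((xi - mi) * di for xi, mi, di in zip(x, mean, direction))
    return [mi + t * di for mi, di in zip(mean, direction)]


def reconstruction_error(x, manifold):
    """Detector score: distance between x and its reformed version."""
    return math.dist(x, reform(x, manifold))


def pick_threshold(validation, manifold, false_positive_rate=0.05):
    """Choose a threshold that rejects about the given share of normal data."""
    errors = sorted(reconstruction_error(x, manifold) for x in validation)
    index = min(len(errors) - 1, int((1.0 - false_positive_rate) * len(errors)))
    return errors[index]


def defend(x, classifier, manifold, threshold):
    """Reject inputs far from the manifold; otherwise reform, then classify."""
    if reconstruction_error(x, manifold) > threshold:
        return None  # flagged as adversarial
    return classifier(reform(x, manifold))


# Hypothetical toy data: normal examples lie near the line y = x.
rng = random.Random(0)


def sample_normal():
    t = rng.uniform(-1.0, 1.0)
    return [t + rng.gauss(0.0, 0.02), t + rng.gauss(0.0, 0.02)]


train = [sample_normal() for _ in range(500)]
validation = [sample_normal() for _ in range(200)]


def classifier(x):
    """The target classifier is used as-is and never modified."""
    return 1 if x[0] + x[1] > 0 else 0


manifold = fit_manifold(train)
threshold = pick_threshold(validation, manifold)

far_input = [1.0, -1.0]        # large perturbation, far from the line
near_input = [0.505, 0.495]    # small perturbation, close to the line

print(defend(far_input, classifier, manifold, threshold))
print(defend(near_input, classifier, manifold, threshold))
```

In this sketch the large perturbation is caught by the detector, while the small one passes the detector and is pulled back onto the line before classification. Neither step needs to know how the perturbation was produced, and the classifier itself is left untouched.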

MagNet: a Two-Pronged Defense against Adversarial Examples

11 Sep 2017 | Dongyu Meng, Hao Chen