21 Feb 2017 | Jan Hendrik Metzen & Tim Genewein & Volker Fischer & Bastian Bischoff
This paper presents a method for detecting adversarial perturbations in deep neural networks. The approach augments the main classification network with a small "detector" subnetwork that is trained on a binary classification task: distinguishing genuine inputs from inputs containing adversarial perturbations. The method is orthogonal to previous approaches that focus on making the classification network itself more robust. The results show that adversarial perturbations can be detected surprisingly well even though they are quasi-imperceptible to humans, and that the detectors generalize to similar and weaker adversaries. The paper also proposes an adversarial attack that fools both the classifier and the detector, along with a novel training procedure for the detector that counteracts this attack.
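To make the setup concrete, here is a minimal sketch of a classifier with an attached detector branch. The PyTorch modules, layer sizes, and the point at which the detector taps into the classifier's intermediate features are illustrative assumptions, not the architecture used in the paper.

```python
# Minimal sketch: a small convolutional "detector" branch attached to an
# intermediate feature map of a CIFAR10-sized classifier. The detector outputs
# a single logit: adversarial vs. genuine. All sizes are illustrative.
import torch
import torch.nn as nn


class Classifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Feature extractor whose output also feeds the detector branch.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, num_classes),
        )

    def forward(self, x):
        feats = self.features(x)
        logits = self.head(feats)
        return logits, feats  # expose intermediate features for the detector


class Detector(nn.Module):
    """Binary classifier on intermediate features: adversarial (1) vs. genuine (0)."""

    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, 1),  # single logit: "is this input adversarial?"
        )

    def forward(self, feats):
        return self.net(feats)


if __name__ == "__main__":
    classifier, detector = Classifier(), Detector()
    x = torch.randn(8, 3, 32, 32)         # batch of CIFAR10-sized images
    logits, feats = classifier(x)
    adv_logit = detector(feats)           # one detection score per image
    print(logits.shape, adv_logit.shape)  # torch.Size([8, 10]) torch.Size([8, 1])
```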
The paper first reviews adversarial examples: inputs crafted to fool a machine learning model while remaining imperceptible to humans. Such examples have been shown to transfer between different network architectures and to remain effective under real-world conditions. Several methods for generating them are considered, including the fast (gradient sign) method, the basic iterative method, and DeepFool. The detector network is then trained to classify inputs as regular or adversarial, using a balanced dataset of original data and adversarial examples generated from it. Finally, the paper addresses dynamic adversaries, which have access to both the classifier and the detector, and proposes dynamic adversary training, in which adversarial examples are generated against the current state of the detector during training, to make the detector resistant to them.
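As a rough illustration of how a balanced detector training set could be assembled, the following sketch generates adversarial counterparts with a one-step fast gradient sign attack. It reuses the hypothetical Classifier/Detector modules from the sketch above; the epsilon value and the training-step comment are assumptions, not the paper's exact hyperparameters.

```python
# Sketch of building a balanced detector training batch with the fast
# (gradient sign) method; `classifier` is assumed to return (logits, feats).
import torch
import torch.nn.functional as F


def fgsm(classifier, x, y, eps=8 / 255):
    """One-step fast method: perturb x in the sign of the classification-loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits, _ = classifier(x_adv)
    loss = F.cross_entropy(logits, y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()


def detector_batch(classifier, x, y):
    """Balanced batch: originals labeled 0 (genuine), FGSM examples labeled 1 (adversarial)."""
    x_adv = fgsm(classifier, x, y)
    inputs = torch.cat([x, x_adv], dim=0)
    labels = torch.cat([torch.zeros(len(x)), torch.ones(len(x_adv))])
    return inputs, labels


# A detector training step could then minimize binary cross-entropy on the
# detector's single logit (hypothetical usage, detector defined as above):
#   inputs, labels = detector_batch(classifier, x, y)
#   _, feats = classifier(inputs)
#   loss = F.binary_cross_entropy_with_logits(detector(feats).squeeze(1), labels)
```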
The paper presents experimental results on the CIFAR10 and ImageNet datasets, showing that the detector identifies adversarial examples with high accuracy even when the perturbations are small and quasi-imperceptible to humans. The detector also generalizes well to similar and weaker adversaries, and it is more robust to dynamic adversaries when trained with the dynamic adversary training procedure. The paper concludes that the proposed method for detecting adversarial perturbations is effective and can be used to improve the robustness of machine learning systems.
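For completeness, here is a hedged sketch of a dynamic adversary: a one-step attack whose gradient combines the classifier's cross-entropy loss with the cost of the detector's "adversarial" label, so the perturbation both misleads the classifier and evades the detector. The weighting sigma, epsilon, and the reuse of the hypothetical modules above are assumptions rather than the paper's exact formulation; dynamic adversary training would regenerate such examples against the current detector at each training step.

```python
# Sketch of a one-step dynamic attack against classifier and detector jointly.
import torch
import torch.nn.functional as F


def dynamic_fgsm(classifier, detector, x, y, eps=8 / 255, sigma=0.5):
    """Raise the classifier's loss while pushing the detector away from 'adversarial'."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits, feats = classifier(x_adv)
    cls_loss = F.cross_entropy(logits, y)        # misleading the classifier
    det_logit = detector(feats).squeeze(1)
    # Cost of the detector's correct label "adversarial" (1); maximizing it
    # drives the detector toward predicting "genuine".
    det_cost = F.binary_cross_entropy_with_logits(
        det_logit, torch.ones_like(det_logit))
    combined = (1 - sigma) * cls_loss + sigma * det_cost
    grad, = torch.autograd.grad(combined, x_adv)
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()
```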