5 May 2017 | Xiaowei Huang, Marta Kwiatkowska, Sen Wang and Min Wu
Deep neural networks have achieved impressive results in image classification, but are vulnerable to adversarial perturbations—small changes to the input image that can cause misclassification. This raises safety concerns, especially for applications like self-driving cars. The authors develop an automated verification framework for feed-forward multi-layer neural networks using Satisfiability Modulo Theories (SMT). They focus on ensuring that the network's classification decisions remain invariant under small image manipulations, such as scratches or changes in camera angle or lighting conditions. The framework enables an exhaustive search of a region around the input image by discretization, and propagates the analysis through the network layer by layer. It works directly with the network code and guarantees the discovery of adversarial examples within the searched region if they exist.
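The core idea of exhaustive search by discretization can be illustrated with a minimal sketch. This is not the paper's SMT encoding; it is a plain grid enumeration over perturbations of a toy two-input linear classifier, with hand-picked illustrative weights, to show why discretizing the region guarantees that an adversarial example is found whenever one exists at the chosen grid resolution.

```python
from itertools import product

def classify(x):
    """Toy two-class linear 'network' with illustrative, hand-picked weights."""
    w = [(1.0, -0.5), (-0.3, 0.8)]  # one weight vector per class
    scores = [wx * x[0] + wy * x[1] for wx, wy in w]
    return scores.index(max(scores))

def find_adversarial(x, radius, step):
    """Exhaustively enumerate a discretized grid of perturbations around x.

    Returns a perturbed input that changes the classification, or None if
    the whole region is safe at this grid resolution.
    """
    original = classify(x)
    n = int(radius / step)
    offsets = [i * step for i in range(-n, n + 1)]
    for dx, dy in product(offsets, repeat=2):
        candidate = (x[0] + dx, x[1] + dy)
        if classify(candidate) != original:
            return candidate  # adversarial example found
    return None  # region exhaustively checked: safe at this resolution

# An input far from the decision boundary survives small perturbations,
# while one near the boundary yields an adversarial example.
print(find_adversarial((2.0, 0.0), 0.25, 0.25))  # None: region is safe
print(find_adversarial((1.0, 0.5), 0.5, 0.25))   # a misclassified perturbation
```

Because every grid point is checked, the search is complete with respect to the discretization: a `None` result certifies invariance over the grid, which is the flavor of guarantee the paper's layer-by-layer SMT analysis provides, but obtained there symbolically rather than by brute-force enumeration.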
The techniques are implemented using Z3 and evaluated on state-of-the-art networks, including regularized and deep learning networks. The paper also compares the verification techniques against existing methods for finding adversarial examples and estimating network robustness.