[slides] DeepXplore: Automated Whitebox Testing of Deep Learning Systems

DeepXplore is a whitebox testing framework for systematically testing deep learning (DL) systems, particularly in safety- and security-critical domains such as self-driving cars and malware detection. It addresses the difficulty of testing large-scale DL systems in two ways: it introduces neuron coverage, a metric for how much of a system's logic a set of test inputs exercises, and it uses multiple similar DL systems as cross-referencing oracles to identify erroneous behaviors. DeepXplore formulates the search for inputs that both maximize neuron coverage and expose differential behaviors as a joint optimization problem, which it solves efficiently with gradient-based search. The framework was evaluated on five popular datasets, including ImageNet and Udacity self-driving challenge data, and identified thousands of incorrect corner-case behaviors in state-of-the-art DL models. On average, DeepXplore generated one test input demonstrating incorrect behavior within one second on a commodity laptop. The generated test inputs can also be used to retrain the corresponding DL models, improving classification accuracy by up to 3%. The main contributions are the neuron coverage metric, the formulation of the joint optimization problem, and an implementation of the framework that tests large-scale DL systems efficiently.
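The two ideas in the summary, neuron coverage and cross-referencing between similar models, can be illustrated with a minimal NumPy sketch. This is not the DeepXplore implementation: the toy fully connected ReLU networks, the helper names (`hidden_activations`, `neuron_coverage`, `predict`, `models_disagree`), the activation threshold, and the use of the last layer's activations as class scores are all assumptions made for illustration. The gradient-based search that generates new inputs is omitted.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hidden_activations(weights, x):
    """Return the ReLU activations of every layer for a single input x.

    `weights` is a list of (W, b) pairs; this toy representation is an
    assumption for the sketch, not DeepXplore's model format.
    """
    acts = []
    h = x
    for W, b in weights:
        h = relu(W @ h + b)
        acts.append(h)
    return acts

def neuron_coverage(weights, inputs, threshold=0.0):
    """Fraction of neurons whose activation exceeds `threshold`
    for at least one input in `inputs`."""
    covered = [np.zeros(W.shape[0], dtype=bool) for W, _ in weights]
    for x in inputs:
        for mask, act in zip(covered, hidden_activations(weights, x)):
            mask |= act > threshold  # in-place update of the per-layer mask
    total = sum(m.size for m in covered)
    return sum(int(m.sum()) for m in covered) / total

def predict(weights, x):
    """Predicted class: argmax over the last layer's activations
    (a simplification assumed for this sketch)."""
    return int(np.argmax(hidden_activations(weights, x)[-1]))

def models_disagree(models, x):
    """Cross-referencing oracle: True if the models do not all
    predict the same class for x."""
    return len({predict(w, x) for w in models}) > 1

# Two hypothetical one-layer models over 2-dimensional inputs.
identity_net = [(np.eye(2), np.zeros(2))]
swapped_net = [(np.array([[0.0, 1.0], [1.0, 0.0]]), np.zeros(2))]

tests = [np.array([1.0, 0.0])]
print("coverage:", neuron_coverage(identity_net, tests))
print("disagree:", models_disagree([identity_net, swapped_net], tests[0]))
```

An input on which the models disagree is a candidate erroneous behavior, and adding inputs that activate previously inactive neurons raises the coverage figure. The summary's joint optimization searches for inputs that do both at once.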

DeepXplore: Automated Whitebox Testing of Deep Learning Systems

2017 | Kexin Pei*, Yinzhi Cao†, Junfeng Yang*, Suman Jana*
