A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks

27 Oct 2018 | Kimin Lee, Kibok Lee, Honglak Lee, Jinwoo Shin
This paper proposes a simple yet effective method for detecting abnormal test samples, both out-of-distribution (OOD) and adversarial, that can be applied to any pre-trained softmax neural classifier without retraining. The method interprets the classifier as a generative classifier under Gaussian discriminant analysis (GDA), fitting class-conditional Gaussian distributions to the low- and high-level features of the deep model. The confidence score for a test sample is its Mahalanobis distance to the closest class-conditional distribution. Evaluated on CIFAR, SVHN, ImageNet, and LSUN, this score outperforms existing methods at detecting both OOD samples and adversarial attacks, and it remains robust in extreme scenarios such as small training sets or noisy labels. The same framework extends to class-incremental learning, where new classes can be incorporated without retraining the deep model, and the authors note it could also benefit related tasks such as active learning, ensemble learning, and few-shot learning.
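To make the scoring procedure concrete, the following is a minimal sketch (not the authors' reference implementation) of the Mahalanobis-based confidence score described above. It assumes `features` is an (N, d) array of penultimate-layer features extracted from the training set, `labels` their class indices, and `f_x` a (d,) feature vector of a test sample; the input pre-processing and multi-layer feature ensembling from the paper are omitted.

```python
import numpy as np

def fit_class_gaussians(features, labels, num_classes):
    """Estimate per-class means and a single shared (tied) covariance,
    as in Gaussian discriminant analysis."""
    d = features.shape[1]
    means = np.zeros((num_classes, d))
    cov = np.zeros((d, d))
    for c in range(num_classes):
        fc = features[labels == c]
        means[c] = fc.mean(axis=0)
        centered = fc - means[c]
        cov += centered.T @ centered
    cov /= features.shape[0]
    # Pseudo-inverse for numerical stability when the covariance is ill-conditioned.
    return means, np.linalg.pinv(cov)

def mahalanobis_confidence(f_x, means, precision):
    """Confidence score: negative squared Mahalanobis distance to the closest
    class-conditional Gaussian. Lower scores indicate abnormal (OOD or
    adversarial) samples."""
    diffs = means - f_x                                   # (num_classes, d)
    dists = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return -dists.min()
```

A test sample is then flagged as abnormal when its confidence score falls below a threshold chosen on validation data, mirroring the thresholding used for the detection experiments reported in the paper.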