Understanding Variational Autoencoder based Anomaly Detection using Reconstruction Probability

The paper proposes an anomaly detection method based on the reconstruction probability of a variational autoencoder (VAE). The reconstruction probability is a probabilistic measure that accounts for the variability of the data distribution, which makes it more principled and objective than the reconstruction error used by autoencoder and principal component analysis (PCA) based methods. Because the VAE is a generative model, it can also produce reconstructions of the data, which help in analyzing the underlying cause of an anomaly. The VAE is a probabilistic graphical model that combines variational inference with deep learning, and this gives the method a theoretical foundation. The method is semi-supervised: the VAE is trained on normal instances only and tested on both normal and anomalous data. The reconstruction probability is computed by drawing stochastic samples from the latent variable distribution and using them to estimate the probability that the original data point was generated from the resulting output distribution.

Experimental results on the MNIST and KDD Cup 1999 network intrusion datasets show that the proposed method outperforms autoencoder and PCA based methods. The VAE-based method is particularly effective in handling anomalies with high variability, such as digits 1, 7, and 9 in the MNIST dataset, and anomaly classes R2L, U2R, and Probe in the KDD dataset. The method's ability to handle heterogeneous data and its objective scoring make it a robust and principled approach to anomaly detection.
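
The sampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_encode` and `toy_decode` are hypothetical, hand-written stand-ins for a trained encoder and decoder, the decoder is assumed to output a Gaussian with independent dimensions, and every name and constant here is chosen for the example only.

```python
import math
import random

DIM = 2  # dimensionality of the toy data (assumption for this example)


def gaussian_log_pdf(x, mean, std):
    """Log density of x under independent Gaussians, summed over dimensions."""
    return sum(
        -0.5 * math.log(2 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mean, std)
    )


def reconstruction_probability(x, encode, decode, n_samples=10, seed=0):
    """Monte Carlo estimate of the probability of x under the decoder output."""
    rng = random.Random(seed)
    mu_z, sigma_z = encode(x)  # parameters of the latent distribution for x
    total = 0.0
    for _ in range(n_samples):
        # Draw one stochastic sample from the latent variable distribution.
        z = [rng.gauss(m, s) for m, s in zip(mu_z, sigma_z)]
        # Decode it into the parameters of a distribution over the input space.
        mu_x, sigma_x = decode(z)
        # Likelihood of the original input under that distribution.
        total += math.exp(gaussian_log_pdf(x, mu_x, sigma_x))
    return total / n_samples


# Hypothetical stand-ins for trained networks: "normal" data are assumed to be
# vectors whose components are roughly equal to each other.
def toy_encode(x):
    return [sum(x) / len(x)], [0.1]


def toy_decode(z):
    return [z[0]] * DIM, [0.5] * DIM


score_normal = reconstruction_probability([1.0, 1.1], toy_encode, toy_decode)
score_anomalous = reconstruction_probability([3.0, -3.0], toy_encode, toy_decode)
print(score_normal, score_anomalous)
```

With these toy functions the point `[3.0, -3.0]` receives a much lower score than `[1.0, 1.1]`. In a semi-supervised setting, one would presumably flag a point as anomalous when its score falls below a threshold chosen from normal data; the threshold itself is not specified in this summary.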

Variational Autoencoder based Anomaly Detection using Reconstruction Probability

December 27, 2015 | Jinwon An, Sungzoon Cho