Variational Autoencoder based Anomaly Detection using Reconstruction Probability

December 27, 2015 | Jinwon An, Sungzoon Cho
This paper proposes an anomaly detection method that uses the reconstruction probability from a variational autoencoder (VAE). Traditional autoencoder and principal components analysis (PCA) based methods use reconstruction error as the anomaly score. The VAE-based method instead uses a probabilistic measure, the reconstruction probability, which accounts for the variability of the data distribution. Because the score is a probability rather than a raw error, it is a more principled and objective anomaly score. The method also leverages the generative capabilities of the VAE to reconstruct data and analyze the underlying causes of anomalies.
The paper first reviews various anomaly detection methods, including statistical, proximity-based, and deviation-based approaches. It then introduces autoencoders and their use in anomaly detection, followed by a detailed explanation of VAEs, which are probabilistic graphical models that combine variational inference with deep learning. Because the marginal likelihood of the data is intractable, the VAE's objective function is the variational lower bound of that likelihood. The VAE models the parameters of the approximate posterior distribution of the latent variables, rather than the latent values themselves, which allows for probabilistic inference.
The proposed method detects anomalies using the reconstruction probability, calculated as the expected log probability of the data under the approximate posterior distribution; data points with a low reconstruction probability are treated as anomalies. This probability is a Monte Carlo estimate of the second term in the VAE's objective function, the reconstruction term. The method is evaluated on two datasets: the MNIST dataset and the KDD Cup 1999 network intrusion dataset. The results show that the VAE-based method outperforms autoencoder and PCA-based methods in terms of the AUC ROC and AUC PRC metrics. The method is also able to capture the underlying structure of the data, making it more effective at detecting anomalies.
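The scoring step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `toy_encode` and `toy_decode` are hypothetical stand-ins for a trained VAE's encoder and decoder, the Gaussian output distribution and the threshold value are assumptions chosen for illustration, and the score averages log-likelihoods over latent samples, following the "expected log probability" wording of the summary.

```python
import numpy as np

def gaussian_log_pdf(x, mean, std):
    """Log density of a diagonal Gaussian, summed over the feature axis."""
    var = std ** 2
    return np.sum(-0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var), axis=-1)

def reconstruction_log_prob(x, encode, decode, n_samples=10, rng=None):
    """Monte Carlo estimate of E_{q(z|x)}[log p(x|z)] for a single data point."""
    rng = np.random.default_rng() if rng is None else rng
    # The encoder returns distribution parameters, not a single latent value.
    mu_z, sigma_z = encode(x)
    # Draw n_samples latent vectors from the approximate posterior q(z|x).
    z = mu_z + sigma_z * rng.standard_normal((n_samples, mu_z.shape[0]))
    total = 0.0
    for z_l in z:
        # The decoder also returns distribution parameters for the reconstruction.
        mu_x, sigma_x = decode(z_l)
        total += gaussian_log_pdf(x, mu_x, sigma_x)
    return total / n_samples

# Hypothetical stand-ins for a trained VAE (assumptions, for illustration only).
def toy_encode(x):
    # 1-D latent: posterior mean is the feature average, with a fixed small std.
    return np.array([x.mean()]), np.array([0.1])

def toy_decode(z):
    # Reconstruct all 4 features as the latent value, with a fixed std of 0.5.
    return np.full(4, z[0]), np.full(4, 0.5)

rng = np.random.default_rng(0)
normal_point = np.array([1.0, 1.1, 0.9, 1.0])
odd_point = np.array([1.0, 5.0, -3.0, 1.0])

score_normal = reconstruction_log_prob(normal_point, toy_encode, toy_decode, 50, rng)
score_odd = reconstruction_log_prob(odd_point, toy_encode, toy_decode, 50, rng)

# Hypothetical threshold: points scoring below it are flagged as anomalies.
threshold = -10.0
flag_normal = score_normal < threshold
flag_odd = score_odd < threshold
print(score_normal, flag_normal)
print(score_odd, flag_odd)
```

In this toy setup the point whose features disagree with each other receives a much lower score than the consistent point, because the decoder's output distribution assigns it low probability. With a real model, the threshold would have to be chosen from validation data rather than fixed by hand.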
The paper concludes that the VAE-based anomaly detection method is more objective and principled than traditional methods, and that its generative capabilities allow for a deeper understanding of the underlying causes of anomalies. The method is particularly effective for data with complex structures, such as the MNIST dataset, where certain digits are difficult to reconstruct. The results also show that the method performs well on the KDD dataset, especially when trained with a diverse set of data. The paper highlights the importance of using probabilistic models in anomaly detection, as they provide a more principled and objective way to assess the likelihood of data points being anomalies.