[slides and audio] Mining anomalies using traffic feature distributions

This paper presents a method for detecting and classifying network anomalies using traffic feature distributions. The authors argue that the distributions of packet features (IP addresses and ports) observed in flow traces reveal both the presence and the structure of a wide range of anomalies. Using entropy as a summarization tool, they show that analyzing feature distributions leads to significant advances in anomaly detection and classification. They validate their claims on data from two backbone networks (Abilene and Geant) and conclude that feature distributions show promise as a key element of a general network anomaly diagnosis framework.

For detection, the method uses entropy to track changes in the distribution of traffic features. The authors show that entropy captures anomalies distinct from those captured by traffic volume (such as bytes or packets per unit time), and that the approach is particularly effective at detecting network-wide anomalies that span multiple flows.

For classification, traffic feature distributions allow anomalies to be grouped without incorporating detailed prior knowledge or rules. The methods employ tools from unsupervised machine learning and benefit from the large amount of traffic data available in IP networks. Anomalies detected in Abilene and Geant fall naturally into distinct clusters, even with simple clustering methods, and the clusters are semantically meaningful: they separate anomalies according to their internal structure. The approach uncovered new anomalies in Abilene and successfully detected and classified external anomalies injected into the Abilene and Geant traffic.

The authors conclude that their methods are practical and rely only on sampled flow data. Their objective, however, is not to deliver a fully automatic anomaly diagnosis system but to demonstrate the utility of new primitives and techniques that a future system could exploit to diagnose anomalies.
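
To make the idea of entropy as a summary of a feature distribution more concrete, here is a minimal sketch that computes the sample entropy of the destination ports seen in one time bin. The function name `sample_entropy` and the toy port lists are illustrative assumptions, not taken from the paper. The sketch covers only the per-bin summary; the network-wide detection and clustering steps described above are not reproduced.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    `values` is any iterable of hashable feature observations, for example
    the destination port of every packet seen in one time bin.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over observed values of p_i * log2(1 / p_i), with p_i = n_i / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical time bins, both with exactly 100 packets, so a purely
# volume-based metric (packet count) cannot tell them apart.
normal_ports = [80] * 50 + [443] * 30 + [53] * 15 + [22] * 5
scan_ports = list(range(1024, 1124))  # one packet to each of 100 distinct ports

print("packets:", len(normal_ports), len(scan_ports))
print("normal dst-port entropy:", sample_entropy(normal_ports))
print("scan   dst-port entropy:", sample_entropy(scan_ports))
```

In this toy setup the two bins carry the same number of packets, yet the dispersed (scan-like) bin has a much higher destination-port entropy than the concentrated one. This is the kind of distributional change that a volume metric alone would miss.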

2005-10-05 | Anukool Lakhina, Mark Crovella, and Christophe Diot