NetMamba: Efficient Network Traffic Classification via Pre-training Unidirectional Mamba

25 May 2024 | Tongze Wang, Xiaohui Xie, Wenduo Wang, Chuyi Wang, Youjian Zhao, Yong Cui
NetMamba is an efficient network traffic classification model that pre-trains a unidirectional Mamba architecture to achieve both high accuracy and high efficiency. It addresses two main shortcomings of existing methods: model inefficiency, caused by the quadratic complexity of the Transformer architecture, and inadequate traffic representation, caused by discarding important byte information. NetMamba employs a comprehensive traffic representation scheme that extracts valid information from massive traffic data while removing biased information. It achieves nearly 99% accuracy across six public datasets, improves inference speed by up to 60x while keeping memory usage low, and demonstrates superior few-shot learning ability, reaching better classification performance with fewer labeled samples. It is the first model to tailor the Mamba architecture to the networking domain.
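As a rough illustration of what a fixed-length, stride-based byte representation can look like, the following Python sketch pads or truncates a flow's raw bytes and slices them into fixed-size strides. The stride size, total length, and function names here are illustrative assumptions, not NetMamba's published parameters.

```python
import numpy as np

# Illustrative sketch of a stride-based traffic representation.
# STRIDE_SIZE and TOTAL_BYTES are assumed values, not NetMamba's exact settings.
STRIDE_SIZE = 4      # bytes per stride (one token for the sequence model)
TOTAL_BYTES = 1024   # fixed-length byte budget per flow

def bytes_to_strides(payload: bytes) -> np.ndarray:
    """Pad/truncate a flow's raw bytes and split them into fixed-size strides."""
    buf = np.frombuffer(payload, dtype=np.uint8)[:TOTAL_BYTES]
    buf = np.pad(buf, (0, TOTAL_BYTES - len(buf)))   # zero-pad short flows
    return buf.reshape(-1, STRIDE_SIZE)              # (num_strides, STRIDE_SIZE)

strides = bytes_to_strides(b"\x16\x03\x01\x02\x00" * 50)  # e.g. a TLS-like byte string
print(strides.shape)                                      # (256, 4)
```

Each resulting stride then plays the role of one token in the downstream sequence model, much like image patches in vision models.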
NetMamba processes network traffic with a stride-based representation scheme, incorporating positional embeddings and pre-training strategies. The model first undergoes self-supervised pre-training on large unlabeled datasets using a masked autoencoder structure, then is fine-tuned on limited labeled data to refine its traffic representations. Its architecture comprises an encoder and a decoder built from unidirectional Mamba blocks, with the encoder processing sequence information from front to back; for classification tasks, the decoder is replaced with a multi-layer perceptron head.

Evaluated on six public datasets, NetMamba demonstrates superior accuracy, efficiency, and robustness compared with existing methods. Its efficiency is further validated through comparisons of inference speed and GPU memory consumption, which show significant improvements over other deep learning methods. An ablation study confirms the effectiveness of the design, indicating that each component contributes to overall performance, and the model outperforms other pre-trained models when only limited labeled data are available.

NetMamba's comprehensive representation scheme and refined model design position it to address broader tasks within the network domain, such as quality-of-service prediction and network performance prediction. However, the current implementation depends on specialized GPU hardware, which limits deployment on real-world network devices; future work will explore ways to run NetMamba on resource-constrained devices.
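To make the masked-autoencoder pre-training concrete, here is a minimal, self-contained PyTorch sketch. All dimensions and the mask ratio are assumed values, and a causal GRU stands in for the paper's unidirectional Mamba blocks so the example runs without specialized kernels. Note one simplification: a true masked autoencoder encodes only the visible strides and inserts mask tokens at the decoder, whereas this version folds masking into a single sequence for brevity.

```python
import torch
import torch.nn as nn

# Minimal masked-autoencoder sketch in the spirit of NetMamba's pre-training.
# Dimensions and MASK_RATIO are illustrative assumptions; the paper's encoder
# and decoder are built from unidirectional (front-to-back) Mamba blocks.
D_MODEL, NUM_STRIDES, STRIDE_SIZE, MASK_RATIO = 64, 256, 4, 0.75

class TinyMAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(STRIDE_SIZE, D_MODEL)       # stride -> token embedding
        self.pos = nn.Parameter(torch.zeros(1, NUM_STRIDES, D_MODEL))
        # Causal GRUs stand in here for unidirectional Mamba blocks.
        self.encoder = nn.GRU(D_MODEL, D_MODEL, batch_first=True)
        self.decoder = nn.GRU(D_MODEL, D_MODEL, batch_first=True)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, D_MODEL))
        self.head = nn.Linear(D_MODEL, STRIDE_SIZE)        # reconstruct raw strides

    def forward(self, strides):                            # (B, NUM_STRIDES, STRIDE_SIZE)
        tokens = self.embed(strides) + self.pos
        mask = torch.rand(tokens.shape[:2]) < MASK_RATIO   # True = masked stride
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        hidden, _ = self.encoder(tokens)
        recon = self.head(self.decoder(hidden)[0])
        return ((recon - strides) ** 2)[mask].mean()       # loss on masked strides only

x = torch.rand(8, NUM_STRIDES, STRIDE_SIZE)                # normalized byte values
loss = TinyMAE()(x)
loss.backward()
```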
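Continuing the hypothetical sketch above, fine-tuning would discard the decoder and attach a multi-layer perceptron head to the pre-trained encoder. Pooling on the final token is one plausible readout for a front-to-back unidirectional model, not necessarily the paper's exact design.

```python
class TrafficClassifier(nn.Module):
    """Fine-tuning sketch: keep the pre-trained encoder, swap the decoder for an MLP head."""
    def __init__(self, pretrained: TinyMAE, num_classes: int):
        super().__init__()
        self.embed, self.pos = pretrained.embed, pretrained.pos
        self.encoder = pretrained.encoder                  # reuse pre-trained weights
        self.head = nn.Sequential(                         # MLP head; sizes assumed
            nn.Linear(D_MODEL, D_MODEL), nn.GELU(), nn.Linear(D_MODEL, num_classes))

    def forward(self, strides):
        hidden, _ = self.encoder(self.embed(strides) + self.pos)
        return self.head(hidden[:, -1])                    # last token summarizes a
                                                           # front-to-back (causal) pass

logits = TrafficClassifier(TinyMAE(), num_classes=10)(x)   # (8, 10) class scores
```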