28 Apr 2024 | Ju Huang, Shiao Wang, Shuai Wang, Zhe Wu, Xiao Wang, and Bo Jiang
This paper proposes Mamba-FETrack, an RGB-Event tracking framework built on the State Space Model (SSM) that aims for high-performance tracking at reduced computational cost. Two modality-specific Mamba backbone networks extract features from the RGB frames and the Event streams, and a FusionMamba block strengthens the interaction between the two modalities. The fused features are then fed into a tracking head for target localization. In experiments on the FELT and FE108 datasets, Mamba-FETrack achieves 43.5/55.6 on the SR/PR metrics, outperforming the ViT-S based tracker (OSTrack) in both accuracy and efficiency. Its GPU memory cost is 13.98GB, a 9.5% reduction compared to the ViT-S based tracker, while FLOPs and parameters are reduced by 94.5% and 88.3%, respectively. The source code is available on GitHub. The work contributes a framework that effectively combines RGB and Event data using SSM-based Mamba networks.
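The reported percentage reductions can be translated into implied baseline figures with simple arithmetic. The sketch below is only a back-of-the-envelope check built from the numbers quoted in the summary; the helper names `implied_baseline` and `shrink_factor` are hypothetical, and the derived values are estimates from rounded percentages, not figures reported in this summary.

```python
def implied_baseline(reduced_value: float, reduction_pct: float) -> float:
    """Baseline implied by a value that is `reduction_pct` percent lower."""
    return reduced_value / (1.0 - reduction_pct / 100.0)


def shrink_factor(reduction_pct: float) -> float:
    """How many times smaller the reduced quantity is than its baseline."""
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Figures quoted in the summary.
mamba_memory_gb = 13.98
memory_reduction_pct = 9.5
flops_reduction_pct = 94.5
params_reduction_pct = 88.3

# Derived estimates (assumption: each percentage is relative to the ViT-S tracker).
baseline_memory_gb = implied_baseline(mamba_memory_gb, memory_reduction_pct)
flops_factor = shrink_factor(flops_reduction_pct)
params_factor = shrink_factor(params_reduction_pct)

print(f"Implied ViT-S tracker memory: about {baseline_memory_gb:.2f} GB")
print(f"FLOPs: roughly {flops_factor:.1f}x fewer")
print(f"Parameters: roughly {params_factor:.1f}x fewer")
```

Under these assumptions, the memory saving is modest (a baseline of roughly 15.4 GB), while the compute and model-size savings are large: on the order of 18 times fewer FLOPs and about 8.5 times fewer parameters.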