2 Jan 2024 | Hongyu Wang, Xiaotao Liu*, Yifan Li, Meng Sun, Dian Yuan, Jing Liu
The paper introduces a novel Temporal Adaptive RGBT Tracking (TATrack) framework, which aims to enhance the performance of RGBT tracking by effectively utilizing both spatial and temporal information. Traditional RGBT trackers often struggle with object state changes and lack comprehensive exploitation of temporal information. TATrack addresses these issues by incorporating a spatio-temporal two-stream structure and an online updated template. The two-stream structure includes multi-modal feature extraction and cross-modal interaction for the initial and online templates, respectively. The paper also introduces a Spatio-Temporal Interaction (STI) mechanism to enable cross-modal interaction over longer time scales, enhancing the discriminative power of the model.
Extensive experiments on three popular RGBT tracking benchmarks (RGBT210, RGBT234, LasHeR) demonstrate that TATrack achieves state-of-the-art performance while maintaining real-time speed. The method's effectiveness is further validated through ablation studies and attribute-based performance evaluations, showing superior performance in various challenging scenarios.
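The summary does not say how the online template is refreshed during tracking. As a minimal sketch only, the code below assumes one common policy for template-based trackers: keep the initial template fixed and replace the online template at a fixed frame interval when the prediction confidence is high. The names `TemplateBank`, `update_interval`, and `conf_threshold`, and their default values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TemplateBank:
    """Holds a fixed initial template and a replaceable online template.

    Illustrative only: the update rule is an assumption, not TATrack's.
    """
    initial: Any                      # template cropped from the first frame
    online: Optional[Any] = None      # most recent trusted template
    update_interval: int = 25         # hypothetical: frames between update attempts
    conf_threshold: float = 0.7       # hypothetical: minimum confidence to accept

    def __post_init__(self) -> None:
        # Until a trusted update arrives, the online template mirrors the initial one.
        if self.online is None:
            self.online = self.initial

    def maybe_update(self, frame_idx: int, candidate: Any, confidence: float) -> bool:
        """Replace the online template on interval frames with confident predictions."""
        due = frame_idx > 0 and frame_idx % self.update_interval == 0
        if due and confidence >= self.conf_threshold:
            self.online = candidate
            return True
        return False
```

In this sketch the initial template is never overwritten, so a bad online update cannot erase the original target appearance; only the online slot adapts to object state changes.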