ODTrack: Online Dense Temporal Token Learning for Visual Tracking

3 Jan 2024 | Yaozong Zheng1,2, Bineng Zhong1,2*, Qihua Liang1,2, Zhiyi Mo3, Shengping Zhang4, Xianxian Li1,2
ODTrack is a video-level tracking framework designed to address a limitation of traditional visual tracking methods, which often rely on sparse temporal relationships between reference and search frames. It densely associates contextual relationships across video frames using an online token propagation mechanism. This approach captures the spatio-temporal trajectory of an object by compressing discriminative features into a token sequence, which serves as a prompt for future frame inference. The method thereby uses past information to guide future inference and avoids complex online update strategies, leading to more efficient model representation and computation. ODTrack achieves state-of-the-art performance on seven benchmarks (LaSOT, TrackingNet, GOT10K, LaSOTExt, VOT2020, TNL2K, and OTB100) while running at real-time speed. The code and models are available at https://github.com/GXNU-ZhongLab/ODTrack.
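The idea of online token propagation can be illustrated with a toy sketch. This is not the ODTrack implementation: the function names (`update_token`, `track_video`), the single-head dot-product attention, and the tiny two-dimensional features are all assumptions chosen for illustration. The sketch only shows the general pattern described above, in which a token is refreshed from each frame's features and then carried forward as a prompt for the next frame.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def update_token(token, frame_features):
    """Hypothetical update step: the token attends over itself and the
    current frame's features, and is replaced by the weighted average."""
    keys = [token] + frame_features
    scale = math.sqrt(len(token))
    weights = softmax([dot(token, k) / scale for k in keys])
    dim = len(token)
    return [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]


def track_video(frames, init_token):
    """Propagate one token through the video, frame by frame.
    The token produced on frame t is the prompt used on frame t + 1."""
    token = init_token
    history = []
    for frame_features in frames:
        token = update_token(token, frame_features)
        history.append(token)
    return history


# Toy input: two frames, each with two 2-D feature vectors (made-up values).
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
]
history = track_video(frames, init_token=[1.0, 0.0])
```

In this sketch the only state carried between frames is the token itself, which mirrors the claim that past information guides future inference without a separate online update procedure. A real tracker would use learned projections and a full transformer backbone rather than raw dot products.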