20 Sep 2016 | Martin Danelljan, Student Member, IEEE, Gustav Häger, Student Member, IEEE, Fahad Shahbaz Khan, Member, IEEE, and Michael Felsberg, Senior Member, IEEE
This paper addresses accurate and robust scale estimation in visual object tracking. The authors propose the Discriminative Scale Space Tracker (DSST), a scale-adaptive tracking approach that learns separate discriminative correlation filters for translation and scale estimation. The scale filter is learned online from target appearance samples taken at different scales, so it directly models the appearance changes induced by scale variation. The method is computationally efficient, achieving a 50% higher frame rate than exhaustive scale search. Experiments on the OTB and VOT2014 datasets show that it outperforms 19 state-of-the-art trackers on OTB and 37 on VOT2014, with a 2.5% gain in average overlap precision. A fast variant, fDSST, further improves performance by up to 7.0% in mean overlap precision and 4.4% in mean distance precision while running at twice the speed of DSST. Together, the evaluations indicate that the approach combines strong tracking accuracy with real-time efficiency.
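
To make the idea of a separately learned scale filter more concrete, the following is a minimal sketch of a one-dimensional correlation filter over the scale axis, written with NumPy. It is not the authors' implementation. The class name `ScaleFilter1D`, its parameters (`num_scales`, `sigma`, `lam`, `lr`), the scale step of 1.02, and the use of random vectors in place of real image features are all assumptions made for illustration. The filter is trained so that features sampled at the current target scale produce a Gaussian-shaped response peaked at the centre; at detection time, the offset of the peak indicates how many scale steps the target has moved.

```python
import numpy as np


class ScaleFilter1D:
    """Hypothetical 1-D correlation filter along the scale axis (illustrative only)."""

    def __init__(self, num_scales=33, sigma=1.0, lam=1e-2, lr=0.025):
        # All default values are illustrative assumptions.
        self.num_scales = num_scales
        self.lam = lam  # regularisation weight
        self.lr = lr    # online learning rate
        # Desired response: a Gaussian peaked at the centre scale.
        n = np.arange(num_scales) - num_scales // 2
        self.G = np.fft.fft(np.exp(-0.5 * (n / sigma) ** 2))
        self.A = None   # filter numerator, one row per feature dimension
        self.B = None   # filter denominator, shared across dimensions

    def update(self, feats):
        """feats has shape (feature_dim, num_scales): one feature vector per scale."""
        F = np.fft.fft(feats, axis=1)
        A_new = np.conj(self.G)[None, :] * F
        B_new = np.sum(np.abs(F) ** 2, axis=0)
        if self.A is None:
            self.A, self.B = A_new, B_new
        else:
            # Running average so the filter adapts online to appearance changes.
            self.A = (1 - self.lr) * self.A + self.lr * A_new
            self.B = (1 - self.lr) * self.B + self.lr * B_new

    def detect(self, feats):
        """Return the scale-step offset (0 means no scale change)."""
        Z = np.fft.fft(feats, axis=1)
        spectrum = np.sum(np.conj(self.A) * Z, axis=0) / (self.B + self.lam)
        response = np.real(np.fft.ifft(spectrum))
        return int(np.argmax(response)) - self.num_scales // 2


# Usage with synthetic features (random stand-ins for real image descriptors).
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((8, 33))
scale_filter = ScaleFilter1D()
scale_filter.update(train_feats)

# Simulate a scale change by circularly shifting the samples along the scale axis.
test_feats = np.roll(train_feats, 3, axis=1)
step_offset = scale_filter.detect(test_feats)
relative_scale = 1.02 ** step_offset  # 1.02 is an assumed scale step
print(step_offset, relative_scale)
```

Because training and detection operate on a single short axis in the Fourier domain, the cost per frame is a handful of small FFTs, which is consistent with the efficiency argument made for a dedicated scale filter over exhaustive scale search. A real tracker would replace the random features with descriptors extracted from image patches resized to each candidate scale.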