31 Dec 2018 | Bo Li*, Wei Wu*, Qiang Wang*, Fangyi Zhang, Junliang Xing, Junjie Yan
The paper "SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks" by Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, and Junjie Yan addresses a limitation of traditional Siamese trackers in visual object tracking: their inability to take full advantage of features from deep networks such as ResNet-50. The authors trace the core issue to the lack of strict translation invariance, a property these trackers rely on and that deep networks often violate because of padding. To overcome this, they propose a spatial-aware sampling strategy and a new model architecture that performs layer-wise and depth-wise aggregations. This approach significantly improves the tracker's accuracy while reducing model size. Extensive experiments on five large benchmarks (OTB2015, VOT2018, UAV123, LaSOT, and TrackingNet) demonstrate the effectiveness of the proposed SiamRPN++ tracker, which achieves state-of-the-art results. The authors also provide a fast variant that uses MobileNet as the backbone and maintains competitive performance at 70 FPS. The paper supports the proposed methods with detailed theoretical analysis, experimental validation, and ablation studies.
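The summary does not spell out what depth-wise aggregation computes. As a rough illustration only, the sketch below assumes it amounts to a per-channel ("depth-wise") cross-correlation between template features and search-region features, so each channel produces its own response map instead of all channels being summed into one. The function name `depthwise_xcorr`, the nested-list feature layout, and the toy values are hypothetical and are not taken from the paper or its code.

```python
def depthwise_xcorr(search, template):
    """Per-channel valid cross-correlation (illustrative sketch).

    search:   C x H x W nested lists (search-region features)
    template: C x h x w nested lists (template features)
    returns:  C x (H-h+1) x (W-w+1) nested lists, one response map per channel
    """
    if len(search) != len(template):
        raise ValueError("search and template need the same number of channels")
    out = []
    for s_ch, t_ch in zip(search, template):
        H, W = len(s_ch), len(s_ch[0])
        h, w = len(t_ch), len(t_ch[0])
        # Slide the template channel over the matching search channel only.
        resp = [
            [
                sum(s_ch[i + di][j + dj] * t_ch[di][dj]
                    for di in range(h) for dj in range(w))
                for j in range(W - w + 1)
            ]
            for i in range(H - h + 1)
        ]
        out.append(resp)
    return out


# Toy features: 2 channels, 3x3 search region, 2x2 template (made-up values).
search = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
]
template = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
response = depthwise_xcorr(search, template)
# response == [[[2, 0], [0, 2]], [[12, 16], [24, 28]]]
```

Because each channel is correlated independently, the operation itself has no learned weights that grow with the number of output channels, which is one plausible reading of how the architecture keeps the model small.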