Tracking Objects as Points

CenterTrack is a simple, fast, and accurate method for simultaneous object detection and tracking. It represents each object as a single point at the center of its bounding box and tracks these points through time. The detection model takes two consecutive frames and a heatmap of prior tracklets as input. For each detected object it predicts an offset vector from the current center to the object's center in the previous frame, and this offset is used for association. CenterTrack is end-to-end trainable and differentiable, and it achieves high performance on the MOT17 and KITTI tracking benchmarks. It also extends to monocular 3D tracking, achieving 28.3% AMOTA@0.2 on the nuScenes benchmark.

The method is purely local: it associates objects only across adjacent frames and does not reinitialize lost long-range tracks. Even so, it outperforms more complex tracking-by-detection strategies on multiple benchmarks. Conditioning the detector on the heatmap of prior tracklets lets it reason about occluded objects and improves detection accuracy. Training uses data augmentation so that the model tolerates missing tracklets and false positives in its input, and the model can be trained on video or on static image data. Because association relies only on a simple displacement prediction, the method is efficient enough for real-time tracking.
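The offset-based association described above can be illustrated with a small greedy matcher. This is a minimal sketch, not the authors' implementation. The function name `greedy_associate`, the dictionary keys, and the matching radius (taken here as the geometric mean of box width and height) are assumptions made for illustration. Following the summary's wording, the offset is assumed to point from the current center to the previous-frame center, so the predicted previous position is the current center plus the offset.

```python
import math

def greedy_associate(detections, prev_tracks, next_id):
    """Assign a track id to each current detection (illustrative sketch).

    detections:  list of dicts with 'center' (x, y), 'size' (w, h),
                 'score', and 'offset' (dx, dy) toward the previous frame.
    prev_tracks: list of dicts with 'center' (x, y) and 'track_id'.
    next_id:     first unused track id.
    Returns (ids, next_id), where ids[i] belongs to detections[i].
    """
    # Visit detections from most to least confident.
    order = sorted(range(len(detections)),
                   key=lambda i: detections[i]["score"], reverse=True)
    used = set()                      # indices of already matched prior tracks
    ids = [None] * len(detections)

    for i in order:
        det = detections[i]
        cx, cy = det["center"]
        dx, dy = det["offset"]
        # Assumed convention: offset points from current to previous center.
        px, py = cx + dx, cy + dy
        w, h = det["size"]
        radius = math.sqrt(w * h)     # assumed matching radius

        best_j, best_dist = None, radius
        for j, trk in enumerate(prev_tracks):
            if j in used:
                continue
            tx, ty = trk["center"]
            dist = math.hypot(px - tx, py - ty)
            if dist < best_dist:
                best_j, best_dist = j, dist

        if best_j is None:
            # No unmatched prior track within the radius: start a new track.
            ids[i] = next_id
            next_id += 1
        else:
            used.add(best_j)
            ids[i] = prev_tracks[best_j]["track_id"]

    return ids, next_id


# Toy example with made-up coordinates.
dets = [
    {"center": (10, 10), "size": (4, 4), "score": 0.9, "offset": (-2, 0)},
    {"center": (50, 50), "size": (4, 4), "score": 0.8, "offset": (0, 0)},
]
prev = [{"center": (8, 10), "track_id": 7}]
ids, next_id = greedy_associate(dets, prev, next_id=8)
print(ids, next_id)
```

In the toy example the first detection lands exactly on the prior track after applying its offset, so it inherits that track's id, while the second has no unmatched prior track nearby and starts a new one. Because matching only looks one frame back, a track that goes unmatched is simply dropped, which mirrors the purely local behavior noted in the summary.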

21 Aug 2020 | Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl