Center-based 3D Object Detection and Tracking


6 Jan 2021 | Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
CenterPoint is a center-based 3D object detection and tracking framework that represents objects as points rather than as 3D bounding boxes. A first stage detects object centers with a keypoint detector and regresses the remaining attributes, such as 3D size, orientation, and velocity; a second stage refines these estimates using additional point features on the object. With this representation, tracking simplifies to greedy closest-point matching. The framework is compatible with any 3D encoder and improves all of them, and the end-to-end detection and tracking system runs near real-time, at 11 FPS on Waymo and 16 FPS on nuScenes.

CenterPoint achieves state-of-the-art performance on the nuScenes benchmark for both 3D detection and tracking, with 65.5 NDS and 63.8 AMOTA for a single model. On the Waymo Open Dataset, it outperforms all previous single-model methods by a large margin and ranks first among all Lidar-only submissions.

Evaluated on two large datasets, the Waymo Open Dataset and nuScenes, a simple switch from the box representation to the center-based representation yields a 3-4 mAP increase in 3D detection across different backbones, and two-stage refinement brings an additional 2 mAP boost with small computational overhead. The best single model achieves 71.8 and 66.4 level 2 mAPH for vehicle and pedestrian detection on Waymo, and 58.0 mAP and 65.5 NDS on nuScenes, outperforming all published methods on both datasets. CenterPoint was adopted in 3 of the top 4 winning entries in the NeurIPS 2020 nuScenes 3D Detection challenge.
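The first-stage idea above can be sketched in a few lines: find local maxima in a per-class center heatmap, then read the regressed attributes at each peak. This is a minimal, illustrative decoder, not the authors' implementation; the function name, the 3x3 peak rule (a stand-in for the max-pooling NMS used by keypoint detectors), and the score threshold are all assumptions.

```python
import numpy as np

def decode_centers(heatmap, reg_maps, score_thresh=0.3):
    """Extract object centers from one class's heatmap and read off
    regressed attributes at each center (simplified sketch).

    heatmap:  (H, W) array of center confidences.
    reg_maps: dict of (H, W, C) arrays, e.g. {"size": ..., "rot": ..., "vel": ...}.
    """
    H, W = heatmap.shape
    # A peak is a pixel that is >= all values in its 3x3 neighborhood.
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    neighborhoods = np.stack([padded[dy:dy + H, dx:dx + W]
                              for dy in range(3) for dx in range(3)])
    is_peak = heatmap >= neighborhoods.max(axis=0)
    ys, xs = np.where(is_peak & (heatmap > score_thresh))

    detections = []
    for y, x in zip(ys, xs):
        det = {"center": (x, y), "score": float(heatmap[y, x])}
        # Each detection inherits the regressed attributes at its center pixel.
        for name, reg in reg_maps.items():
            det[name] = reg[y, x].tolist()
        detections.append(det)
    return detections
```

In the real model the heatmap and regression maps come from a shared backbone over a bird's-eye-view feature grid, and peaks are converted back to metric coordinates; both steps are omitted here for brevity.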
For 3D tracking, the model performs at 63.8 AMOTA, outperforming the prior state of the art by 8.8 AMOTA on nuScenes. On the Waymo 3D tracking benchmark, it achieves 59.4 and 56.6 level 2 MOTA for vehicle and pedestrian tracking, respectively, surpassing previous methods by up to 50%.
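The greedy closest-point matching that the tracking results rest on can be sketched as follows: each detection's center is projected back to the previous frame using its negated velocity estimate and matched to the nearest unclaimed track. This is a hedged sketch of the idea, not the authors' code; the function name, dict layout, and distance threshold are illustrative, and a full system would also process detections in descending score order and age out stale tracks.

```python
import numpy as np

def greedy_match(tracks, detections, dist_thresh=2.0):
    """Greedy closest-point association for center-based tracking.

    tracks:     list of dicts with "id" and "center" (x, y) from the last frame.
    detections: list of dicts with "center" (x, y) and "velocity" (vx, vy).
    Returns (matches, unmatched_detection_indices).
    """
    claimed = set()
    matches, unmatched = [], []
    for di, det in enumerate(detections):
        # Project the detected center back to the previous frame by
        # applying the negated velocity estimate.
        prev = np.asarray(det["center"]) - np.asarray(det["velocity"])
        best, best_dist = None, dist_thresh
        for ti, t in enumerate(tracks):
            if ti in claimed:
                continue  # each track is matched at most once
            d = np.linalg.norm(prev - np.asarray(t["center"]))
            if d < best_dist:
                best, best_dist = ti, d
        if best is None:
            unmatched.append(di)  # spawns a new track
        else:
            claimed.add(best)
            matches.append((tracks[best]["id"], di))
    return matches, unmatched
```

Because association reduces to a nearest-center lookup, no learned matching network or Kalman filter is needed, which is what keeps the end-to-end system near real-time.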