PointPillars: Fast Encoders for Object Detection from Point Clouds

7 May 2019 | Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom
PointPillars is an encoder for object detection in point clouds that enables end-to-end learning using only 2D convolutional layers. It organizes the point cloud into vertical columns (pillars), learns features on those pillars, and uses them to predict 3D oriented boxes for objects. The method outperforms previous encoders in both speed and accuracy, running at 62 Hz, a 2-4 fold improvement over existing methods. A faster version matches state-of-the-art performance at 105 Hz. On the KITTI benchmark, PointPillars significantly outperforms both lidar-only and fusion methods in 3D and bird's eye view (BEV) detection.
The method is fast because all key operations can be formulated as 2D convolutions. It requires no hand-tuning for different point cloud configurations, so it adapts to various sensor inputs. The network is trained using only lidar data and achieves state-of-the-art results on the KITTI dataset while running in real time.
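The pillar idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `pillarize`, the grid ranges, and the pillar size are hypothetical, and the two hand-crafted channels (point count and maximum height) merely stand in for the features that PointPillars learns. The sketch only shows how discretizing x and y, while ignoring z, turns a point cloud into an image-like grid.

```python
import numpy as np

def pillarize(points, x_range=(0.0, 4.0), y_range=(-2.0, 2.0), pillar_size=1.0):
    """Group points (N, 3+) into vertical pillars on an x-y grid.

    Returns a (2, H, W) bird's-eye-view array: channel 0 is the point count
    per pillar, channel 1 is the maximum z per pillar (0 where empty).
    All ranges and sizes are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)
    nx = int(round((x_range[1] - x_range[0]) / pillar_size))
    ny = int(round((y_range[1] - y_range[0]) / pillar_size))

    # Integer pillar index of every point. z is not discretized,
    # so each grid cell is a full vertical column.
    ix = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(np.int64)
    iy = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(np.int64)

    # Drop points that fall outside the grid.
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    ix, iy, z = ix[keep], iy[keep], points[keep, 2]

    count = np.zeros((ny, nx), dtype=np.float64)
    max_z = np.full((ny, nx), -np.inf, dtype=np.float64)
    np.add.at(count, (iy, ix), 1.0)     # unbuffered accumulate per pillar
    np.maximum.at(max_z, (iy, ix), z)   # per-pillar max height
    max_z[count == 0] = 0.0             # empty pillars get a neutral value
    return np.stack([count, max_z])

# Four points: two share a pillar, one is alone, one lies outside the grid.
pts = np.array([
    [0.5, -1.5, 0.2],
    [0.7, -1.2, 1.0],
    [3.5, 1.5, -0.3],
    [10.0, 0.0, 0.0],
])
bev = pillarize(pts)
print(bev.shape)
```

Because the result has a channels-by-height-by-width layout, it can be passed to an ordinary 2D convolutional network, which is the property the summary credits for the method's speed.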