9 Apr 2021 | Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
The paper introduces PV-RCNN (PointVoxel-RCNN), a novel framework for 3D object detection from point clouds. It combines a 3D voxel Convolutional Neural Network (CNN) with PointNet-based set abstraction to learn discriminative point cloud features. The method first uses the 3D voxel CNN to encode the scene and summarize it into a small set of keypoints, retaining the voxel CNN's efficient feature learning and high-quality proposals. These keypoints are then used to abstract proposal-specific features via RoI-grid pooling, which captures richer context information for accurate object localization and confidence estimation. Extensive experiments on the KITTI and Waymo Open datasets show that PV-RCNN outperforms state-of-the-art methods by significant margins. The paper's contributions are the integration of voxel-based and point-based methods, the voxel-to-keypoint scene encoding scheme, the multi-scale RoI feature abstraction layer, and superior overall performance on challenging datasets.
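
To make the RoI-grid pooling step described above more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `roi_grid_points` and `pool_keypoint_features`, the 3×3×3 grid, and the single 1.0 m radius are all hypothetical choices, and the learned PointNet layers and multiple radii of the multi-scale abstraction layer are replaced here by a plain max-pool over raw keypoint features at one radius.

```python
import math

def roi_grid_points(center, size, yaw, grid=3):
    """Return grid**3 points uniformly spaced inside a 3D box rotated about z.

    The grid resolution is an assumed, illustrative parameter.
    """
    cx, cy, cz = center
    dx, dy, dz = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    points = []
    for i in range(grid):
        for j in range(grid):
            for k in range(grid):
                # Cell-centre offsets in the box's local frame.
                lx = ((i + 0.5) / grid - 0.5) * dx
                ly = ((j + 0.5) / grid - 0.5) * dy
                lz = ((k + 0.5) / grid - 0.5) * dz
                # Rotate about the vertical axis, then translate to the box centre.
                points.append((cx + lx * cos_y - ly * sin_y,
                               cy + lx * sin_y + ly * cos_y,
                               cz + lz))
    return points

def pool_keypoint_features(grid_points, keypoints, features, radius):
    """Max-pool the features of all keypoints within `radius` of each grid point.

    Simplification: no learned layers and a single radius (both assumptions).
    Grid points with no neighbouring keypoint receive a zero vector.
    """
    dim = len(features[0])
    pooled = []
    for g in grid_points:
        best = None
        for p, f in zip(keypoints, features):
            if math.dist(g, p) <= radius:
                best = list(f) if best is None else [max(a, b) for a, b in zip(best, f)]
        pooled.append(best if best is not None else [0.0] * dim)
    return pooled

# Toy data: three keypoints with 2-D features, one far outside the proposal box.
keypoints = [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (10.0, 10.0, 10.0)]
features = [[1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]

grid = roi_grid_points(center=(0.0, 0.0, 0.0), size=(3.0, 3.0, 3.0), yaw=0.0, grid=3)
roi_features = pool_keypoint_features(grid, keypoints, features, radius=1.0)
```

Each proposal ends up with a fixed-size set of per-grid-point feature vectors regardless of how many keypoints fall inside it, which is what lets a downstream head score and refine every box with the same network.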