22 Jun 2017 | Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia
This paper presents a Multi-View 3D Object Detection Network (MV3D) for autonomous driving, which aims to achieve high-accuracy 3D object detection by integrating LIDAR point clouds and RGB images. The MV3D network consists of two subnetworks: a 3D Proposal Network and a Region-based Fusion Network. The 3D Proposal Network generates 3D candidate boxes from the bird's eye view representation of the LIDAR point cloud, while the Region-based Fusion Network combines region-wise features from multiple views, enabling interactions between the intermediate layers of the different view branches. The deep fusion approach, inspired by FractalNet and Deeply-Fused Net, fuses the multi-view features hierarchically rather than only at the input or output. The network is trained with drop-path and auxiliary losses to regularize the fusion process. Experiments on the KITTI benchmark show that MV3D outperforms state-of-the-art methods by around 25% and 30% AP in 3D localization and 3D detection, respectively, and achieves a 10.3% higher AP in 2D detection than LIDAR-based methods.
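To make the hierarchical deep fusion idea concrete, the sketch below shows one plausible reading of it: per-view region features are repeatedly joined by an element-wise mean and then transformed by view-specific layers, with random path dropping standing in for drop-path regularization. The layer type (fully connected), feature size, depth, and drop probability are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn


class DeepFusion(nn.Module):
    """Minimal sketch of hierarchical multi-view fusion (assumed details)."""

    def __init__(self, feat_dim: int = 256, num_views: int = 3,
                 depth: int = 3, drop_path_prob: float = 0.25):
        super().__init__()
        self.drop_path_prob = drop_path_prob
        # One transformation per view at each fusion stage.
        self.stages = nn.ModuleList(
            nn.ModuleList(nn.Linear(feat_dim, feat_dim) for _ in range(num_views))
            for _ in range(depth)
        )

    def forward(self, view_feats):
        # view_feats: list of per-view region features, each of shape (batch, feat_dim),
        # e.g. [bird's eye view, front view, RGB image].
        feats = list(view_feats)
        for stage in self.stages:
            # Join: element-wise mean across views. During training, randomly
            # drop some view paths as a crude stand-in for drop-path.
            if self.training and self.drop_path_prob > 0:
                kept = [f for f in feats if torch.rand(1).item() > self.drop_path_prob]
                if not kept:
                    kept = feats
                fused = torch.stack(kept, dim=0).mean(dim=0)
            else:
                fused = torch.stack(feats, dim=0).mean(dim=0)
            # Each view transforms the fused feature with its own layer.
            feats = [torch.relu(layer(fused)) for layer in stage]
        # Final joint feature, which a detection head could use for
        # classification and 3D box regression.
        return torch.stack(feats, dim=0).mean(dim=0)


if __name__ == "__main__":
    bev, fv, rgb = (torch.randn(4, 256) for _ in range(3))
    joint = DeepFusion()([bev, fv, rgb])
    print(joint.shape)  # torch.Size([4, 256])

The key design point this illustrates is that, unlike early fusion (concatenate inputs once) or late fusion (combine only the final predictions), every stage lets each view see the others' intermediate features, which is what the drop-path and auxiliary-loss regularization described above is meant to keep stable.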