2 Jul 2024 | Ziying Song, Lei Yang, Shaoqing Xu, Lin Liu, Dongyang Xu, Caiyan Jia, Feiyang Jia, and Li Wang
GraphBEV is a robust fusion framework designed to address feature misalignment in multi-modal 3D object detection. It introduces two modules, LocalAlign and GlobalAlign. LocalAlign uses graph matching to incorporate neighbor-aware depth features, reducing the local misalignment caused by inaccurate LiDAR-to-camera projection. GlobalAlign simulates offset noise to correct global misalignment between LiDAR and camera BEV features. GraphBEV achieves state-of-the-art performance, with an mAP of 70.1% on the nuScenes validation set, surpassing BEVFusion by 1.6%. Under noisy misalignment conditions, it outperforms BEVFusion by 8.3%.

The framework performs well in both clean and noisy settings, improving the robustness of BEV-based methods. GraphBEV is implemented in PyTorch and builds on open-source frameworks such as BEVFusion and OpenPCDet. It achieves significant improvements in small object detection and generalizes well to BEV map segmentation. Evaluated on the nuScenes dataset, it remains robust across conditions including weather and different ego distances. Overall, GraphBEV provides a more reliable and accurate solution for multi-modal 3D object detection in autonomous driving scenarios.
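To make the two alignment ideas concrete, here is a minimal, dependency-free sketch. It is a toy illustration, not the authors' implementation: the function names (`knn_depths`, `shift_bev`, `random_offset`), the use of plain pixel distance to pick neighbors, and the integer cell shift used as "offset noise" are all assumptions introduced here. The first function gathers the depths of the projected LiDAR points nearest to a pixel, which is the kind of neighbor-aware depth information LocalAlign is described as using. The other two translate a BEV grid by a random offset, the kind of perturbation GlobalAlign is described as simulating.

```python
import math
import random


def knn_depths(points, query, k=3):
    """Return depths of the k projected LiDAR points nearest to a pixel.

    points: list of (u, v, depth) tuples in image coordinates (hypothetical format).
    query:  (u, v) pixel whose depth neighborhood we want.
    """
    ranked = sorted(
        points,
        key=lambda p: math.hypot(p[0] - query[0], p[1] - query[1]),
    )
    return [p[2] for p in ranked[:k]]


def shift_bev(grid, dx, dy, fill=0.0):
    """Translate a 2D BEV grid by (dx, dy) cells, padding vacated cells with `fill`."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def random_offset(max_shift, rng):
    """Sample an integer (dx, dy) offset, each in [-max_shift, max_shift]."""
    return (
        rng.randint(-max_shift, max_shift),
        rng.randint(-max_shift, max_shift),
    )


# Toy usage: three nearby points and one distant outlier.
points = [(10, 10, 5.0), (11, 10, 5.2), (30, 40, 20.0), (10, 12, 5.1)]
neighbors = knn_depths(points, (10, 11), k=3)

# Shift a 2x2 "camera BEV" grid one cell to the right.
shifted = shift_bev([[1, 2], [3, 4]], dx=1, dy=0)

# Sample a random misalignment, as one might during training.
dx, dy = random_offset(2, random.Random(0))
noisy = shift_bev([[1, 2], [3, 4]], dx, dy)
```

In this toy setup the distant point at depth 20.0 is excluded from the neighborhood, and the shifted grid loses the column that moves out of bounds. A real pipeline would operate on multi-channel feature tensors and would learn to undo the offset rather than only apply it.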