DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection

2024 | Felix Fent, Andras Palffy, Holger Caesar
The paper introduces the Dual Perspective Fusion Transformer (DPFT), a novel camera-radar fusion method for 3D object detection in autonomous driving. DPFT addresses the limitations of conventional camera-radar fusion by operating on raw 4D radar cube data rather than pre-processed radar point clouds, projecting the cube onto two perspectives: the range-azimuth (RA) plane and the azimuth-elevation (AE) plane. The AE projection shares the camera's frontal viewpoint, which simplifies fusion, while the RA projection retains range information that the image lacks; together, the two views preserve more of the raw measurement and make the system more robust to severe weather conditions and sensor failures. A modified deformable attention mechanism queries features from these perspectives and combines them effectively. On the K-Radar dataset, DPFT achieves state-of-the-art performance with a mean average precision (mAP) of 56.1% at an intersection-over-union (IoU) threshold of 0.3. The model also remains robust in adverse weather and maintains low inference times, making it suitable for real-world applications. The code for DPFT is available as open-source software.
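To make the dual-perspective projection concrete, here is a minimal sketch of how a 4D radar cube could be collapsed onto the RA and AE planes. This is an illustration under assumptions, not the paper's implementation: the tensor layout (range, azimuth, elevation, Doppler), the max-power reduction, and the example dimensions are all hypothetical choices made for the sketch.

```python
import torch

def project_radar_cube(cube: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Collapse a 4D radar cube onto two 2D perspectives.

    Args:
        cube: radar power tensor with assumed layout
              (range, azimuth, elevation, doppler).

    Returns:
        ra: range-azimuth map,     shape (range, azimuth)
        ae: azimuth-elevation map, shape (azimuth, elevation)
    """
    # Collapse the Doppler axis first (assumed reduction: keep peak power).
    power = cube.amax(dim=-1)        # -> (range, azimuth, elevation)

    ra = power.amax(dim=2)           # reduce elevation -> (range, azimuth)
    ae = power.amax(dim=0)           # reduce range     -> (azimuth, elevation)
    return ra, ae

# Illustrative dimensions only; real 4D radar cubes vary by sensor.
cube = torch.rand(256, 107, 37, 64)
ra, ae = project_radar_cube(cube)    # ra: (256, 107), ae: (107, 37)
```

Each 2D map can then be processed by a standard image backbone, so the subsequent fusion step reduces to querying two radar feature maps plus the camera features with attention, instead of fusing heterogeneous 3D and 2D representations.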