7 Aug 2024 | Tong Zhao¹, Lei Yang¹, Yichen Xie², Mingyu Ding², Masayoshi Tomizuka², Yintao Wei¹
RoadBEV: Road Surface Reconstruction in Bird's Eye View
This paper proposes two models, RoadBEV-mono and RoadBEV-stereo, for reconstructing road elevation in Bird's Eye View (BEV) from monocular and stereo images, respectively. RoadBEV-mono directly fits elevation values from voxel features queried from the image view, while RoadBEV-stereo recognizes road elevation patterns from a BEV volume that represents the correlation between left and right voxel features. Validated on a real-world dataset, RoadBEV-mono and RoadBEV-stereo achieve elevation errors of 1.83 cm and 0.50 cm, respectively. The results indicate that BEV-based road surface reconstruction is more accurate and reliable than traditional methods such as monocular depth estimation and stereo matching. The code is available at https://github.com/ztsrxh/RoadBEV.
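To make the reported elevation errors concrete, the sketch below computes a mean absolute elevation error over a BEV grid. This summary does not state the exact metric behind the 1.83 cm and 0.50 cm figures, so the choice of mean absolute error, the function name `mean_abs_elevation_error`, and the use of `None` to mark grid cells without ground truth are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_abs_elevation_error(
    pred: Sequence[Sequence[float]],
    target: Sequence[Sequence[Optional[float]]],
) -> float:
    """Mean absolute difference between predicted and reference elevation.

    Both inputs are BEV grids indexed as grid[row][col], in the same unit
    (for example centimetres). Cells whose reference value is None are
    skipped (assumption: such cells have no ground-truth label).
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t is None:
                continue  # no ground truth for this cell
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("no labelled cells to evaluate")
    return total / count


# Toy 2x2 grid with one unlabelled cell.
predicted = [[1.0, 2.0], [0.0, 5.0]]
reference = [[0.0, 2.0], [None, 3.0]]
error_cm = mean_abs_elevation_error(predicted, reference)
```

In this toy grid the three labelled cells differ by 1, 0, and 2, so the error is their average.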
The paper motivates road surface condition perception for autonomous vehicles and highlights the challenges facing traditional methods such as monocular depth estimation and stereo matching. It introduces the BEV paradigm as a more efficient way to represent multimodal and multi-view data in a uniform coordinate system. It also describes how perspective view limits road surface reconstruction, for example through sparse depth information and the global geometry hierarchy, and how BEV overcomes these issues by focusing on vertical elevation.
Both RoadBEV-mono and RoadBEV-stereo use BEV features to estimate road elevation. RoadBEV-mono projects 3D voxels onto the 2D image to query pixel features, reshapes the queried features into a BEV feature, and applies 2D convolutions to it. RoadBEV-stereo builds a 4D cost volume in BEV from left and right voxel features and aggregates it with 3D convolutions. Both models are validated on a real-world dataset, where they outperform traditional methods.
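The voxel feature query can be illustrated with a minimal sketch: voxel centres stacked above one BEV cell are projected into the image with a pinhole model, the nearest pixel feature is read for each, and the per-height features are concatenated into a single vector for that cell. This is not the authors' implementation. The `Pinhole` class, the intrinsics, the nearest-neighbour lookup, the zero-filling of out-of-view voxels, and all function names are hypothetical choices made to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

FeatureMap = Sequence[Sequence[Sequence[float]]]  # feature_map[row][col] -> vector


@dataclass
class Pinhole:
    """Hypothetical pinhole intrinsics in pixels (not taken from the paper)."""
    fx: float
    fy: float
    cx: float
    cy: float


def project(cam: Pinhole, point: Tuple[float, float, float]) -> Optional[Tuple[float, float]]:
    """Project a 3D point in camera coordinates (x right, y down, z forward)."""
    x, y, z = point
    if z <= 0.0:
        return None  # behind the camera, not visible
    return (cam.fx * x / z + cam.cx, cam.fy * y / z + cam.cy)


def query_feature(feature_map: FeatureMap, uv: Optional[Tuple[float, float]]) -> Optional[Sequence[float]]:
    """Nearest-neighbour feature lookup; returns None outside the image."""
    if uv is None:
        return None
    col, row = round(uv[0]), round(uv[1])
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None


def bev_cell_feature(
    cam: Pinhole,
    feature_map: FeatureMap,
    x: float,
    z: float,
    heights: Sequence[float],
    channels: int,
) -> List[float]:
    """Concatenate the features of all voxels stacked above one BEV cell.

    Voxels that fall outside the image contribute zeros (an assumption).
    """
    cell: List[float] = []
    for y in heights:
        feat = query_feature(feature_map, project(cam, (x, y, z)))
        cell.extend(feat if feat is not None else [0.0] * channels)
    return cell


# Toy 4x4 feature map whose 2-channel feature is simply [row, col].
toy_map = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
toy_cam = Pinhole(fx=100.0, fy=100.0, cx=2.0, cy=2.0)
cell = bev_cell_feature(toy_cam, toy_map, x=0.0, z=1.0,
                        heights=[-0.01, 0.0, 0.01], channels=2)
```

Stacking the height dimension into the channel dimension is what lets a 2D convolution operate on the resulting BEV feature. A practical model would more likely use bilinear sampling on learned feature maps than a nearest-neighbour lookup.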
The paper also discusses limitations of BEV-based road surface reconstruction, such as the need for high-resolution feature maps and the difficulty of real-time processing. It suggests that further research is needed to improve performance, including more advanced strategies and the integration of texture and geometry reconstruction. It concludes that BEV-based road surface reconstruction has significant potential for improving the safety and comfort of autonomous vehicles.