2024 | Zihao Liu, Xiaoyu Zhang, Guangwei Liu, Ji Zhao, and Ningyi Xu
This paper presents MapQR, an end-to-end method for online vectorized map construction. Built on a DETR-like architecture, MapQR uses instance queries rather than point queries and introduces a scatter-and-gather query mechanism: each instance query is scattered to multiple reference points to probe bird's-eye-view (BEV) features, and the results are gathered back to enrich the information within each map instance. This design allows more accurate prediction without significantly increasing the computational burden. The method also incorporates positional embeddings to improve learning, and the BEV encoder is improved with a flexible height design. MapQR achieves the best mean average precision (mAP) among the compared state-of-the-art methods on both the nuScenes and Argoverse 2 datasets while maintaining good efficiency. Integrating the query design into other models also significantly boosts their performance, and ablation studies validate the effectiveness of the proposed design.
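The scatter-and-gather idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation; everything in it is an assumption made for illustration. The function names (`sinusoidal_embedding`, `sample_bev`, `scatter_and_gather`) are hypothetical, a nearest-neighbor lookup stands in for the attention-based probing of BEV features, and a plain mean stands in for whatever gather operator the paper actually uses.

```python
import math


def sinusoidal_embedding(x, y, dim):
    """Encode a normalized 2D reference point as a vector of length `dim`.

    Assumption: a standard sine/cosine encoding; `dim` must be divisible by 4.
    """
    half = dim // 2
    emb = []
    for coord in (x, y):
        for i in range(half // 2):
            freq = 1.0 / (10000 ** (2 * i / half))
            emb.append(math.sin(coord * freq))
            emb.append(math.cos(coord * freq))
    return emb


def sample_bev(bev, x, y):
    """Nearest-neighbor lookup of a BEV feature at normalized (x, y) in [0, 1].

    Stand-in for attention-based feature probing (an assumption of this sketch).
    """
    h, w = len(bev), len(bev[0])
    row = min(int(y * h), h - 1)
    col = min(int(x * w), w - 1)
    return bev[row][col]


def scatter_and_gather(instance_query, ref_points, bev):
    """Scatter one instance query to its reference points, probe BEV, gather back."""
    dim = len(instance_query)
    scattered = []
    for x, y in ref_points:
        # Scatter: a copy of the instance query, distinguished by the
        # positional embedding of its own reference point.
        pos = sinusoidal_embedding(x, y, dim)
        query = [q + p for q, p in zip(instance_query, pos)]
        # Probe: add the BEV feature found at that reference point.
        feat = sample_bev(bev, x, y)
        scattered.append([q + f for q, f in zip(query, feat)])
    # Gather: average the scattered queries back into one instance query
    # (a simple mean, chosen here only for brevity).
    n = len(scattered)
    return [sum(column) / n for column in zip(*scattered)]
```

Calling `scatter_and_gather(query, ref_points, bev)` with a query of length 8, a list of normalized `(x, y)` reference points, and a grid of length-8 BEV feature vectors returns a single updated query of length 8. The sketch is meant to show why the approach stays cheap: there is one query per map instance rather than one per point, and the per-point information enters only through the positional embeddings and the sampled features.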