LION: Linear Group RNN for 3D Object Detection in Point Clouds

25 Jul 2024 | Zhe Liu, Jinghua Hou, Xinyu Wang, Xiaoqing Ye, Jingdong Wang, Hengshuang Zhao, Xiang Bai
LION is a window-based framework for 3D object detection in point clouds that uses linear group RNNs to achieve efficient long-range feature interaction. To handle sparse point clouds, it adds a 3D spatial feature descriptor and a voxel generation strategy. It supports several linear RNN operators, including Mamba, RWKV, and RetNet, and achieves state-of-the-art performance on the Waymo, nuScenes, Argoverse V2, and ONCE datasets.
LION has three key components: a 3D backbone that uses linear group RNNs for long-range feature interaction, a 3D spatial feature descriptor that captures local spatial information, and a voxel generation strategy that enhances feature representation in sparse point clouds. Across these benchmarks it outperforms previous state-of-the-art methods, particularly in detection accuracy and efficiency, which indicates that it is effective and generalizes well. The method is also efficient and scalable, making it suitable for real-world applications in autonomous driving and robotics.
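The idea of window-based grouping followed by a linear recurrence can be illustrated with a small sketch. This is not the authors' implementation. The function names (`group_voxels`, `linear_scan`, `group_rnn_layer`), the window and group sizes, the sort order, and the fixed `decay` are all assumptions chosen for illustration. The recurrence is a plain exponential-decay scan standing in for operators such as Mamba, RWKV, or RetNet.

```python
def group_voxels(coords, window=(4, 4, 2), group_size=4):
    """Order voxels by (window index, position inside the window), then
    split the ordering into fixed-size groups. Hypothetical helper."""
    wx, wy, wz = window

    def key(i):
        x, y, z = coords[i]
        win = (x // wx, y // wy, z // wz)    # which window the voxel falls in
        local = (x % wx, y % wy, z % wz)     # where it sits inside that window
        return (win, local)

    order = sorted(range(len(coords)), key=key)
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]


def linear_scan(features, decay=0.5):
    """Channel-wise recurrence h_t = decay * h_{t-1} + x_t.
    Cost grows linearly with the number of voxels in the group."""
    h = [0.0] * len(features[0])
    out = []
    for x in features:
        h = [decay * hc + xc for hc, xc in zip(h, x)]
        out.append(h)
    return out


def group_rnn_layer(coords, feats, window=(4, 4, 2), group_size=4, decay=0.5):
    """Run the scan independently inside each group and scatter the
    results back to the original voxel order."""
    out = [None] * len(coords)
    for group in group_voxels(coords, window, group_size):
        scanned = linear_scan([feats[i] for i in group], decay)
        for i, h in zip(group, scanned):
            out[i] = h
    return out


# Four sparse voxels with one feature channel each (made-up values).
coords = [(0, 0, 0), (5, 0, 0), (1, 0, 0), (4, 1, 0)]
feats = [[1.0], [2.0], [4.0], [8.0]]
mixed = group_rnn_layer(coords, feats, window=(4, 4, 2), group_size=2)
```

Because each group is processed by a scan whose cost is linear in the group length, groups can be made much larger than an attention window of comparable cost, which is what allows longer-range interaction between voxels in this kind of design.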