8 Jan 2024 | Wencheng Han, Dongqian Guo, Cheng-Zhong Xu, and Jianbing Shen
This paper introduces DME-Driver, a novel autonomous driving system that integrates human decision logic with 3D scene perception. The system consists of two key components: the Decision-Maker and the Executor. The Decision-Maker, built on a Large Vision Language Model (LVLM), learns from human driver behavior and environmental perception data to make logical, human-like decisions, focusing on the key elements in a scene and providing insights that support better decision-making. The Executor, a planning-oriented perception model, translates these decisions into precise vehicle control signals. To train the system, a new dataset called HBD is developed, combining human driver behavior with environmental perception data. By leveraging the complementary strengths of LVLMs and planning-oriented perception models, DME-Driver improves both the interpretability and the accuracy of autonomous driving. Evaluated on driving logic understanding and planning accuracy, the system achieves state-of-the-art performance, attaining high planning accuracy while providing a level of transparency and explainability unprecedented in autonomous driving systems. The contributions of this work are the DME-Driver autonomous driving system, the HBD dataset, the Decision-Maker model design, and the Executor model formulation; their effectiveness is validated through empirical evaluation, showing significant improvements in decision-making robustness and interpretability.