A Survey for Foundation Models in Autonomous Driving

21 Aug 2024 | Haoxiang Guo, Zhongruo Wang, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen
This survey provides a comprehensive review of the application of foundation models in autonomous driving (AD), focusing on large language models (LLMs), vision foundation models, and multi-modal foundation models. LLMs are leveraged for planning, simulation, reasoning, and code generation, while vision foundation models enhance 3D object detection, tracking, and realistic driving scenario creation. Multi-modal foundation models integrate diverse inputs for improved visual understanding and spatial reasoning. The survey categorizes these models based on their modalities and functionalities within AD, discusses current methods, and identifies gaps and future research directions. Key challenges include hallucination, latency, and the need for domain-specific datasets. The authors propose a roadmap for advancing foundation models in AD, emphasizing the importance of domain-specific pre-training, reinforcement learning, and human-in-the-loop alignment.
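
To make the LLM-for-planning direction concrete, the minimal Python sketch below is our own illustrative example, not code from the survey or any surveyed system: it serializes a driving scene into a text prompt, queries an LLM, and maps the free-form reply onto a discrete high-level action. The names `query_llm`, `Agent`, `build_prompt`, and the action set are hypothetical stand-ins for whatever backend and scene representation a real stack would use.

```python
# Hypothetical LLM-as-planner sketch: prompt an LLM with a textual scene
# description and parse its reply into one of a fixed set of actions.
from dataclasses import dataclass
from typing import Callable, List

ACTIONS = ("KEEP_LANE", "CHANGE_LANE_LEFT", "CHANGE_LANE_RIGHT", "SLOW_DOWN", "STOP")

@dataclass
class Agent:
    kind: str          # e.g. "car", "pedestrian"
    distance_m: float  # longitudinal distance from the ego vehicle
    lane_offset: int   # lanes relative to ego (-1 left, 0 same, +1 right)

def build_prompt(speed_mps: float, agents: List[Agent]) -> str:
    """Serialize the scene into natural language so an LLM can reason over it."""
    lines = [f"Ego speed: {speed_mps:.1f} m/s."]
    for a in agents:
        lines.append(f"{a.kind} at {a.distance_m:.0f} m, lane offset {a.lane_offset}.")
    lines.append(f"Choose exactly one action from {ACTIONS} and explain briefly.")
    return "\n".join(lines)

def plan(speed_mps: float, agents: List[Agent],
         query_llm: Callable[[str], str]) -> str:
    """One planning step: prompt the LLM, then fall back safely on bad output."""
    reply = query_llm(build_prompt(speed_mps, agents))
    for action in ACTIONS:
        if action in reply.upper():
            return action
    return "SLOW_DOWN"  # conservative fallback when the reply names no valid action

if __name__ == "__main__":
    scene = [Agent("car", 25.0, 0), Agent("pedestrian", 40.0, 1)]
    fake_llm = lambda prompt: "SLOW_DOWN: a pedestrian is near the road ahead."
    print(plan(13.9, scene, fake_llm))  # -> SLOW_DOWN
```

The conservative fallback in `plan` gestures at the hallucination challenge the survey raises: a production planner cannot act on unparseable or invented model output, so replies are constrained to a closed action vocabulary and validated before use.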