LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving

24 Apr 2024 | Daocheng Fu, Wenjie Lei, Licheng Wen, Pinlong Cai, Song Mao, Min Dou, Botian Shi, Yu Qiao
LimSim++ is a closed-loop platform designed for deploying multimodal large language models ((M)LLMs) in autonomous driving. It addresses the limitations of existing simulation platforms by providing extended-duration, multi-scenario simulations that support continuous learning and improved generalization. The platform allows users to engage in prompt engineering, model evaluation, and framework enhancement, making it a versatile tool for research and practice. LimSim++ also introduces a baseline (M)LLM-driven framework, systematically validated through quantitative experiments across diverse scenarios. The platform is open source and available at https://pjlab-adg.github.io/limsim-plus/.

The platform provides a closed-loop system with road topology, dynamic traffic flow, navigation, traffic control, and other essential information. Prompts serve as the basis for the (M)LLM-supported agent system and incorporate real-time scenario information presented as images or textual descriptions. The agent system exhibits capabilities such as information processing, tool usage, strategy formulation, and self-assessment.

LimSim++ offers various types and modalities of prompt inputs to meet the needs of diverse (M)LLMs for completing driving tasks. At each decision frame, LimSim++ extracts the road network and vehicle information around the ego vehicle; this information is packaged and passed to the driver agent in natural language. The ego vehicle is equipped with six cameras configured according to the nuScenes camera setup, which capture panoramic images around the ego vehicle. The scenario description is modular, providing real-time status, navigation information, and task descriptions, so users can freely combine scenario information according to the needs of the driver agent and package it into suitable prompts.
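To make the prompt-packaging step concrete, the sketch below shows one way per-frame scenario information could be combined with navigation information and a task description into a textual prompt for the driver agent. The class and function names (ScenarioSnapshot, build_prompt, and so on) and the prompt layout are illustrative assumptions, not the actual LimSim++ interfaces.

from dataclasses import dataclass, field
from typing import List

@dataclass
class VehicleState:
    vehicle_id: str
    lane_id: str
    speed_mps: float
    gap_to_ego_m: float

@dataclass
class ScenarioSnapshot:
    ego_lane_id: str
    ego_speed_mps: float
    lane_count: int
    navigation: str                      # e.g. "keep right and take the exit in 300 m"
    neighbours: List[VehicleState] = field(default_factory=list)

def build_prompt(snapshot: ScenarioSnapshot, task: str) -> str:
    # Combine real-time status, navigation information, and the task description
    # into one textual prompt, mirroring the modular scenario description.
    vehicle_lines = "\n".join(
        f"- vehicle {v.vehicle_id} in lane {v.lane_id}: "
        f"{v.speed_mps:.1f} m/s, {v.gap_to_ego_m:.0f} m from ego"
        for v in snapshot.neighbours
    ) or "- no nearby vehicles"
    return (
        f"Task: {task}\n"
        f"Ego status: lane {snapshot.ego_lane_id}, {snapshot.ego_speed_mps:.1f} m/s, "
        f"{snapshot.lane_count} lanes on this road.\n"
        f"Navigation: {snapshot.navigation}\n"
        f"Surrounding vehicles:\n{vehicle_lines}\n"
        "Reply with one behavior: ACCELERATE, DECELERATE, KEEP, CHANGE_LANE_LEFT, CHANGE_LANE_RIGHT."
    )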
As a closed-loop simulation and evaluation platform, LimSim++ supports the inference and decision-making processes of the underlying (M)LLMs through zero-shot or few-shot driving. Zero-shot driving makes judgments directly from the obtained prompts, but hallucination issues in (M)LLMs can lead to decision failures. For non-specialized (M)LLMs, few-shot learning proves pivotal: exposure to a limited set of instances allows these models to acquire sensible solutions for driving tasks, especially when diverse scenarios necessitate distinct reactions.

LimSim++ can handle various control signals derived from the decision-making outcomes. If the driver agent provides only behavioral primitives, such as acceleration, deceleration, left turn, or right turn, LimSim++ offers control interfaces for these primitives, converting the driver agent's decisions into vehicle trajectories. LimSim++ also supports directly using trajectories output by the driver agent to control vehicle motion, albeit with increased performance requirements on the (M)LLM.

The evaluation module quantifies and assesses the vehicle behavior decisions made by the driver agent based on an analysis of the resulting vehicle trajectories. This process is an essential component of the continuous learning framework.
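The conversion from behavioral primitives to trajectories can be pictured with a simple sketch. The primitive names, the constant-acceleration longitudinal model, and the single-step lane shift below are simplifying assumptions for illustration; LimSim++'s own control interfaces are not reproduced here.

from dataclasses import dataclass
from typing import List

@dataclass
class TrajectoryPoint:
    t: float          # time offset from the decision frame [s]
    s: float          # longitudinal position along the lane [m]
    lane_offset: int  # lane change relative to the current lane (-1, 0, +1)
    speed: float      # [m/s]

ACCEL = {"ACCELERATE": 1.5, "DECELERATE": -2.0, "KEEP": 0.0,
         "CHANGE_LANE_LEFT": 0.0, "CHANGE_LANE_RIGHT": 0.0}
LANE_SHIFT = {"CHANGE_LANE_LEFT": -1, "CHANGE_LANE_RIGHT": 1}

def primitive_to_trajectory(primitive: str, speed: float,
                            horizon_s: float = 3.0, dt: float = 0.1) -> List[TrajectoryPoint]:
    a = ACCEL.get(primitive, 0.0)
    shift = LANE_SHIFT.get(primitive, 0)
    points: List[TrajectoryPoint] = []
    s, v, t = 0.0, speed, 0.0
    while t <= horizon_s:
        # Apply the lane shift in the second half of the horizon; a crude
        # placeholder for a proper lane-change curve.
        lane_offset = shift if t > horizon_s / 2 else 0
        points.append(TrajectoryPoint(t, s, lane_offset, v))
        v = max(0.0, v + a * dt)   # constant-acceleration update, clipped at standstill
        s += v * dt
        t += dt
    return points

For example, primitive_to_trajectory("DECELERATE", speed=12.0) yields a 3-second braking trajectory that a downstream tracker could follow.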
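Finally, here is a minimal sketch of how a trajectory-based evaluator could score a single decision. The choice of metrics (peak acceleration for comfort, average speed against a reference for efficiency, minimum gap to other vehicles for safety) and the equal weighting are assumptions made for illustration, not the scoring scheme defined by LimSim++.

from typing import List, Sequence

def comfort_score(accels: Sequence[float], a_max: float = 3.0) -> float:
    # Penalize the largest absolute acceleration observed along the trajectory.
    worst = max((abs(a) for a in accels), default=0.0)
    return max(0.0, 1.0 - worst / a_max)

def efficiency_score(speeds: Sequence[float], v_ref: float) -> float:
    # Compare the average speed with a reference speed (e.g. the lane speed limit).
    if not speeds or v_ref <= 0:
        return 0.0
    return min(1.0, (sum(speeds) / len(speeds)) / v_ref)

def safety_score(min_gap_m: float, safe_gap_m: float = 10.0) -> float:
    # Reward keeping at least a safe gap to surrounding vehicles.
    return min(1.0, max(0.0, min_gap_m / safe_gap_m))

def evaluate_decision(accels: List[float], speeds: List[float],
                      min_gap_m: float, v_ref: float) -> float:
    """Aggregate score for one decision frame; a low value could be fed back
    to the agent's self-assessment step in the continuous learning loop."""
    return (comfort_score(accels) + efficiency_score(speeds, v_ref)
            + safety_score(min_gap_m)) / 3.0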