ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles


22 May 2024 | Jiawei Zhang, Chejian Xu, Bo Li
ChatScene is an LLM-based agent that generates safety-critical scenarios for autonomous vehicles (AVs) by first producing textual descriptions of traffic scenarios and then converting them into executable simulations in the CARLA environment using the Scenic programming language. Starting from unstructured language instructions, the agent generates text-based scenario descriptions, decomposes them into sub-descriptions covering specific details such as vehicle behaviors and locations, and then transforms each sub-description into Scenic code that drives prediction and control in the simulator.

At the core of the agent is a knowledge retrieval component trained on a database of paired scenario descriptions and Scenic code snippets, which catalogs diverse adversarial behaviors and traffic configurations. This database lets the agent efficiently translate each sub-description into the corresponding domain-specific code snippet, substantially augmenting both the variety and the criticality of the generated driving scenarios.

Extensive experiments on Safebench, covering eight CARLA Challenge traffic scenarios and several reinforcement-learning-based ego vehicles, show that ChatScene's adversarial scenes increase the collision rate by 15% over four state-of-the-art (SOTA) baselines, elevate collision rates across all base traffic scenarios, and significantly lower the ego vehicles' overall performance scores, demonstrating the superior safety-critical generation capability of the framework. Conversely, fine-tuning different RL-based driving models on a subset of the generated adversarial scenarios, and then evaluating them against both the remaining generated scenarios and those from established baselines, reduces average collision rates by at least a further 9%, surpassing current SOTA methods. The enhanced scenario diversity further reinforces the effectiveness of the approach.

ChatScene thus bridges the gap between textual descriptions of traffic scenarios and practical CARLA simulations, providing a unified and convenient way to generate safety-critical scenarios for testing and improving AVs. Together with the retrieval database, the framework not only facilitates direct code generation but also holds potential for future multimodal conversions, including text, image, and video, for autonomous driving applications. This comprehensive performance underlines the agent's potential to set new benchmarks in the evaluation and testing of autonomous driving systems. The code is available at https://github.com/javyduck/ChatScene.
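As a rough illustration of the retrieval-and-assembly idea described above (not the authors' implementation), the sketch below keeps a tiny database of sub-description/snippet pairs and, for each sub-description of a new scenario, retrieves the closest snippet and concatenates the results into one program. A simple token-overlap score stands in for ChatScene's learned embedding model, and the snippet contents and entry names are invented placeholders.

```python
# Illustrative sketch only: a toy retrieval database mapping sub-descriptions of
# adversarial behaviors and traffic configurations to Scenic-like snippets.
# ChatScene trains an embedding-based retriever over real Scenic code; here a
# token-overlap score is a stand-in and the snippets are invented placeholders.
from dataclasses import dataclass


@dataclass
class Entry:
    description: str  # natural-language sub-description
    snippet: str      # associated Scenic code snippet (placeholder text)


# Hypothetical database entries, grouped by the kind of sub-description they cover.
DATABASE = {
    "behavior": [
        Entry("the adversarial car suddenly brakes in front of the ego vehicle",
              "behavior AdvBrake():\n    take SetBrakeAction(1.0)"),
        Entry("a pedestrian crosses the road in front of the ego vehicle",
              "behavior CrossRoad():\n    do CrossingBehavior(ego, speed=3)"),
    ],
    "geometry": [
        Entry("a straight road segment", "param road_type = 'straight'"),
        Entry("a four-way intersection", "param road_type = 'intersection'"),
    ],
}


def similarity(a: str, b: str) -> float:
    """Token-overlap score standing in for embedding cosine similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))


def retrieve(sub_description: str, kind: str) -> str:
    """Return the snippet whose description best matches the sub-description."""
    best = max(DATABASE[kind], key=lambda e: similarity(sub_description, e.description))
    return best.snippet


# Sub-descriptions an LLM might produce from an unstructured instruction.
sub_descriptions = {
    "behavior": "the car ahead of the ego suddenly brakes",
    "geometry": "a straight road",
}

# Assemble the retrieved snippets into a single Scenic-like program (placeholder layout).
program = "\n\n".join(retrieve(text, kind) for kind, text in sub_descriptions.items())
print(program)
```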
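For completeness, here is a minimal, hypothetical sketch of the final step: compiling and sampling a Scenic program through Scenic's Python API (scenarioFromString / generate). The toy program is a stand-in for an assembled ChatScene scenario; the real pipeline targets Scenic's CARLA world model and executes the sampled scene in the CARLA simulator, which this sketch does not attempt. Exact syntax and API details may vary with the installed Scenic version.

```python
# Minimal sketch (assumes the `scenic` package and Scenic 3.x syntax). The toy
# program below is a placeholder, not one of ChatScene's retrieved CARLA scenarios.
import scenic

TOY_PROGRAM = """
ego = new Object at 0 @ 0
adversary = new Object at 0 @ Range(5, 10)  # stand-in adversary placed ahead of the ego
"""

# Compile the program and rejection-sample one concrete scene from it.
scenario = scenic.scenarioFromString(TOY_PROGRAM)
scene, n_iterations = scenario.generate(maxIterations=100)

print(f"sampled a concrete scene after {n_iterations} iteration(s)")
for obj in scene.objects:
    print(type(obj).__name__, obj.position)
```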