June 03-05, 2024 | Yuan Sun, Navid Salami Pargoo, Peter J. Jin, Jorge Ortiz
This paper presents a human-centric approach to optimizing autonomous driving safety by integrating Large Language Models (LLMs) with Reinforcement Learning from Human Feedback (RLHF). The goal is to improve the safety of autonomous driving systems by incorporating human preferences into the training process. Traditional reinforcement learning (RL) often struggles with complex autonomous driving tasks, whereas RLHF has proven effective for training large language models; in driving, however, collecting direct human preference feedback frame by frame is impractical. To address this, the authors propose a framework that combines RLHF with LLMs to simulate real-world driving scenarios and optimize the autonomous driving model.
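To make the RLHF component concrete, the sketch below shows the standard preference-based reward-modeling step that this kind of framework relies on: a small network is trained with a Bradley-Terry loss so that preferred driving segments score higher than rejected ones, and the learned reward can then drive the RL loop. The model architecture, feature dimension, and names (e.g., DrivingRewardModel) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of preference-based reward modeling, the core step of RLHF.
# Architecture, feature sizes, and names are assumptions for illustration only.
import torch
import torch.nn as nn

class DrivingRewardModel(nn.Module):
    """Scores a driving trajectory segment; higher score = more preferred."""
    def __init__(self, feature_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, traj_features: torch.Tensor) -> torch.Tensor:
        # traj_features: (batch, feature_dim) summary of a trajectory segment
        return self.net(traj_features).squeeze(-1)

def preference_loss(model, preferred, rejected):
    """Bradley-Terry loss: push the preferred segment's score above the rejected one."""
    return -torch.log(torch.sigmoid(model(preferred) - model(rejected))).mean()

if __name__ == "__main__":
    model = DrivingRewardModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Toy batch of (preferred, rejected) trajectory-feature pairs
    preferred = torch.randn(16, 32)
    rejected = torch.randn(16, 32)
    opt.zero_grad()
    loss = preference_loss(model, preferred, rejected)
    loss.backward()
    opt.step()
    print(f"preference loss: {loss.item():.4f}")
```

Once trained, such a reward model stands in for per-frame human feedback: the driving policy is optimized against its scores rather than against direct human labels.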
The framework uses a multi-agent simulation environment that includes human drivers, pedestrians, and LLM agents. These agents interact to generate training data for the autonomous car model. The system collects multimodal data from various sensors, including physical and physiological signals, to train the model. The LLM agent helps interpret this data and translate it into preferences that guide the RLHF training loop. This approach allows the autonomous driving model to learn human preferences and adapt to real-world driving conditions.
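The following sketch illustrates one way the multimodal signals described above could be summarized into text and judged into a preference label that feeds the RLHF loop. The signal names, thresholds, and the rule-based stand-in for the LLM's judgment are assumptions made for demonstration; in the paper's framework the judgment would come from the LLM agent, not a fixed rule.

```python
# Illustrative sketch: turn multimodal sensor signals from a driving segment
# into a text description and a preference label for the RLHF loop.
from dataclasses import dataclass

@dataclass
class SegmentSignals:
    """Multimodal summary of one driving segment (illustrative fields)."""
    mean_heart_rate: float       # physiological signal from the human driver
    hard_brake_events: int       # physical signal from vehicle telemetry
    min_pedestrian_gap_m: float  # closest distance to a pedestrian

def describe(seg: SegmentSignals) -> str:
    """Render sensor readings as text an LLM agent could reason over."""
    return (f"mean heart rate {seg.mean_heart_rate:.0f} bpm, "
            f"{seg.hard_brake_events} hard-brake events, "
            f"closest pedestrian gap {seg.min_pedestrian_gap_m:.1f} m")

def judge_preference(seg_a: SegmentSignals, seg_b: SegmentSignals) -> str:
    """Stand-in for the LLM agent's judgment: prefer the segment with fewer
    stress cues. In the proposed framework this call would be made by the LLM."""
    def stress(seg):
        return seg.mean_heart_rate + 10 * seg.hard_brake_events - seg.min_pedestrian_gap_m
    return "A" if stress(seg_a) < stress(seg_b) else "B"

if __name__ == "__main__":
    a = SegmentSignals(mean_heart_rate=72, hard_brake_events=0, min_pedestrian_gap_m=4.2)
    b = SegmentSignals(mean_heart_rate=95, hard_brake_events=2, min_pedestrian_gap_m=1.1)
    print("Segment A:", describe(a))
    print("Segment B:", describe(b))
    print("Preferred segment:", judge_preference(a, b))  # -> "A"
```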
The research also includes an initial implementation using the GPT-4 interface to demonstrate the integration of LLMs with the car simulation system. The LLM agent can mimic human driving behavior and assist in collision avoidance. The authors plan to test their framework in real-world city test beds in New York City and New Brunswick. They aim to evaluate the robustness of their algorithm using real-life data and improve the safety and performance of autonomous driving systems.
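As a rough picture of what such a GPT-4 integration could look like, the sketch below queries the OpenAI chat completions API with a textual scene description and asks for a discrete driving action. The system prompt, scene encoding, and action vocabulary are assumptions for illustration; the paper does not specify this interface.

```python
# Minimal sketch of querying GPT-4 for a driving action from a simulated scene.
# Requires the openai Python package (>= 1.0) and OPENAI_API_KEY in the environment.
# Prompt wording, scene format, and action set are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You control a simulated car. Reply with exactly one of: "
    "KEEP_LANE, BRAKE, ACCELERATE, STEER_LEFT, STEER_RIGHT."
)

def choose_action(scene_description: str) -> str:
    """Ask GPT-4 to pick a collision-avoiding action for the described scene."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": scene_description},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    scene = ("Ego speed 35 km/h. Pedestrian crossing 12 m ahead. "
             "Vehicle in left lane 5 m behind, closing.")
    print(choose_action(scene))  # e.g. "BRAKE"
```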
The study concludes that integrating LLMs with RLHF can enhance the safety and effectiveness of autonomous driving models. Future work includes exploring different interfaces for the GPT-4 system, evaluating the method across various multimodal models, and conducting human evaluations with participants of diverse driving skill levels. The ultimate goal is to develop a safe driving model that helps autonomous vehicles navigate real-world roads and contributes to overall road safety.