Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

11 Jun 2024 | Xiaosong Jia*, Zhenjie Yang*, Qifeng Li*, Zhiyuan Zhang*, Junchi Yan†
In the era of rapidly scaling foundation models, end-to-end autonomous driving (E2E-AD) is approaching a transformative threshold. However, existing E2E-AD methods are mostly evaluated in an open-loop, log-replay manner with metrics such as L2 error and collision rate, which do not fully reflect driving performance. Closed-loop evaluations, while more realistic, often rely on fixed routes and a single driving score, which is prone to high variance and gives no detailed skill assessment.

To address these issues, Bench2Drive introduces a comprehensive, realistic, and fair closed-loop benchmark for evaluating E2E-AD systems. It provides a large-scale, fully annotated dataset of 2 million frames collected from 10,000 short clips across 44 interactive scenarios, 23 weather conditions, and 12 towns in CARLA v2. The evaluation protocol consists of 220 short routes, each focusing on a specific scenario, which allows skills to be assessed individually.

Key features of Bench2Drive include:

- **Comprehensive Scenario Coverage**: 44 interactive scenarios covering various driving conditions.
- **Granular Skill Assessment**: 220 short routes, each isolating a single skill.
- **Closed-Loop Evaluation Protocol**: The system's actions are executed in the environment and evaluated directly.
- **Diverse Large-Scale Official Training Data**: 2 million annotated frames for fair algorithm-level comparisons.

The benchmark is designed to give a detailed picture of the strengths and weaknesses of different E2E-AD methods, so that improvements can be targeted at specific skills. The authors implement several state-of-the-art E2E-AD models and evaluate them in Bench2Drive, highlighting the limitations of open-loop metrics and the importance of closed-loop evaluation.
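To make the idea of per-scenario skill assessment concrete, the following is a minimal sketch of how outcomes from short, single-scenario routes could be grouped into per-scenario success rates instead of one aggregate score. It is not the official Bench2Drive evaluation code: the function names, the `(scenario, succeeded)` record format, and the scenario labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def per_scenario_success(results):
    """Group route outcomes by scenario and return a success rate per scenario.

    `results` is an iterable of (scenario_name, succeeded) pairs, one per route.
    This record format is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for scenario, succeeded in results:
        totals[scenario] += 1
        wins[scenario] += int(succeeded)
    return {scenario: wins[scenario] / totals[scenario] for scenario in totals}


def overall_success(results):
    """Fraction of all routes completed successfully (single aggregate number)."""
    results = list(results)
    return sum(int(ok) for _, ok in results) / len(results)


# Hypothetical outcomes for four routes covering two made-up scenario labels.
demo = [
    ("Overtaking", True),
    ("Overtaking", False),
    ("EmergencyBrake", True),
    ("EmergencyBrake", True),
]

rates = per_scenario_success(demo)
overall = overall_success(demo)
print(rates)
print(overall)
```

The per-scenario breakdown shows which skill is failing, whereas the aggregate number alone would hide that the failures in this toy example are concentrated in one scenario.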
**Conclusion:** Bench2Drive is a pioneering benchmark for evaluating E2E-AD systems in a closed-loop environment, providing a comprehensive and realistic testing platform. It offers insights into the current status and future directions of E2E-AD research.