VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

20 Feb 2024 | Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang
VADv2 is an end-to-end autonomous driving model based on probabilistic planning. It addresses planning uncertainty by learning a probability distribution over actions from large-scale driving demonstrations. VADv2 takes multi-view image sequences as input, transforms the sensor data into environmental token embeddings, outputs a probability distribution over actions, and samples one action to control the vehicle. It achieves state-of-the-art closed-loop performance on the CARLA Town05 benchmark, significantly outperforming existing methods, and runs stably in a fully end-to-end manner, even without a rule-based wrapper.

Unlike previous deterministic approaches, VADv2 models the planning policy as an environment-conditioned non-stationary stochastic process, formulated as p(a|o), where o denotes the historical and current observations of the driving environment and a is a candidate planning action. The planning action space is a high-dimensional continuous spatiotemporal space, and VADv2 uses a probabilistic field function to model the mapping from the action space to the probability distribution. It discretizes the planning action space into a large planning vocabulary and uses large-scale driving demonstrations to learn the probability distribution of planning actions over that vocabulary.

VADv2 also incorporates scene tokens, which encode important scene elements into high-dimensional features; the planning tokens interact with the scene tokens to learn both dynamic and static information about the driving scene. The model is trained with three types of supervision: distribution loss, conflict loss, and scene token loss. It achieves high performance in closed-loop evaluation on the CARLA Town05 Long and Town05 Short benchmarks, demonstrating comprehensive driving ability in complex scenarios.
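The idea of a distribution p(a|o) over a discrete planning vocabulary can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the authors' implementation: the four-entry vocabulary, the waypoint format, the hard-coded scores (which in a real system would come from the network given the observation o), and the use of a softmax to normalize them.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(vocabulary, scores, rng):
    """Sample one candidate action according to the distribution p(a|o)."""
    probs = softmax(scores)
    index = rng.choices(range(len(vocabulary)), weights=probs, k=1)[0]
    return vocabulary[index], probs

# Hypothetical toy vocabulary: each action is a short list of (x, y) waypoints.
vocabulary = [
    [(0.0, 1.0), (0.0, 2.0)],    # go straight
    [(-0.5, 1.0), (-1.5, 1.8)],  # bear left
    [(0.5, 1.0), (1.5, 1.8)],    # bear right
    [(0.0, 0.2), (0.0, 0.3)],    # slow down
]

# Hypothetical scores standing in for the model output given observation o.
scores = [2.0, 0.5, 0.5, 1.0]

action, probs = sample_action(vocabulary, scores, random.Random(0))
print("probabilities:", [round(p, 3) for p in probs])
print("sampled action:", action)
```

Because the output is a full distribution rather than a single regressed trajectory, several plausible actions (for example, bearing left or right around an obstacle) can each keep non-trivial probability.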
VADv2 also performs well in open-loop evaluation, showing that the learned policy drives similarly to the expert demonstrations. Its probabilistic planning approach handles uncertainty effectively and yields more accurate and safer planning. According to the authors, VADv2 is the first work to use probabilistic modeling to fit the continuous planning action space, in contrast to previous practice, which models planning deterministically. The approach is inspired by large language models, which learn the context-conditioned probability distribution of the next word from a large-scale corpus and sample one word from that distribution. Analogously, VADv2 discretizes the action space into a planning vocabulary, approximates the probability distribution from large-scale driving demonstrations, and samples one action from the distribution at each time step to control the vehicle.
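To make the phrase "learning the distribution from demonstrations" concrete, the sketch below shows one way a demonstrated trajectory could supervise a distribution over a planning vocabulary. This summary does not define the distribution loss, so the sketch rests on assumptions: the demonstration is matched to its nearest vocabulary entry to form a one-hot target, and the loss is taken to be the KL divergence from that target to the predicted distribution. The function names, the distance measure, and the toy data are all hypothetical, and the conflict loss and scene token loss are not covered.

```python
import math

def trajectory_distance(a, b):
    """Sum of Euclidean distances between matching waypoints."""
    return sum(math.dist(p, q) for p, q in zip(a, b))

def nearest_vocabulary_index(vocabulary, demonstration):
    """Index of the vocabulary action closest to the demonstrated trajectory."""
    return min(
        range(len(vocabulary)),
        key=lambda i: trajectory_distance(vocabulary[i], demonstration),
    )

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0.0
    )

# Hypothetical toy vocabulary of (x, y) waypoint lists.
vocabulary = [
    [(0.0, 1.0), (0.0, 2.0)],    # go straight
    [(-0.5, 1.0), (-1.5, 1.8)],  # bear left
    [(0.5, 1.0), (1.5, 1.8)],    # bear right
]

# Hypothetical expert demonstration and model prediction.
demonstration = [(0.1, 1.0), (0.1, 2.1)]
predicted = [0.7, 0.2, 0.1]

# Assumption: the target is one-hot at the nearest vocabulary entry.
index = nearest_vocabulary_index(vocabulary, demonstration)
target = [1.0 if i == index else 0.0 for i in range(len(vocabulary))]

loss = kl_divergence(target, predicted)
print("matched vocabulary index:", index)
print("distribution loss (assumed KL form):", round(loss, 4))
```

With a one-hot target the KL divergence reduces to the negative log-probability of the matched entry, which is the same cross-entropy objective used for next-word prediction in language models.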