Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

14 Mar 2024 | Cheng Chi*1, Zhenjia Xu*1, Siyuan Feng2, Eric Cousineau2, Yilun Du3, Benjamin Burchfiel2, Russ Tedrake2,3, Shuran Song1,4
This paper introduces Diffusion Policy, a new approach to generating robot behavior that represents the visuomotor policy as a conditional denoising diffusion process. The method is benchmarked across 15 tasks from 4 different robot manipulation benchmarks, consistently outperforming existing state-of-the-art methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes against it at inference time via a series of stochastic Langevin dynamics steps. Key advantages include graceful handling of multimodal action distributions, suitability for high-dimensional action spaces, and stable training. The paper also presents several technical contributions, including receding-horizon control, visual conditioning, and a time-series diffusion transformer. The code, data, and training details are publicly available. The evaluation covers both simulated and real-world environments, with action spaces ranging from 2 DoF to 6 DoF, single- and multi-task benchmarks, and both fully and under-actuated systems. The results demonstrate the effectiveness of Diffusion Policy on a variety of complex real-world manipulation tasks.
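For intuition, below is a minimal sketch of the conditional denoising inference loop described above, written against a diffusers-style DDPM scheduler (which the paper's released code also uses). The network `NoisePredNet`, the feature vector `obs`, and all shapes here are illustrative placeholders standing in for the paper's noise-prediction network and visual encoder, not the authors' exact API.

```python
# Sketch: sample an action sequence by iteratively denoising Gaussian noise,
# conditioned on observation features (the core Diffusion Policy inference loop).
import torch
import torch.nn as nn
from diffusers import DDPMScheduler

class NoisePredNet(nn.Module):
    """Placeholder for the noise-prediction network eps_theta(A_k, k, O)."""
    def __init__(self, horizon=16, action_dim=7, obs_dim=64):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        self.mlp = nn.Sequential(
            nn.Linear(horizon * action_dim + obs_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, horizon * action_dim))

    def forward(self, actions, k, obs_features):
        b = actions.shape[0]
        x = torch.cat([actions.flatten(1), obs_features,
                       k.float().view(b, 1)], dim=-1)
        return self.mlp(x).view(b, self.horizon, self.action_dim)

@torch.no_grad()
def sample_actions(net, scheduler, obs_features, horizon=16, action_dim=7):
    """Denoise pure noise into an action sequence, conditioned on observations."""
    actions = torch.randn(obs_features.shape[0], horizon, action_dim)
    for k in scheduler.timesteps:  # e.g. 99, 98, ..., 0
        # Predict the noise added at step k, given the visual conditioning.
        eps = net(actions, k.expand(actions.shape[0]), obs_features)
        # One reverse-diffusion (stochastic Langevin-like) update step.
        actions = scheduler.step(eps, k, actions).prev_sample
    return actions

net = NoisePredNet()
scheduler = DDPMScheduler(num_train_timesteps=100)
scheduler.set_timesteps(100)
obs = torch.randn(1, 64)            # placeholder visual features
plan = sample_actions(net, scheduler, obs)
# Receding-horizon control: execute only the first few predicted actions,
# then re-observe and replan with fresh conditioning.
act_now = plan[:, :8]
```

The receding-horizon pattern at the end reflects the paper's design choice: the policy predicts a whole action sequence but commits to only a prefix of it before re-planning, which keeps the behavior temporally consistent while staying reactive to new observations.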