6 Mar 2024 | Xiao Ma, Sumit Patidar, Iain Haughton, Stephen James
This paper introduces Hierarchical Diffusion Policy (HDP), a hierarchical agent designed for multi-task robotic manipulation. HDP decomposes the manipulation policy into two levels: a high-level task-planning agent that predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy that generates optimal motion trajectories. The high-level agent, Perceiver-Actor (PerAct), takes 3D visual observations and language instructions as input to predict the NBP, while the low-level agent, Robot Kinematics Diffuser (RK-Diffuser), generates joint position trajectories conditioned on the predicted NBP. RK-Diffuser distills the accurate but kinematics-unaware end-effector pose trajectory into kinematics-aware joint position trajectories via differentiable robot kinematics, combining kinematics awareness with high accuracy. Empirical results show that HDP achieves significantly higher success rates than state-of-the-art methods in both simulated and real-world tasks, demonstrating its effectiveness at long-horizon task planning and fine-grained low-level control.
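To make the two-level structure concrete, below is a minimal, self-contained sketch of the hierarchy the abstract describes: a high-level planner predicts an NBP, a goal-conditioned diffusion model denoises a joint-position trajectory towards it, and a differentiable forward-kinematics term illustrates the pose-to-joint distillation idea. This is not the authors' implementation: the module names, network sizes, DDPM schedule, and the toy 2-link planar kinematics are all illustrative assumptions standing in for PerAct and RK-Diffuser.

```python
import torch
import torch.nn as nn

HORIZON, DOF, T_DIFF = 16, 2, 50  # trajectory length, joint dims, diffusion steps


class HighLevelPlanner(nn.Module):
    """Stand-in for PerAct: maps a pooled scene/language feature to an NBP (xyz + quaternion)."""
    def __init__(self, feat_dim=32, nbp_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, nbp_dim))

    def forward(self, scene_lang_feat):
        return self.net(scene_lang_feat)


class JointDiffuser(nn.Module):
    """Stand-in for RK-Diffuser: predicts noise on a joint trajectory, conditioned on the NBP."""
    def __init__(self, nbp_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HORIZON * DOF + nbp_dim + 1, 128), nn.ReLU(),
            nn.Linear(128, HORIZON * DOF))

    def forward(self, traj, nbp, t):
        x = torch.cat([traj.flatten(1), nbp, t], dim=-1)
        return self.net(x).view(-1, HORIZON, DOF)


def forward_kinematics(q, l1=0.5, l2=0.5):
    """Toy differentiable 2-link planar FK; the paper uses the full robot kinematic chain."""
    x = l1 * torch.cos(q[..., 0]) + l2 * torch.cos(q[..., 0] + q[..., 1])
    y = l1 * torch.sin(q[..., 0]) + l2 * torch.sin(q[..., 0] + q[..., 1])
    return torch.stack([x, y], dim=-1)


@torch.no_grad()
def sample_joint_trajectory(diffuser, nbp):
    """Standard DDPM reverse process over the joint-position trajectory."""
    betas = torch.linspace(1e-4, 0.02, T_DIFF)
    alphas, alpha_bars = 1 - betas, torch.cumprod(1 - betas, 0)
    traj = torch.randn(nbp.shape[0], HORIZON, DOF)
    for t in reversed(range(T_DIFF)):
        t_in = torch.full((nbp.shape[0], 1), t / T_DIFF)
        eps = diffuser(traj, nbp, t_in)
        traj = (traj - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            traj = traj + betas[t].sqrt() * torch.randn_like(traj)
    return traj


planner, diffuser = HighLevelPlanner(), JointDiffuser()
feat = torch.randn(1, 32)                        # pooled 3D-observation/language feature
nbp = planner(feat)                              # high level: next-best end-effector pose
joints = sample_joint_trajectory(diffuser, nbp)  # low level: joint trajectory towards the NBP

# Distillation idea: push FK of the joint trajectory towards the (accurate but
# kinematics-unaware) end-effector pose trajectory. In training, gradients would
# flow through the diffuser's denoised prediction rather than this sampled rollout.
pose_traj = torch.randn(1, HORIZON, 2)           # placeholder for a pose-diffuser output
distill_loss = ((forward_kinematics(joints) - pose_traj) ** 2).mean()
```

The key design choice the sketch mirrors is that the low-level model diffuses in joint space (so every denoised sample is kinematically feasible) while the supervision signal lives in end-effector space, with differentiable kinematics bridging the two.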