The paper introduces the *Hierarchical Diffuser* (HD), a framework that combines hierarchical and diffusion-based planning to address long-horizon tasks in offline reinforcement learning. HD improves the efficiency and generalization of diffusion-based planning by adopting a "jumpy" planning strategy at the higher level, which yields a larger receptive field at lower computational cost. The high-level planner generates subgoals, and the low-level planner refines these subgoals into detailed action sequences. This hierarchical design enhances the model's ability to capture optimal behavior from the offline dataset. Empirical evaluations on standard offline reinforcement learning benchmarks show that HD outperforms existing methods in training and planning speed as well as task performance, and a study of its generalization confirms its advantage on compositional out-of-distribution tasks. The theoretical analysis provides insight into the trade-off between receptive field size and kernel size, and the experimental results highlight the benefits of the hierarchical planning approach.
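
To make the two-level structure concrete, the following is a minimal sketch of the "jumpy" planning loop described above. The functions `sample_subgoals` and `refine_segment` stand in for the paper's high- and low-level diffusion samplers; here they are dummy stubs (random-walk draws and linear interpolation) so the control flow runs end to end. The names, the jump interval `jump`, and the stub behavior are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_subgoals(state, num_subgoals, dim):
    """Stub for the high-level diffusion planner: returns a coarse
    sequence of subgoal states, one per jump interval."""
    return state + np.cumsum(rng.normal(size=(num_subgoals, dim)), axis=0)

def refine_segment(start, goal, length, act_dim):
    """Stub for the low-level diffusion planner: fills in a dense
    state-action segment connecting two consecutive subgoals."""
    alphas = np.linspace(0.0, 1.0, length + 1)[1:, None]
    states = (1 - alphas) * start + alphas * goal  # interpolated states
    actions = rng.normal(size=(length, act_dim))   # placeholder actions
    return states, actions

def hierarchical_plan(state, horizon=32, jump=8, act_dim=2):
    """Plan `horizon` steps by sampling subgoals every `jump` steps
    (the jumpy high level), then refining each gap into a dense
    state-action segment (the low level)."""
    dim = state.shape[0]
    subgoals = sample_subgoals(state, horizon // jump, dim)
    anchors = np.vstack([state[None], subgoals])
    plan = [refine_segment(anchors[i], anchors[i + 1], jump, act_dim)
            for i in range(len(subgoals))]
    states = np.concatenate([s for s, _ in plan])
    actions = np.concatenate([a for _, a in plan])
    return states, actions

states, actions = hierarchical_plan(np.zeros(4))
print(states.shape, actions.shape)  # (32, 4) (32, 2)
```

Note how the high level only ever reasons over `horizon // jump` subgoals, which is the source of the larger effective receptive field at lower cost, while the low level operates on short fixed-length segments between consecutive subgoals.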