29 Nov 2021 | Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
This paper presents a model-based policy optimization (MBPO) algorithm that effectively uses predictive models to improve policy learning in reinforcement learning. The authors analyze the role of model usage in policy optimization both theoretically and empirically. They propose an algorithm that uses short model-generated rollouts branched from real data, which allows for more efficient learning compared to traditional model-based methods. The algorithm is shown to surpass the sample efficiency of prior model-based methods, match the asymptotic performance of the best model-free algorithms, and scale to long horizons that cause other model-based methods to fail.
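To make the branched-rollout idea concrete, here is a minimal, self-contained sketch of the data flow, not the authors' implementation: the toy 1-D environment, the least-squares dynamics model, and the placeholder policy are illustrative stand-ins (MBPO itself uses a probabilistic neural-network ensemble for the model and trains a SAC agent on the generated data).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D environment: s' = 0.9*s + a + noise, reward = -(s')^2.
# Hypothetical stand-in for a real continuous-control task.
def env_step(s, a):
    s_next = 0.9 * s + a + 0.01 * rng.normal()
    return s_next, -s_next ** 2

def policy(s):
    # Placeholder policy; MBPO trains a SAC agent here.
    return np.clip(-0.5 * s + 0.1 * rng.normal(), -1.0, 1.0)

# --- Collect real transitions ---------------------------------------------
real_buffer = []
s = 0.0
for _ in range(200):
    a = policy(s)
    s_next, r = env_step(s, a)
    real_buffer.append((s, a, r, s_next))
    s = s_next

# --- Fit a simple dynamics model on the real data -------------------------
# Linear least squares s' ~ w0*s + w1*a + b, standing in for the paper's
# probabilistic ensemble of neural networks.
X = np.array([[s, a, 1.0] for s, a, _, _ in real_buffer])
y = np.array([s_next for _, _, _, s_next in real_buffer])
w = np.linalg.lstsq(X, y, rcond=None)[0]

def model_step(s, a):
    s_next = w[0] * s + w[1] * a + w[2]
    return s_next, -s_next ** 2

# --- Short branched rollouts: start from REAL states, step the MODEL ------
k = 3  # rollout length; kept small to limit compounding model error
model_buffer = []
for _ in range(1000):
    s = real_buffer[rng.integers(len(real_buffer))][0]  # branch point
    for _ in range(k):
        a = policy(s)
        s_next, r = model_step(s, a)
        model_buffer.append((s, a, r, s_next))
        s = s_next

print(f"real transitions: {len(real_buffer)}, model transitions: {len(model_buffer)}")
# A model-free learner (SAC in the paper) would now be updated on model_buffer,
# optionally mixed with real data, before collecting more real transitions.
```

The key structural point the sketch illustrates is that model rollouts are short and always start from states actually visited in the real environment, so the policy sees a large volume of model-generated data without ever following the model for long enough for its errors to compound.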
The paper addresses the central trade-off of model-based reinforcement learning: model-generated data is cheap to produce but biased by model error. The authors show that an empirical estimate of the model's generalization error can be incorporated into the analysis to justify when, and how much, the model should be used. Guided by this analysis, they demonstrate that a simple procedure, generating short model rollouts branched from real states, retains the benefits of more complicated model-based algorithms without the usual pitfalls.
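Schematically, and in notation of my own choosing rather than the paper's exact theorem statement, the analysis yields a lower bound of the form

\[
\eta[\pi] \;\ge\; \eta^{\text{branch}}[\pi] \;-\; C(\epsilon_m, \epsilon_\pi, k),
\]

where \(\eta[\pi]\) is the true return, \(\eta^{\text{branch}}[\pi]\) is the return estimated under \(k\)-step branched model rollouts, \(\epsilon_m\) is the model's generalization error (estimated empirically on held-out real transitions), and \(\epsilon_\pi\) measures how far the policy has shifted since the data was collected. Because the gap \(C\) grows with \(k\) through compounding model error, keeping the bound tight favors short rollouts whenever \(\epsilon_m\) is nonzero, which is the formal justification for the branched-rollout scheme.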
The authors also show that MBPO does not suffer from the same pitfalls as prior model-based approaches, avoiding model exploitation and failure on long-horizon tasks. They empirically investigate different strategies for model usage, supporting the conclusion that careful use of short model-based rollouts provides the most benefit to a reinforcement learning algorithm.
The paper also situates MBPO within prior work on model-based reinforcement learning, including approaches based on Gaussian processes, time-varying linear dynamical systems, and neural network predictive models. Comparing against both model-based and model-free baselines on benchmark continuous-control tasks, the authors find that MBPO learns substantially faster than prior model-free methods while attaining comparable final performance. For example, MBPO's performance on the Ant task at 300 thousand steps matches that of SAC at 3 million steps. On Hopper and Walker2d, MBPO requires the equivalent of only 14 and 40 minutes, respectively, of simulation time if the simulator were running in real time. More importantly, MBPO learns on some of the higher-dimensional tasks, such as Ant, which pose problems for purely model-based approaches such as PETS.