This paper introduces a continuous deep Q-learning algorithm with model-based acceleration, aimed at improving sample efficiency in continuous control tasks. The authors propose a Q-learning variant based on normalized advantage functions (NAF), which enables effective Q-learning in continuous action spaces by decomposing the Q-function into a state value term and an advantage term; the advantage is parameterized as a quadratic function of the action, so the maximizing action can be computed analytically. This simplifies the algorithm and improves performance on simulated robotic control tasks. To further enhance efficiency, the authors explore the use of learned models to accelerate model-free reinforcement learning. They show that iteratively refitted local linear models are particularly effective for this purpose, leading to substantially faster learning in domains where such models are applicable.
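To make the decomposition concrete, the following is a minimal NumPy sketch of how a NAF-style Q-value could be assembled from the quantities a network predicts. The function and variable names are illustrative assumptions; a real implementation would compute these terms inside the Q-network itself rather than from raw arrays.

```python
import numpy as np

def naf_q_value(v, mu, l_entries, action):
    """Sketch of the NAF decomposition Q(s,a) = V(s) + A(s,a).

    v         -- scalar state value V(s) predicted by the network
    mu        -- greedy action mu(s), shape (d,)
    l_entries -- d*(d+1)/2 entries parameterizing a lower-triangular L(s)
    action    -- action a whose Q-value is evaluated, shape (d,)

    The advantage is the quadratic form
        A(s,a) = -1/2 (a - mu)^T P (a - mu),  with  P = L L^T,
    so the Q-function is always maximized at a = mu(s).
    """
    d = mu.shape[0]
    L = np.zeros((d, d))
    rows, cols = np.tril_indices(d)
    L[rows, cols] = l_entries
    # Exponentiate the diagonal so that P = L L^T is positive definite.
    L[np.diag_indices(d)] = np.exp(np.diag(L))
    P = L @ L.T
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage
```

Because the advantage is non-positive and vanishes at a = mu(s), the greedy action is available in closed form, which is what makes Q-learning tractable without a separate actor network.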
The paper also presents an approach to incorporating learned models into continuous-action Q-learning through imagination rollouts: on-policy samples generated under the learned model, in the spirit of the Dyna-Q algorithm. The authors demonstrate that this approach is highly effective when the learned dynamics model is accurate, but that its benefit degrades quickly as model errors grow. However, they show that iteratively fitting local linear models to the latest batch of on-policy or off-policy rollouts provides sufficient local accuracy to achieve substantial improvements using short imagination rollouts.
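As an illustration of the idea (not the paper's exact procedure, which refits time-varying linear-Gaussian models), the sketch below fits a single linear dynamics model to a recent batch of transitions by least squares and then generates short synthetic rollouts under the current policy. The helper names, the additive exploration noise, and the time-invariant model are all simplifying assumptions.

```python
import numpy as np

def fit_linear_model(states, actions, next_states):
    """Least-squares fit of a linear model s' ~ A s + B a + c to the
    latest batch of transitions. A simplified stand-in for the paper's
    iteratively refitted local linear models."""
    X = np.hstack([states, actions, np.ones((len(states), 1))])
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    ds, da = states.shape[1], actions.shape[1]
    A, B, c = W[:ds].T, W[ds:ds + da].T, W[-1]
    return A, B, c

def imagination_rollouts(model, policy, start_states, horizon, noise_std=0.1):
    """Generate short on-policy rollouts under the learned model
    (hypothetical helper). The synthetic transitions would be added to
    the replay buffer for extra Q-learning updates, as in Dyna-Q;
    rewards would come from a known or learned reward function
    (omitted here)."""
    A, B, c = model
    synthetic = []
    for s in start_states:
        for _ in range(horizon):
            a = policy(s) + noise_std * np.random.randn(B.shape[1])
            s_next = A @ s + B @ a + c
            synthetic.append((s, a, s_next))
            s = s_next
    return synthetic
```

Keeping the rollouts short limits how far model errors can compound, which is consistent with the paper's finding that short imagination rollouts from locally accurate models give the most reliable gains.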
The authors evaluate their method on a series of simulated robotic tasks and compare it to prior methods. They find that NAF outperforms DDPG on most tasks, particularly manipulation tasks that require precision, and that it is more sample-efficient, a property especially important for eventual application to real-world robotic learning. The paper also discusses the benefits and limitations of model-based acceleration, showing that while it can significantly improve sample efficiency, the gains do not carry over to every task. The authors conclude that combining model-based and model-free learning can lead to more efficient and effective reinforcement learning in continuous control domains.