LEARNING TO REINFORCEMENT LEARN

23 Jan 2017 | JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer, JZ Leibo, R Munos, C Blundell, D Kumaran, M Botvinick
Deep meta-reinforcement learning (meta-RL) is introduced as an approach for enabling deep reinforcement learning (RL) systems to adapt rapidly to new tasks. A recurrent neural network (RNN) is trained with one RL algorithm, while the RNN's own activation dynamics come to implement a second, distinct RL procedure. This learned procedure can differ markedly from the algorithm used to train the network weights, and it is shaped to exploit structure shared across the training task distribution.

The approach is validated in seven proof-of-concept experiments, each probing a key aspect of deep meta-RL. On bandit problems and Markov decision problems, the learned RL procedure adapts to new tasks more efficiently than standard RL methods, acquires adaptive strategies for balancing exploration and exploitation, and generalizes to tasks whose structure differs from those seen in training. The method is also shown to scale to complex navigation tasks with rich visual inputs. The paper closes by discussing potential implications of the framework for neuroscience.
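The core mechanism is that the agent's inputs at each step include its previous action and previous reward, so the recurrent state can accumulate task statistics and carry an inner learning process across steps while the weights stay fixed within an episode. Below is a minimal sketch in Python/PyTorch of this setup on two-armed Bernoulli bandits. The network sizes, the task distribution, and the plain REINFORCE objective are illustrative assumptions for the sketch, not the paper's exact configuration (the paper trains with an advantage actor-critic, A3C-style method).

```python
import torch
import torch.nn as nn


class MetaRLAgent(nn.Module):
    """Recurrent agent: the LSTM hidden state carries the learned, 'inner' RL procedure."""

    def __init__(self, n_arms=2, hidden=48):
        super().__init__()
        # Input at each step: one-hot of the previous action + previous reward.
        self.lstm = nn.LSTMCell(n_arms + 1, hidden)
        self.policy = nn.Linear(hidden, n_arms)

    def forward(self, x, state):
        h, c = self.lstm(x, state)
        return torch.distributions.Categorical(logits=self.policy(h)), (h, c)


def run_episode(agent, probs, steps=100):
    """Play one bandit episode; the hidden state resets, the weights stay fixed."""
    n_arms = len(probs)
    hid = agent.lstm.hidden_size
    state = (torch.zeros(1, hid), torch.zeros(1, hid))
    x = torch.zeros(1, n_arms + 1)  # no previous action/reward on the first step
    log_probs, rewards = [], []
    for _ in range(steps):
        dist, state = agent(x, state)
        a = dist.sample()
        r = float(torch.rand(()) < probs[a])  # Bernoulli reward for the chosen arm
        log_probs.append(dist.log_prob(a))
        rewards.append(r)
        x = torch.zeros(1, n_arms + 1)
        x[0, a] = 1.0
        x[0, -1] = r
    return torch.cat(log_probs), torch.tensor(rewards)


agent = MetaRLAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)
for episode in range(2000):
    probs = torch.rand(2)  # a fresh bandit task is drawn each episode
    logp, rew = run_episode(agent, probs)
    # Undiscounted reward-to-go, mean-subtracted as a simple baseline (REINFORCE).
    returns = torch.flip(torch.cumsum(torch.flip(rew, [0]), 0), [0])
    loss = -(logp * (returns - returns.mean())).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, the outer optimizer can be discarded: within any new bandit episode the agent adapts purely through its hidden-state dynamics, shifting from exploration toward exploiting the better arm, which is the sense in which a second RL procedure has been learned.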