NOISY NETWORKS FOR EXPLORATION


9 Jul 2019 | Meire Fortunato*, Mohammad Gheshlaghi Azar*, Bilal Piot*, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
NoisyNet is a deep reinforcement learning agent that adds parametric noise to its network weights to drive exploration. Unlike conventional heuristics such as ε-greedy or entropy regularization, the stochasticity this induces in the agent's policy is itself the source of exploration. The noise parameters are learned jointly with the other network weights by gradient descent, so the approach adds little computational overhead and is straightforward to implement. On the suite of 57 Atari games, replacing the standard exploration heuristics of several baseline agents with NoisyNet yields substantially higher scores, in many games exceeding human performance.

Concretely, each weight is perturbed by noise drawn from a noise distribution, and the scale (variance) of that noise is a learnable parameter optimized through the reinforcement learning loss. This differs from parameter-compression techniques and from Thompson sampling: no explicit distribution over weights is maintained; noise is simply injected into the parameters and its intensity is tuned automatically. The noise need not be Gaussian, and the construction is flexible enough to apply to different types of neural networks.

NoisyNet can be adapted to a range of deep reinforcement learning algorithms, including DQN, Dueling and A3C, and is particularly effective in environments where exploration is challenging, since the induced stochasticity lets the agent discover new behaviours. Across the Atari suite, the NoisyNet variants outperform their baselines in both mean and median human-normalized score, with some games improving by an order of magnitude over the vanilla agents.

The injected noise also shapes the learning process itself. The noise parameters evolve throughout training, and their evolution varies with the game and the initial conditions, suggesting that NoisyNet produces a problem-specific rather than fixed exploration strategy, a key advantage in complex environments, while remaining computationally cheap.
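The core building block is a noisy linear layer in which every weight and bias is the sum of a learnable mean and a learnable scale multiplied by a sampled noise value. Below is a minimal PyTorch sketch of such a layer using factorised Gaussian noise; the class name `NoisyLinear`, the initialisation bounds and the `sigma0` default are illustrative assumptions, not the authors' reference implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    """Linear layer whose weights and biases are perturbed by learned,
    factorised Gaussian noise: w = mu_w + sigma_w * eps_w (likewise for b)."""

    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features

        # Learnable means and noise scales.
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))

        # Noise samples are buffers: resampled, never trained.
        self.register_buffer("weight_eps", torch.zeros(out_features, in_features))
        self.register_buffer("bias_eps", torch.zeros(out_features))

        bound = 1.0 / math.sqrt(in_features)
        self.weight_mu.data.uniform_(-bound, bound)
        self.bias_mu.data.uniform_(-bound, bound)
        self.weight_sigma.data.fill_(sigma0 / math.sqrt(in_features))
        self.bias_sigma.data.fill_(sigma0 / math.sqrt(in_features))
        self.reset_noise()

    @staticmethod
    def _f(x):
        # Factorised-noise transform f(x) = sgn(x) * sqrt(|x|).
        return x.sign() * x.abs().sqrt()

    def reset_noise(self):
        # Factorised noise: p + q Gaussian samples produce a p x q noise matrix.
        eps_in = self._f(torch.randn(self.in_features))
        eps_out = self._f(torch.randn(self.out_features))
        self.weight_eps.copy_(eps_out.outer(eps_in))
        self.bias_eps.copy_(eps_out)

    def forward(self, x):
        weight = self.weight_mu + self.weight_sigma * self.weight_eps
        bias = self.bias_mu + self.bias_sigma * self.bias_eps
        return F.linear(x, weight, bias)
```

Because the noise scales sit in the same computation graph as the means, the reinforcement learning loss trains both, which is what lets the intensity of exploration adapt per weight over the course of training.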
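Since exploration comes from the noisy weights themselves, the ε-greedy step of DQN-style agents can simply be dropped: the agent acts greedily with respect to the noisy Q-network, and resampling the noise makes successive decisions differ. The sketch below illustrates this; the helper name `select_action` and the choice to resample noise before every action are assumptions for illustration, not details taken from the paper.

```python
import torch


def select_action(q_network, state):
    """Greedy action from a Q-network built with NoisyLinear layers.
    Exploration comes from resampling the layer noise, so no epsilon
    schedule is needed. `state` is assumed to be a 1-D feature tensor."""
    # Resample the factorised noise in every NoisyLinear layer.
    for module in q_network.modules():
        if isinstance(module, NoisyLinear):
            module.reset_noise()
    with torch.no_grad():
        q_values = q_network(state.unsqueeze(0))  # add a batch dimension
    return int(q_values.argmax(dim=1).item())
```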