Dueling Network Architectures for Deep Reinforcement Learning

5 Apr 2016 | Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
The paper introduces a new neural network architecture, the dueling network, for model-free reinforcement learning. The architecture separates the representation of state values from state-dependent action advantages, which enables better policy evaluation, especially when many actions have similar values. The dueling network consists of two streams that estimate the value and advantage functions, respectively, while sharing a common convolutional feature-learning module. The streams are combined by a special aggregating layer to produce the state-action value function $Q$. The authors show that the dueling network outperforms conventional single-stream Q-networks on policy evaluation tasks and substantially improves performance on the Atari 2600 domain, achieving human-level performance on 42 out of 57 games. The dueling architecture is also shown to be complementary to other algorithmic improvements, such as prioritized experience replay, which further enhances its effectiveness.
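A minimal sketch of the aggregating layer is shown below, assuming a PyTorch-style module. The class name `DuelingHead`, the `feature_dim` and `hidden` sizes, and the two-layer streams are illustrative choices, not taken from the paper; the combination rule follows the paper's mean-subtracted aggregation, $Q(s,a) = V(s) + \big(A(s,a) - \tfrac{1}{|\mathcal{A}|}\sum_{a'} A(s,a')\big)$.

```python
import torch
import torch.nn as nn


class DuelingHead(nn.Module):
    """Dueling head: shared features split into value and advantage streams."""

    def __init__(self, feature_dim: int, num_actions: int, hidden: int = 512):
        super().__init__()
        # Value stream: estimates V(s), a single scalar per state.
        self.value_stream = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Advantage stream: estimates A(s, a), one value per action.
        self.advantage_stream = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_actions)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v = self.value_stream(features)      # shape: (batch, 1)
        a = self.advantage_stream(features)  # shape: (batch, num_actions)
        # Aggregate: Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a')).
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```

Subtracting the mean advantage (rather than the maximum) is the variant the paper reports as more stable to optimize, at the cost of the value and advantage streams no longer carrying their exact original semantics.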