Wide Residual Networks

23 May 2016 | Sergey Zagoruyko and Nikos Komodakis
This paper explores the architecture of ResNet blocks and proposes a novel architecture called Wide Residual Networks (WRNs) to address the challenges of training very deep residual networks. The authors conduct a detailed experimental study on the structure of ResNet blocks, focusing on the width and depth of these blocks. They find that increasing the width of residual networks, rather than their depth, significantly improves performance and efficiency. Specifically, they demonstrate that a simple 16-layer-deep WRN outperforms all previous deep residual networks, including thousand-layer-deep networks, in both accuracy and efficiency, achieving state-of-the-art results on datasets such as CIFAR-10, CIFAR-100, and SVHN. The paper also introduces a new way of using dropout within deep residual networks to regularize training and prevent overfitting. The authors conclude that the main power of residual networks lies in the residual blocks, and that the effect of depth is supplementary. Their findings suggest that widening residual networks is a more effective approach to improving performance than increasing depth.
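To make the widening and in-block dropout ideas concrete, below is a minimal PyTorch-style sketch of a wide residual block, assuming the pre-activation layout (BN-ReLU-conv, dropout between the two convolutions) described in the paper. The class name `WideBasicBlock`, the widening factor `k`, and the specific hyperparameter values are illustrative choices, not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WideBasicBlock(nn.Module):
    """Pre-activation residual block: BN-ReLU-conv3x3-dropout-BN-ReLU-conv3x3.

    Widening simply multiplies the number of channels at each stage by a
    factor k; dropout sits between the two convolutions inside the block.
    """
    def __init__(self, in_planes, planes, stride=1, dropout_rate=0.3):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.dropout = nn.Dropout(p=dropout_rate)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=1, padding=1, bias=False)
        # 1x1 projection on the shortcut when the identity path changes shape.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Conv2d(in_planes, planes, kernel_size=1,
                                      stride=stride, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.dropout(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + self.shortcut(x)

# A widening factor k > 1 turns a narrow 16-channel block into a wide one,
# e.g. k = 8 corresponds to the kind of WRN-16-8 configuration discussed
# in the paper (hypothetical example values).
k = 8
block = WideBasicBlock(in_planes=16, planes=16 * k, dropout_rate=0.3)
y = block(torch.randn(1, 16, 32, 32))   # CIFAR-sized input
```

Stacking a few such blocks per resolution stage, with k controlling the width, is what lets a comparatively shallow 16-layer network match or exceed much deeper, narrower ResNets.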