Identity Mappings in Deep Residual Networks


25 Jul 2016 | Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun
This paper by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun from Microsoft Research explores the propagation formulations behind the residual building blocks of deep residual networks (ResNets). The authors show that when both the skip connection and the after-addition activation are identity mappings, signals can propagate directly between any two blocks in the network, in both the forward and backward passes. A series of ablation experiments supports the importance of these identity mappings, showing that they make training easier and improve generalization. The paper introduces a new residual unit design that uses an identity mapping for the skip connection and pre-activation in place of the after-addition activation, which leads to improved results on CIFAR-10, CIFAR-100, and ImageNet. The proposed ResNet-1001 achieves a 4.62% error rate on CIFAR-10, and a 200-layer ResNet outperforms the original ResNet on ImageNet. The code for the experiments is available at <https://github.com/KaimingHe/resnet-1k-layers>.
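To make the pre-activation residual unit concrete, here is a minimal sketch written in PyTorch (an assumption for illustration; the released code linked above may use a different framework). The class name PreActBlock, the fixed channel count, and the 3x3 convolutions are illustrative choices rather than the paper's exact implementation; the point is the BN-ReLU-conv ordering and the untouched identity skip connection.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreActBlock(nn.Module):
    """Pre-activation residual unit: BN and ReLU come before each convolution,
    the skip connection is a pure identity, and nothing follows the addition."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        # Residual branch: BN -> ReLU -> conv -> BN -> ReLU -> conv
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        # Identity skip connection: x is added back unchanged and no activation
        # is applied after the addition, so x_{l+1} = x_l + F(x_l).
        return x + out

# Usage sketch: stacking such blocks means the output of a deeper block equals
# the input of an earlier block plus a sum of residual terms, which is what
# lets signals propagate directly in both the forward and backward passes.
block = PreActBlock(channels=16)
y = block(torch.randn(1, 16, 32, 32))
```

Because the addition is left untouched, gradients flowing back through the skip connection reach earlier blocks unchanged, which is the property the paper's ablation experiments isolate.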