VOL. 43, NO. 2, FEB. 2021 | Shang-Hua Gao*, Ming-Ming Cheng*, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, and Philip Torr
The paper introduces Res2Net, a novel building block for convolutional neural networks (CNNs) that enhances multi-scale feature representation at a granular level. Unlike existing methods that improve multi-scale representation at the layer-wise level, Res2Net increases the range of receptive fields for each network layer by constructing hierarchical residual-like connections within a single residual block. The proposed block can be integrated into state-of-the-art backbone CNN models such as ResNet, ResNeXt, and DLA. Experimental results on datasets like CIFAR-100 and ImageNet demonstrate consistent performance gains over baseline models. Ablation studies and evaluations on tasks such as object detection, class activation mapping, and salient object detection further validate the superiority of Res2Net over state-of-the-art methods. The source code and trained models are available online.
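The hierarchical residual-like connections can be illustrated with a minimal toy sketch. Following the paper's formulation, the feature map is split channel-wise into `scale` groups: the first group passes through unchanged, the second goes through a convolution, and each later group is added to the previous group's output before its own convolution, so later groups accumulate progressively larger receptive fields. The sketch below is a hypothetical NumPy illustration, not the authors' implementation; `toy_conv` stands in for a learned 3×3 convolution.

```python
import numpy as np

def toy_conv(x):
    # Stand-in for a learned 3x3 convolution (hypothetical: a fixed
    # elementwise scaling, used only to make the data flow visible).
    return 0.5 * x

def res2net_hierarchy(x, scale=4):
    """Hierarchical residual-like connections within one block.

    x: feature map of shape (channels, H, W); channels must be
    divisible by `scale`. Returns the concatenated multi-scale output,
    same shape as the input.
    """
    xs = np.split(x, scale, axis=0)    # split channels into `scale` groups
    ys = [xs[0]]                       # group 1: identity path, no conv
    prev = toy_conv(xs[1])             # group 2: conv only
    ys.append(prev)
    for xi in xs[2:]:                  # groups 3..s: conv(x_i + y_{i-1})
        prev = toy_conv(xi + prev)
        ys.append(prev)
    return np.concatenate(ys, axis=0)  # fuse groups back together
```

Because each group reuses the previous group's output, the i-th group effectively sees the receptive field of i stacked convolutions, which is what gives a single Res2Net block its multi-scale behavior.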