ResNeSt: Split-Attention Networks


30 Dec 2020 | Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola
ResNeSt is a novel architecture that integrates channel-wise attention with multi-path representation into a unified Split-Attention block. This design simplifies computation and improves performance in image classification, object detection, instance segmentation, and semantic segmentation. The model outperforms EfficientNet in the accuracy-latency trade-off on image classification tasks, has achieved superior transfer learning results on several benchmarks, and has been adopted by winning entries in the COCO-LVIS challenge.

The architecture uses a modularized design with a Split-Attention block that enables feature-map attention across different groups. The block is parameterized with few variables and can be efficiently implemented using standard CNN operators. ResNeSt has been tested on ImageNet, achieving better speed-accuracy trade-offs than state-of-the-art CNN models. It also performs well in object detection, instance segmentation, and semantic segmentation, achieving high accuracy on benchmarks such as MS-COCO and ADE20K. The model is efficient, with 32% lower latency than EfficientNet-B7 while maintaining better accuracy. ResNeSt is easy to implement and transfers well across different vision tasks. It has been adopted by multiple winning entries in the 2020 COCO-LVIS and 2020 DAVIS-VOS challenges.
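To make the Split-Attention idea concrete, here is a minimal NumPy sketch of how the block aggregates its radix splits: the splits are fused, globally pooled, passed through two dense layers, and recombined with r-softmax channel attention. This is a simplification for illustration (a single cardinal group, no convolutions or batch norm, and the weight shapes `w1`/`w2` are assumptions, not the paper's exact parameterization):

```python
import numpy as np

def r_softmax(logits, radix):
    # Softmax across the radix dimension; the paper's r-softmax
    # falls back to a sigmoid when radix == 1.
    if radix == 1:
        return 1.0 / (1.0 + np.exp(-logits))
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def split_attention(splits, w1, w2):
    """Aggregate `radix` feature-map splits with channel-wise attention.

    splits: (radix, channels, height, width) feature maps
    w1:     (channels, hidden) weights of the first dense layer
    w2:     (hidden, radix * channels) weights of the second dense layer
    """
    radix, channels, h, w = splits.shape
    # 1. Fuse the splits and squeeze spatial dims (global average pooling).
    gap = splits.sum(axis=0).mean(axis=(1, 2))            # (channels,)
    # 2. Two dense layers produce per-split, per-channel logits.
    hidden = np.maximum(gap @ w1, 0.0)                    # ReLU
    logits = (hidden @ w2).reshape(radix, channels)       # (radix, channels)
    # 3. r-softmax: attention weights sum to 1 across splits per channel.
    attn = r_softmax(logits, radix)                       # (radix, channels)
    # 4. Weighted combination of the splits.
    return (attn[:, :, None, None] * splits).sum(axis=0)  # (channels, h, w)
```

Because the attention weights sum to one across the radix dimension for each channel, the output is always a convex combination of the splits, which is what lets the block be dropped into a ResNet bottleneck without changing tensor shapes.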