CBAM: Convolutional Block Attention Module

18 Jul 2018 | Sanghyun Woo*1, Jongchan Park*†2, Joon-Young Lee3, and In So Kweon1
The Convolutional Block Attention Module (CBAM) is a simple yet effective attention module for feed-forward convolutional neural networks (CNNs). Given an intermediate feature map, CBAM sequentially infers attention maps along two separate dimensions, channel and spatial, and multiplies these attention maps with the input feature map to adaptively refine features. Because CBAM is lightweight and general, it can be integrated into any CNN architecture with negligible overhead and trained end to end together with the base network. Extensive experiments on the ImageNet-1K, MS COCO detection, and VOC 2007 detection datasets show consistent improvements in classification and detection performance across various models, demonstrating the wide applicability of CBAM.

CBAM consists of two sequential sub-modules: channel attention and spatial attention. The channel attention module exploits inter-channel relationships of features, while the spatial attention module focuses on inter-spatial relationships. Both modules use pooling operations to generate compact feature descriptors. In the channel attention module, average-pooled and max-pooled features are each passed through a shared network and combined; using both pooling operations improves performance over either alone. In the spatial attention module, pooling is applied along the channel axis and a convolution over the pooled descriptors generates a spatial attention map.

Experiments show that CBAM improves performance on various benchmarks, including ImageNet-1K, MS COCO, and VOC 2007. The module is effective in both classification and detection tasks, and it outperforms existing methods such as Squeeze-and-Excitation (SE) in accuracy. Grad-CAM visualizations confirm that CBAM-enhanced networks focus on target objects more effectively. With minimal parameter and computational overhead, CBAM is also suitable for deployment on low-end devices.
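The two sub-modules described above can be sketched in plain NumPy. This is an illustrative sketch, not the authors' implementation: the MLP weights `W0`/`W1`, the reduction ratio, and the fixed averaging kernel in the spatial branch are assumptions made here for simplicity (the paper uses a learned 7×7 convolution and trains the shared MLP end to end).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Channel attention: avg- and max-pooled descriptors through a shared
    two-layer MLP, summed, then sigmoid. F: (C, H, W); W0: (C//r, C); W1: (C, C//r).
    W0/W1 are illustrative stand-ins for learned weights."""
    avg = F.mean(axis=(1, 2))                        # (C,) average-pooled descriptor
    mx = F.max(axis=(1, 2))                          # (C,) max-pooled descriptor
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)     # shared MLP, ReLU hidden layer
    Mc = sigmoid(mlp(avg) + mlp(mx))                 # (C,) channel attention map
    return F * Mc[:, None, None]                     # rescale each channel

def spatial_attention(F, k=7):
    """Spatial attention: pool along the channel axis, then a k x k convolution.
    The conv kernel here is a fixed averaging filter purely for illustration;
    in the paper it is a learned 7x7 filter."""
    avg = F.mean(axis=0)                             # (H, W) channel-wise avg pool
    mx = F.max(axis=0)                               # (H, W) channel-wise max pool
    desc = np.stack([avg, mx])                       # (2, H, W) concatenated maps
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    H, W = avg.shape
    conv = np.zeros((H, W))
    for i in range(H):                               # naive "same"-padded convolution
        for j in range(W):
            conv[i, j] = padded[:, i:i + k, j:j + k].mean()
    Ms = sigmoid(conv)                               # (H, W) spatial attention map
    return F * Ms[None, :, :]                        # rescale each spatial location

def cbam(F, W0, W1):
    """CBAM applies channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(F, W0, W1))
```

Because both attention maps lie in (0, 1), the output is an element-wise attenuation of the input feature map: `cbam(F, W0, W1)` has the same shape as `F` and nowhere exceeds it in magnitude.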
The main contributions of this work are threefold: proposing a simple yet effective attention module (CBAM) that can be widely applied to boost the representation power of CNNs, validating the effectiveness of the module through extensive ablation studies, and verifying that the performance of various networks is greatly improved on multiple benchmarks by plugging in the lightweight module. The code and models are publicly available.