This paper proposes a novel attention mechanism called coordinate attention for efficient mobile network design. Unlike traditional channel attention mechanisms that use 2D global pooling to generate a single feature vector, coordinate attention factorizes the attention process into two 1D feature encoding steps that aggregate features along the horizontal and vertical directions. This allows the model to capture long-range dependencies in one spatial direction while preserving precise positional information in the other. The resulting attention maps are then applied to the input feature map to enhance the representation of the objects of interest.
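As a rough illustration of the factorized pooling step described above (function and variable names are illustrative, not taken from the paper's code), the two 1D averages over a (C, H, W) feature map could be sketched as:

```python
import numpy as np

def coordinate_pooling(x):
    """Factorize 2D global pooling into two 1D averages.

    x: feature map of shape (C, H, W).
    Returns z_h of shape (C, H), pooled along the horizontal direction,
    and z_w of shape (C, W), pooled along the vertical direction.
    """
    z_h = x.mean(axis=2)  # average each row: keeps vertical position
    z_w = x.mean(axis=1)  # average each column: keeps horizontal position
    return z_h, z_w

x = np.ones((2, 3, 5))        # toy feature map with 2 channels
z_h, z_w = coordinate_pooling(x)
```

Each 1D vector summarizes long-range context along one spatial axis while retaining position indices along the other, which is the property the paper contrasts with the single vector produced by 2D global pooling.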
The coordinate attention mechanism is simple and can be easily integrated into classic mobile networks such as MobileNetV2, MobileNeXt, and EfficientNet with minimal computational overhead. Extensive experiments show that coordinate attention not only improves ImageNet classification performance but also performs better in downstream tasks such as object detection and semantic segmentation compared to existing attention mechanisms like SE block and CBAM.
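A minimal sketch of how such a block could reweight features, assuming simple per-channel stand-in transforms (`w_h`, `w_w`) in place of the paper's shared 1x1 convolutions, and omitting the concatenation and channel-reduction details of the full module:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def coordinate_attention(x, w_h, w_w):
    # x: (C, H, W) feature map; w_h, w_w: (C, C) illustrative transforms.
    z_h = x.mean(axis=2)           # (C, H): pooled along the width
    z_w = x.mean(axis=1)           # (C, W): pooled along the height
    g_h = sigmoid(w_h @ z_h)       # per-(channel, row) attention weights
    g_w = sigmoid(w_w @ z_w)       # per-(channel, column) attention weights
    # Reweight each position by its row and column attention values.
    return x * g_h[:, :, None] * g_w[:, None, :]

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 5))
y = coordinate_attention(x, np.eye(4), np.eye(4))
```

Because the attention maps are multiplied back onto the input, the block preserves the feature map's shape, which is why it can be dropped into existing mobile blocks without changing the surrounding architecture.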
The proposed method captures both cross-channel and direction-aware information, which helps models locate and recognize objects more accurately. It is flexible and lightweight, and can be easily plugged into various mobile network components. When used in a pretrained backbone, coordinate attention can significantly improve the performance of downstream tasks, especially dense-prediction tasks such as semantic segmentation.
The experiments demonstrate that coordinate attention achieves a 0.8% improvement in ImageNet classification accuracy with comparable parameters and computation. In object detection and semantic segmentation, it outperforms alternatives such as the SE block and CBAM. The method is also robust to different reduction ratios and remains effective in stronger mobile networks such as EfficientNet.
The coordinate attention mechanism is applied to both object detection and semantic segmentation tasks, showing its transferability across different vision tasks. In object detection, it improves detection results on COCO and Pascal VOC datasets. In semantic segmentation, it achieves better performance on Pascal VOC 2012 and Cityscapes datasets compared to other attention mechanisms. The results show that coordinate attention is particularly effective in tasks requiring precise spatial information, such as semantic segmentation.