ELA: Efficient Local Attention for Deep Convolutional Neural Networks

2 Mar 2024 | Wei Xu and Yi Wan
The paper introduces Efficient Local Attention (ELA), a method that improves the performance of deep convolutional neural networks (CNNs) by localizing regions of interest without reducing channel dimensions or increasing complexity. ELA addresses limitations of existing attention mechanisms such as Coordinate Attention (CA): poor generalization, the adverse effects of dimension reduction, and a complex attention-generation process. ELA uses 1D convolution and Group Normalization to encode two 1D positional feature maps, which enables precise localization while keeping the implementation lightweight. The method has three hyperparameters, which yield four versions (ELA-T, ELA-B, ELA-S, and ELA-L) suited to different visual tasks. Extensive evaluations on ImageNet, MSCOCO, and Pascal VOC show that ELA outperforms state-of-the-art attention methods in image classification, object detection, and semantic segmentation, with fewer parameters and lower computational complexity.
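The mechanism summarized above (two 1D positional feature maps, each passed through a 1D convolution and Group Normalization) can be illustrated with a minimal NumPy forward-pass sketch. This is not the authors' implementation. The function names, the depthwise form of the convolution, the sharing of one kernel set across both directions, the sigmoid gating, and the omission of learnable affine parameters in the normalization are all assumptions made for illustration; the summary itself does not specify them.

```python
import numpy as np


def depthwise_conv1d(z, kernels):
    """Depthwise 1D cross-correlation with 'same' zero padding.

    z: array of shape (B, C, L); kernels: array of shape (C, K), K odd.
    Each channel is filtered by its own kernel (assumed, not from the summary).
    """
    _, _, length = z.shape
    k_size = kernels.shape[1]
    pad = k_size // 2
    zp = np.pad(z, ((0, 0), (0, 0), (pad, pad)))
    out = np.zeros_like(z)
    for k in range(k_size):
        # Shifted window times the k-th tap of every channel's kernel.
        out += zp[:, :, k:k + length] * kernels[None, :, k, None]
    return out


def group_norm(z, num_groups, eps=1e-5):
    """Group Normalization over (channels-in-group, length), without affine terms."""
    b, c, length = z.shape
    g = z.reshape(b, num_groups, (c // num_groups) * length)
    mean = g.mean(axis=-1, keepdims=True)
    var = g.var(axis=-1, keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(b, c, length)


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def ela_forward(x, kernels, num_groups):
    """Hypothetical ELA-style forward pass for x of shape (B, C, H, W)."""
    # Two 1D positional feature maps: average over width and over height.
    z_h = x.mean(axis=3)  # (B, C, H)
    z_w = x.mean(axis=2)  # (B, C, W)
    # 1D convolution, Group Normalization, then a sigmoid gate in (0, 1).
    a_h = sigmoid(group_norm(depthwise_conv1d(z_h, kernels), num_groups))
    a_w = sigmoid(group_norm(depthwise_conv1d(z_w, kernels), num_groups))
    # Reweight the input; the channel dimension is never reduced.
    return x * a_h[:, :, :, None] * a_w[:, :, None, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 8, 5, 6))   # batch of 2, 8 channels, 5x6 map
    kernels = rng.standard_normal((8, 7))   # assumed kernel size of 7
    y = ela_forward(x, kernels, num_groups=4)
    print(y.shape)
```

Because both attention maps lie strictly between 0 and 1, the output has the same shape as the input and never exceeds it in magnitude. The kernel size and group count used here are placeholder values, not the settings of any of the four ELA versions.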