September 2022 | Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, Shi-Min Hu
This paper provides a comprehensive review of attention mechanisms in computer vision, grouping them into four main categories: channel attention, spatial attention, temporal attention, and branch attention. The authors further identify two hybrid categories: channel & spatial attention and spatial & temporal attention. The paper traces the development of attention mechanisms from early methods such as RAM and STN to recent self-attention models such as Vision Transformers. Key contributions include:
1. **Systematic Review**: A detailed overview of visual attention methods, including their unified description, development, and current research.
2. **Categorization**: Grouping attention methods based on their data domain to facilitate independent evaluation.
3. **Future Directions**: Suggestions for future research in visual attention.
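To make the self-attention family mentioned above concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation in Vision Transformers. This is an illustration, not code from the paper; the token count, feature dimension, and projection weights are all arbitrary placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence of tokens.

    x: (N, D) array of N tokens (e.g. image patches), D features each
    wq, wk, wv: (D, D) query/key/value projection matrices
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (N, N) pairwise similarities
    attn = softmax(scores, axis=-1)          # each row sums to 1
    return attn @ v                          # each output token is a weighted mix of values

# Illustrative usage: 4 patch tokens with 8-dimensional features
rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8))
wq = rng.standard_normal((8, 8)) * 0.1
wk = rng.standard_normal((8, 8)) * 0.1
wv = rng.standard_normal((8, 8)) * 0.1
out = self_attention(x, wq, wk, wv)
print(out.shape)  # (4, 8)
```

Because every token attends to every other token, this operation captures long-range dependencies across the whole image, which is the property the survey credits for the success of Vision Transformers.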
The paper highlights the importance of attention mechanisms in various visual tasks, such as image classification, object detection, semantic segmentation, and video understanding. It also discusses the evolution of attention mechanisms, from RNN-based approaches to self-attention models, and the challenges and advancements in each phase. The authors provide a detailed analysis of representative works in each category, including their formulations, key contributions, and applications. Additionally, the paper compares this survey with existing surveys on attention methods and visual transformers, emphasizing its focus on a broader range of attention mechanisms and their data domains.
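As a concrete example of the channel attention category analyzed in the survey, the following NumPy sketch follows the squeeze-excite-scale pattern popularized by SENet (a representative channel attention work). It is an illustration under assumed shapes and weights, not the paper's own code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_channel_attention(x, w1, w2):
    """SE-style channel attention on a single feature map.

    x:  feature map of shape (C, H, W)
    w1: (C // r, C) bottleneck reduction weights (r = reduction ratio)
    w2: (C, C // r) expansion weights
    """
    # Squeeze: global average pooling over spatial dimensions -> (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck MLP producing per-channel weights in (0, 1)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))
    # Scale: reweight each channel of the input feature map
    return x * s[:, None, None]

# Illustrative usage: 8 channels, reduction ratio r = 4
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
out = se_channel_attention(x, w1, w2)
print(out.shape)  # (8, 16, 16)
```

The bottleneck (reduce to C/r, then expand back to C) keeps the attention branch cheap relative to the backbone, which is why this design recurs throughout the channel attention methods the survey covers.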