A Survey on Visual Mamba

26 Apr 2024 | Hanwei Zhang, Ying Zhu, Dan Wang, Lijun Zhang, Tianxiang Chen, and Zi Ye
This paper provides a comprehensive survey of Mamba models in computer vision. Mamba, a state space model (SSM) with selection mechanisms and hardware-aware architectures, has shown significant promise in long-sequence modeling. The paper begins with the foundational concepts of Mamba: its state space model framework, selection mechanisms, and hardware-aware design. It then reviews representative vision Mamba models, dividing them into foundational models and models enhanced with techniques such as convolution, recurrence, and attention. The paper further covers the applications of Mamba in vision tasks, including general visual tasks, medical visual tasks (e.g., 2D/3D segmentation, classification, and image registration), and remote sensing visual tasks. The survey is organized into sections on the formulation of Mamba, its integration with other architectures, and its applications in different visual tasks. It aims to serve as a guide for researchers who want a deeper understanding of Mamba models in computer vision.