18 Mar 2024 | Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, and Wenming Yang
VmambaIR is an image restoration model built on state space models (SSMs), whose linear complexity makes them attractive for restoration tasks. Its core component is the Omni Selective Scan (OSS) block, which pairs an OSS module with an Efficient Feed-Forward Network (EFFN) to model image information flows in all six directions. This omni selective scan mechanism overcomes the unidirectional modeling limitation of SSMs and allows patterns in image data to be modeled comprehensively. The OSS blocks are arranged in a multi-scale UNet architecture so that the model captures image features at several scales. VmambaIR is evaluated on image deraining, single image super-resolution, and real-world image super-resolution. The reported results show state-of-the-art performance with significantly fewer parameters and computational resources than existing methods; in real-world super-resolution in particular, it achieves higher reconstruction accuracy with only 26% of the computational cost. The stated contributions are a comprehensive SSM-based image restoration model, the OSS block, and the omni selective scan mechanism. Experiments across these tasks position VmambaIR as a promising alternative to transformer and CNN architectures for image restoration.
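To make the idea of scanning in six directions concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes the six directions are forward and backward passes along the channel, height, and width axes of a feature map, and it replaces the selective (input-dependent) SSM with a fixed first-order recurrence `h_t = decay * h_{t-1} + x_t`. The names `linear_scan`, `six_direction_scan`, and `decay` are hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_scan(x, axis, decay=0.9, reverse=False):
    """Run h_t = decay * h_{t-1} + x_t along one axis of x.

    Toy stand-in for an SSM scan: the decay is a fixed scalar here,
    whereas a selective SSM derives its parameters from the input.
    Each element is visited once, so the cost is linear in x.size.
    """
    x = np.moveaxis(x, axis, 0)      # bring the scan axis to the front
    if reverse:
        x = x[::-1]                  # scan from the far end instead
    h = np.zeros_like(x, dtype=float)
    state = np.zeros(x.shape[1:], dtype=float)
    for t in range(x.shape[0]):
        state = decay * state + x[t]
        h[t] = state
    if reverse:
        h = h[::-1]                  # restore the original ordering
    return np.moveaxis(h, 0, axis)


def six_direction_scan(feat, decay=0.9):
    """Average forward and backward scans along the C, H and W axes.

    Assumption: averaging is used as the simplest way to merge the six
    directional outputs; the source does not specify the merge rule.
    """
    outs = [
        linear_scan(feat, axis, decay, reverse)
        for axis in (0, 1, 2)
        for reverse in (False, True)
    ]
    return np.mean(outs, axis=0)


# A single impulse in a (C, H, W) = (2, 3, 3) feature map.
feat = np.zeros((2, 3, 3))
feat[0, 1, 1] = 1.0
out = six_direction_scan(feat)
```

With a single forward scan, the impulse would influence only positions that come after it in that one ordering. Combining both directions along every axis lets it reach its neighbors on each side in height, width, and channel, which is the limitation of unidirectional scanning that the summary describes.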