FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba

20 Apr 2024 | Xinyu Xie, Yawen Cui, Chio-In IEONG, Tao Tan, Xiaozhi Zhang, Xubin Zheng, and Zitong Yu
FusionMamba is a novel dynamic feature enhancement method for multimodal image fusion, integrating the Mamba framework with efficient visual state space models, dynamic convolution, and channel attention. The method aims to address the limitations of traditional convolutional neural networks (CNNs) and Transformer-based models in capturing global and local features, respectively. FusionMamba introduces a Dynamic Feature Fusion Module (DFFM) that comprises two Dynamic Feature Enhancement Modules (DFEMs) and a Cross Modality Fusion Mamba Module (CMFM). The DFEMs enhance texture detail and difference perception, while the CMFM strengthens cross-modal correlation features and suppresses redundant intermodal information.
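At the core of the visual state space models used by Mamba-based architectures is a discretized linear state space recurrence scanned over the flattened feature sequence. The following is a minimal, hypothetical scalar-state sketch of that recurrence (the paper's actual implementation uses multi-dimensional, input-dependent parameters):

```python
def ssm_scan(xs, a, b, c):
    """Minimal discretized state space recurrence behind Mamba-style models.

    For each input x_t in the sequence:
        h_t = a * h_{t-1} + b * x_t   (state update)
        y_t = c * h_t                 (output projection)
    Scalar state is used here purely for clarity; real models use a
    higher-dimensional state and selective (input-dependent) a, b, c.
    """
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys
```

Because the recurrence is linear in the state, it can be evaluated in a single pass over the sequence, which is what gives state space models their efficiency advantage over quadratic-cost attention.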
Experimental results demonstrate that FusionMamba outperforms state-of-the-art methods in various multimodal image fusion tasks, including CT-MRI, PET-MRI, SPECT-MRI, IR-VIS, and GFP-PC fusion, achieving superior performance in structural fidelity, content difference, feature information, and visual fidelity. The method's effectiveness is further validated through ablation studies and a computational cost analysis, showing its robustness and efficiency.
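The channel attention mentioned above typically follows a squeeze-and-excitation pattern: pool each channel to a scalar, pass the pooled vector through a small gating network, and reweight the channels. The sketch below is a hypothetical pure-Python illustration of that pattern, not the paper's exact module:

```python
import math

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention on a C x H x W map.

    features: list of C channels, each an H x W list of lists.
    w1: bottleneck weights, shape (C // r) x C  (hypothetical reduction r).
    w2: expansion weights, shape C x (C // r).
    """
    C = len(features)
    # Squeeze: global average pooling per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in features]
    # Excitation: bottleneck MLP with ReLU, then sigmoid gates per channel.
    hidden = [max(0.0, sum(w1[i][c] * pooled[c] for c in range(C)))
              for i in range(len(w1))]
    gates = [1.0 / (1.0 + math.exp(-sum(w2[c][i] * hidden[i]
                                        for i in range(len(hidden)))))
             for c in range(C)]
    # Scale: reweight every spatial location of each channel by its gate.
    return [[[v * gates[c] for v in row] for row in features[c]]
            for c in range(C)]
```

The gating lets the fusion network emphasize informative channels from one modality while damping redundant ones, which is the role channel attention plays in suppressing intermodal redundancy.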