FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba

2024 | Xinyu Xie, Yawen Cui, Chio-In IEONG, Tao Tan, Xiaozhi Zhang, Xubin Zheng, Zitong Yu
FusionMamba is a dynamic feature enhancement method for multimodal image fusion built on the Mamba framework. It integrates an improved, efficient Mamba block with dynamic convolution and channel attention, strengthening global modeling while improving local feature extraction. On top of this, a dynamic feature fusion module (DFFM) combines two dynamic feature enhancement modules (DFEM) with a cross-modality fusion Mamba module (CMFM): the DFEMs handle dynamic texture enhancement and difference perception, while the CMFM enhances correlated features between modalities and suppresses redundant inter-modal information.

FusionMamba achieves state-of-the-art performance on a range of multimodal fusion tasks, including CT-MRI, PET-MRI, and SPECT-MRI fusion, infrared-visible image fusion, and the GFP-PC dataset, demonstrating strong generalization. Because the dynamic feature enhancement and cross-modal fusion strategies build on Mamba's state-space formulation, the model captures long-range dependencies with linear complexity. Performance is evaluated with six metrics: structural fidelity, structure content difference, the multiscale structural similarity index measure (MS-SSIM), a gradient-based metric, feature mutual information, and visual information fidelity. FusionMamba outperforms competing methods across these metrics, indicating superior retention of functional and morphological information in the fused images.
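The module layout described above can be summarised in a short sketch. The PyTorch code below is a minimal illustration under explicit assumptions: the dynamic convolution is approximated by a small kernel bank mixed with a learned softmax, channel attention is a squeeze-and-excitation-style gate, and the Mamba-based CMFM is stood in for by a simple sigmoid-gated fusion. The names DFEM and DFFM mirror the paper, but every internal detail here is an assumption rather than the authors' implementation.

```python
# Minimal sketch of the DFFM layout: one DFEM per modality, then a fusion step.
# All internals are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DFEM(nn.Module):
    """Illustrative DFEM: input-conditioned (dynamic) convolution + channel attention."""

    def __init__(self, channels: int, num_kernels: int = 4):
        super().__init__()
        # Bank of candidate 3x3 kernels, mixed per sample by a learned softmax.
        self.kernels = nn.Parameter(torch.randn(num_kernels, channels, channels, 3, 3) * 0.02)
        self.kernel_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, num_kernels), nn.Softmax(dim=-1),
        )
        self.channel_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        weights = self.kernel_attn(x)                                   # (B, K)
        mixed = torch.einsum("bk,koihw->boihw", weights, self.kernels)  # per-sample kernels
        # Apply per-sample kernels via a grouped convolution.
        out = F.conv2d(x.reshape(1, b * c, h, w), mixed.reshape(b * c, c, 3, 3),
                       padding=1, groups=b).reshape(b, c, h, w)
        return x + out * self.channel_attn(out)                         # texture-enhanced features


class DFFM(nn.Module):
    """Two DFEMs (one per modality) followed by a gated cross-modality fusion."""

    def __init__(self, channels: int):
        super().__init__()
        self.dfem_a, self.dfem_b = DFEM(channels), DFEM(channels)
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        a, b = self.dfem_a(feat_a), self.dfem_b(feat_b)
        g = self.gate(torch.cat([a, b], dim=1))  # keep correlated content, damp redundancy
        return g * a + (1 - g) * b


if __name__ == "__main__":
    dffm = DFFM(channels=32)
    fused = dffm(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))
    print(fused.shape)  # torch.Size([1, 32, 64, 64])
```

In the paper, the cross-modality fusion step is itself a Mamba block operating on the two modality feature streams; the sigmoid gate above is only a placeholder that keeps the sketch self-contained.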
Computational efficiency is demonstrated through a comparison with CNN- and Transformer-based methods, where FusionMamba shows lower FLOPs and shorter average runtime. Ablation experiments confirm the effectiveness of the individual components, including the DVSS module, the DFFM, and the CMFM. Together, the results validate the generalization ability of the proposed method and its potential for real-time applications and deployment on resource-constrained devices.
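As a rough guide to how such an efficiency comparison is usually run, the snippet below counts multiply-accumulate operations and averages the runtime of repeated forward passes. The placeholder network, the input resolution, and the choice of the third-party thop profiler are all assumptions for illustration; the paper's exact measurement setup is not reproduced here.

```python
# Hedged sketch of a FLOPs/runtime comparison; `model` is a stand-in network,
# not FusionMamba itself.
import time
import torch
import torch.nn as nn
from thop import profile  # pip install thop

model = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(), nn.Conv2d(32, 1, 3, padding=1)
).eval()
x = torch.randn(1, 2, 256, 256)  # two modalities stacked along the channel axis

with torch.no_grad():
    macs, params = profile(model, inputs=(x,), verbose=False)
    for _ in range(10):           # warm-up passes
        model(x)
    runs = 100
    start = time.perf_counter()
    for _ in range(runs):         # timed passes
        model(x)
    avg_ms = (time.perf_counter() - start) / runs * 1e3

print(f"MACs: {macs / 1e9:.2f} G | params: {params / 1e6:.2f} M | avg runtime: {avg_ms:.2f} ms")
```

On a GPU, the timed loop would additionally need torch.cuda.synchronize() before reading each timer so that asynchronous kernels are fully accounted for.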