MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion

12 Apr 2024 | Zhe Li, Haiwei Pan, Kejia Zhang, Yuhua Wang, Fengming Yu
MambaDFuse is a Mamba-based dual-phase model for multi-modality image fusion, designed to address the limitations of existing methods in extracting modality-specific and modality-fused features. The model consists of three stages: dual-level feature extraction, dual-phase feature fusion, and fused image reconstruction. The dual-level feature extractor combines CNNs and Mamba blocks to capture low- and high-level features from single-modality images. The dual-phase feature fusion module uses channel exchange for shallow fusion and enhanced Multi-modal Mamba (M3) blocks for deep fusion, combining complementary information from the different modalities. The fused image reconstruction module applies the inverse transformation to generate the fused result.

MambaDFuse achieves promising results in infrared-visible and medical image fusion, and improves performance in downstream tasks such as object detection. Extensive experiments on benchmark datasets validate the model's efficiency and effectiveness, showing superior performance in both quantitative and qualitative evaluations. By leveraging the strengths of Mamba blocks, the proposed method enhances the fusion of multi-modal information, making it a powerful tool for multi-modality image fusion.
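For concreteness, below is a minimal PyTorch sketch of the channel-exchange shallow fusion described above. The function name, the exchange ratio, and the choice to swap the first k channels are illustrative assumptions for this sketch, not the paper's exact procedure (the paper may select channels by a different criterion).

```python
import torch

def channel_exchange(feat_a: torch.Tensor, feat_b: torch.Tensor, ratio: float = 0.5):
    """Swap a fraction of channels between two modality feature maps.

    feat_a, feat_b: (B, C, H, W) features from the two modalities
                    (e.g., infrared and visible).
    ratio: fraction of channels to exchange (hypothetical default).
    """
    assert feat_a.shape == feat_b.shape, "modality features must match in shape"
    c = feat_a.shape[1]
    k = int(c * ratio)  # number of channels to exchange
    # Assumed policy: exchange the first k channels of each feature map.
    fused_a = torch.cat([feat_b[:, :k], feat_a[:, k:]], dim=1)
    fused_b = torch.cat([feat_a[:, :k], feat_b[:, k:]], dim=1)
    return fused_a, fused_b

# Usage example with dummy single-modality features.
ir = torch.randn(1, 64, 128, 128)   # infrared branch features
vis = torch.randn(1, 64, 128, 128)  # visible branch features
ir_x, vis_x = channel_exchange(ir, vis)
```

In this reading, channel exchange is parameter-free: each branch directly receives a slice of the other modality's features, giving a cheap shallow fusion before the deeper M3-based fusion.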