Task-Customized Mixture of Adapters for General Image Fusion


24 Mar 2024 | Pengfei Zhu, Yang Sun, Bing Cao*, Qinghua Hu
This paper proposes a task-customized mixture of adapters (TC-MoA) for general image fusion. TC-MoA is a novel parameter-efficient fine-tuning method that adaptively prompts various fusion tasks within a unified model. The method is inspired by the mixture-of-experts (MoE) approach, where each expert serves as an efficient tuning adapter that prompts a pre-trained foundation model. These adapters are shared across different tasks and constrained by mutual information regularization, ensuring compatibility across tasks while remaining complementary across multi-source images. Task-specific routing networks customize these adapters to extract task-specific information from different sources with dynamic dominant intensity, performing adaptive visual feature prompt fusion. Notably, TC-MoA controls the dominant intensity bias for different fusion tasks, successfully unifying multiple fusion tasks in a single model. Extensive experiments show that TC-MoA outperforms competing approaches in learning commonalities while retaining compatibility for general image fusion (multi-modal, multi-exposure, and multi-focus), and also demonstrates striking controllability in broader generalization experiments. The code is available at https://github.com/YangSun22/TC-MoA.
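To make the idea concrete, below is a minimal PyTorch sketch of a mixture-of-adapters layer with per-task routing, in the spirit of the abstract. The class names (`Adapter`, `TCMoALayer`), the bottleneck adapter design, the softmax router, and the fixed 0.5/0.5 source blend are assumptions made for illustration; they do not reproduce the authors' actual architecture, which additionally learns a per-task, per-token dominant intensity and applies mutual information regularization (see the official repository linked above).

```python
# Illustrative sketch only: names and design choices here are assumptions,
# not the authors' implementation from https://github.com/YangSun22/TC-MoA.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adapter(nn.Module):
    """Lightweight bottleneck adapter: down-project -> GELU -> up-project."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(F.gelu(self.down(x)))


class TCMoALayer(nn.Module):
    """Adapter experts shared across tasks, with one routing network per task."""
    def __init__(self, dim: int, num_experts: int = 4, num_tasks: int = 3):
        super().__init__()
        self.experts = nn.ModuleList([Adapter(dim) for _ in range(num_experts)])
        # One router per fusion task: token features -> soft weights over experts.
        self.routers = nn.ModuleList(
            [nn.Linear(dim, num_experts) for _ in range(num_tasks)]
        )

    def forward(
        self, tokens_a: torch.Tensor, tokens_b: torch.Tensor, task_id: int
    ) -> torch.Tensor:
        """Fuse (batch, num_tokens, dim) features from two sources for one task."""
        router = self.routers[task_id]
        prompted = []
        for tokens in (tokens_a, tokens_b):
            weights = torch.softmax(router(tokens), dim=-1)  # (B, N, E)
            expert_out = torch.stack(
                [expert(tokens) for expert in self.experts], dim=-1
            )  # (B, N, D, E)
            prompt = torch.einsum("bnde,bne->bnd", expert_out, weights)
            prompted.append(tokens + prompt)  # residual feature prompt
        # Simplified blend of the two prompted sources; the paper instead learns
        # a task- and token-dependent dominant intensity for this step.
        return 0.5 * prompted[0] + 0.5 * prompted[1]


# Usage: fuse ViT-style token features from two sources for task 0
# (e.g. infrared/visible fusion in the multi-modal setting).
layer = TCMoALayer(dim=768)
src_a = torch.randn(2, 196, 768)
src_b = torch.randn(2, 196, 768)
fused = layer(src_a, src_b, task_id=0)
print(fused.shape)  # torch.Size([2, 196, 768])
```

In this sketch the pre-trained backbone that produces the token features stays frozen; only the adapters and routers would be trained, which is what makes the approach parameter-efficient.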