18 Feb 2024 | Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu, Maosong Sun
**LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks**
This paper introduces LoRA-Flow, a dynamic fusion method that combines lightweight low-rank adaptation (LoRA) modules to customize large language models (LLMs) for specific tasks or domains. LoRA-Flow addresses a limitation of existing LoRA fusion methods, which typically use static task-level weights, by introducing dynamic weights that adjust the influence of each LoRA at every step of the generation process. This is particularly beneficial for complex generative tasks, such as solving mathematical problems or generating code, where different tokens may require different skills.
**Key Contributions:**
1. **Dynamic Fusion Weights:** LoRA-Flow uses a fusion gate with a small number of parameters to dynamically determine the fusion weights for each token, allowing for context-specific adjustments (see the sketch after this list).
2. **Performance Improvement:** Experiments on six generative tasks demonstrate that LoRA-Flow consistently outperforms baselines that use static task-level fusion weights.
3. **Flexibility and Generalization:** The method is shown to be effective across different languages and task combinations, highlighting its flexibility and generalization capabilities.
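To make the per-token gating concrete, below is a minimal PyTorch sketch of such a fusion gate. It assumes a hidden size `d_model` and `k` pre-trained LoRAs, projects the current hidden state to `k` logits, and normalizes them with a softmax; the class name and shapes are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class FusionGate(nn.Module):
    """Illustrative per-token gate: maps a hidden state to weights over k LoRAs."""

    def __init__(self, d_model: int, num_loras: int):
        super().__init__()
        # A single linear projection keeps the gate's parameter count tiny
        # relative to the frozen base model and the LoRAs themselves.
        self.proj = nn.Linear(d_model, num_loras)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model)
        # returns: (batch, seq_len, num_loras), one weight vector per token.
        return torch.softmax(self.proj(hidden), dim=-1)
```

Because the gate sees the hidden state at every position, the mixture over LoRAs can shift from token to token instead of being fixed for the whole task.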
**Methodology:**
- **Fusion Gate:** The fusion gate projects the layer's input hidden states into fusion weights; the gate's few parameters can be learned with only 200 training examples.
- **Fusion Weights Integration:** The learned fusion weights are used at each layer to compute a weighted combination of the outputs of the different LoRAs, treating each LoRA as a complete module.
- **Training Algorithm:** The fusion gate is trained on a few-shot dataset to produce the dynamic fusion weights, while the base model and the LoRAs remain frozen; a minimal sketch of this setup follows the list.
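As a rough illustration of this fusion and freezing setup, under the same assumptions as the gate sketch above, the code below wraps a frozen linear layer with several frozen LoRA adapters and mixes their outputs with per-token gate weights. `LoRALayer`, the rank, and the layer shapes are illustrative, not the paper's exact code.

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    """Standard LoRA adapter: a low-rank update B(A(x)) added to a base layer."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.A = nn.Linear(d_in, rank, bias=False)
        self.B = nn.Linear(rank, d_out, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.B(self.A(x))

class FusedLinear(nn.Module):
    """Frozen base layer plus k frozen LoRAs, fused with per-token gate weights."""

    def __init__(self, base: nn.Linear, loras: list[LoRALayer], gate: nn.Module):
        super().__init__()
        self.base = base
        self.loras = nn.ModuleList(loras)
        self.gate = gate  # e.g. the FusionGate sketched earlier
        # Only the gate is trainable; the base weights and the LoRAs stay frozen.
        for p in self.base.parameters():
            p.requires_grad = False
        for p in self.loras.parameters():
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.gate(x)                          # (batch, seq, k)
        lora_out = torch.stack(
            [lora(x) for lora in self.loras], dim=-1    # (batch, seq, d_out, k)
        )
        fused = (lora_out * weights.unsqueeze(-2)).sum(dim=-1)
        return self.base(x) + fused
```

During training, only `gate.parameters()` would be passed to the optimizer and updated on the few-shot data (around 200 examples), which matches the description above of keeping the model and the LoRAs frozen.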
**Experiments:**
- **Setup:** The experiments use Llama-2 as the base model and evaluate LoRA-Flow on tasks such as Chinese math, Russian math, Spanish math, and English code generation, among others.
- **Results:** LoRA-Flow consistently outperforms baselines, demonstrating the effectiveness of dynamic fusion weights in complex generative tasks.
- **Analysis:** Comprehensive analyses reveal that the fusion weights vary across different layers and time steps, providing insights into the behavior of LoRA-Flow.
**Conclusion:**
LoRA-Flow is a novel approach that enhances the reusability and adaptability of LLMs by dynamically combining LoRAs based on the current context. This method shows promise in improving the performance of LLMs on complex generative tasks, making it a valuable contribution to the field of parameter-efficient fine-tuning.