18 May 2024 | Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni
This paper presents a method for building and reusing a library of LoRA (Low-Rank Adaptation) adapters to improve the performance of large language models (LLMs) on new tasks. The key contributions are Model-Based Clustering (MBC), a technique that groups tasks by the similarity of their LoRA parameters, and Arrow, a zero-shot routing mechanism that dynamically selects the most relevant adapters for each new input without retraining. Evaluated on a wide range of held-out tasks, MBC-based adapters combined with Arrow routing generalize better than existing adapter-reuse approaches and match or outperform traditional joint multi-task training, highlighting the potential of modular, adaptable LLMs.

The authors build the LoRA library with a two-stage training procedure: the first stage trains private (per-task) LoRAs, and the second stage clusters tasks based on the similarity of those LoRA parameters and trains one adapter per cluster. They also study several strategies for reusing the library, including zero-shot routing and supervised task routing. The results show that Arrow routing outperforms other routing approaches when the library consists of private per-task LoRAs, whereas with MBC libraries the choice of routing matters less. The paper closes by discussing broader implications, including the potential for more efficient and flexible LLMs that can be trained and reused in a decentralized manner.
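To make the clustering step concrete, here is a minimal sketch of MBC-style grouping: each task's private LoRA is flattened into a vector, vectors are normalized so that distances reflect cosine similarity, and tasks are clustered so that one adapter can later be trained per cluster. The function name `cluster_tasks`, the `lora_vectors` input format, and the use of scikit-learn's KMeans are illustrative assumptions, not the authors' implementation.

```python
# Sketch of MBC-style clustering: group tasks by the similarity of their
# (flattened) private LoRA parameters, then train one adapter per cluster.
# The data layout and the choice of KMeans are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans


def cluster_tasks(lora_vectors: dict[str, np.ndarray], n_clusters: int) -> dict[int, list[str]]:
    """Cluster tasks whose private LoRAs point in similar directions.

    lora_vectors maps a task name to a flattened vector of its LoRA
    parameters (e.g. all A and B matrices concatenated).
    """
    names = list(lora_vectors)
    X = np.stack([lora_vectors[n] for n in names])
    # L2-normalize so that Euclidean k-means approximates cosine similarity.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)

    clusters: dict[int, list[str]] = {}
    for name, label in zip(names, labels):
        clusters.setdefault(int(label), []).append(name)
    return clusters


# Usage: a multi-task LoRA would then be trained on the pooled data of each cluster.
# clusters = cluster_tasks(lora_vectors, n_clusters=10)
```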
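The routing side can be sketched similarly. Arrow builds a per-adapter "prototype" from the top singular direction of each LoRA update and routes each token to the adapters whose prototypes it aligns with most. The snippet below is a simplified, per-layer sketch under stated assumptions: tensor shapes, the top-k value, and the softmax mixing are illustrative, not a faithful reproduction of the paper's implementation.

```python
# Sketch of Arrow-style zero-shot routing for one linear layer.
# Each adapter i contributes delta_W_i = B_i @ A_i; its prototype is the top
# right singular vector of delta_W_i, and a token is routed to the adapters
# whose prototypes it aligns with most. Shapes and top_k are assumptions.
import torch


def arrow_prototypes(A_list, B_list):
    """One unit-norm prototype per adapter: top right singular vector of B @ A."""
    protos = []
    for A, B in zip(A_list, B_list):            # A: (r, d_in), B: (d_out, r)
        delta_w = B @ A                          # (d_out, d_in)
        # Right singular vectors live in input space; keep the leading one.
        _, _, Vh = torch.linalg.svd(delta_w, full_matrices=False)
        protos.append(Vh[0] / Vh[0].norm())
    return torch.stack(protos)                   # (n_adapters, d_in)


def arrow_route(x, protos, A_list, B_list, top_k=2):
    """Mix the top-k adapter outputs for each token x of shape (batch, d_in)."""
    logits = (x @ protos.T).abs()                # sign of a singular vector is arbitrary
    top_vals, top_idx = logits.topk(top_k, dim=-1)
    weights = torch.softmax(top_vals, dim=-1)    # (batch, top_k)

    out = torch.zeros(x.shape[0], B_list[0].shape[0], device=x.device)
    for slot in range(top_k):
        for i, (A, B) in enumerate(zip(A_list, B_list)):
            mask = top_idx[:, slot] == i
            if mask.any():
                out[mask] += weights[mask, slot, None] * (x[mask] @ A.T @ B.T)
    return out                                   # added to the frozen layer's output
```

Because the prototypes are computed once from the adapter weights themselves, no routing network is trained and no data from the new task is needed, which is what makes the routing zero-shot.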