25 Jun 2024 | Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal
The paper introduces Lottery Ticket Adaptation (LoTA), a sparse adaptation method designed to address the issue of destructive interference in multi-task adaptation of large language models (LLMs). LoTA identifies and optimizes only a sparse subnetwork of the model, freezing most parameters to prevent interference between tasks. This approach mitigates challenges such as catastrophic forgetting and improves performance on various tasks, including instruction following, reasoning, math, and summarization. LoTA outperforms full fine-tuning (FFT) and low-rank adaptation (LoRA) while maintaining good performance even after training on other tasks. The method also enables model merging across highly dissimilar tasks, demonstrating its effectiveness in practical applications. The authors provide a detailed evaluation of LoTA on multiple datasets and tasks, showing its superior performance and robustness compared to existing methods.
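To make the "sparse subnetwork" idea concrete, here is a minimal PyTorch sketch of lottery-ticket-style sparse adaptation. It assumes the subnetwork is selected by the magnitude of weight updates after a short calibration fine-tune, and that subsequent training zeroes gradients outside the mask; the function names (`extract_mask`, `sparse_adaptation_step`), the `sparsity` default, and the selection criterion are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def extract_mask(base_model, calibrated_model, sparsity=0.9):
    """Keep the (1 - sparsity) fraction of weights that moved most during a
    short calibration fine-tune; everything else will stay frozen."""
    masks = {}
    with torch.no_grad():
        for (name, w0), (_, w1) in zip(base_model.named_parameters(),
                                       calibrated_model.named_parameters()):
            delta = (w1 - w0).abs()
            k = max(1, int((1 - sparsity) * delta.numel()))
            # Threshold = smallest delta among the top-k largest updates.
            threshold = delta.flatten().topk(k).values.min()
            masks[name] = (delta >= threshold).float()
    return masks

def sparse_adaptation_step(model, masks, loss, optimizer):
    """One training step in which gradients outside the sparse mask are
    zeroed, so only the selected subnetwork is ever updated."""
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None and name in masks:
                p.grad.mul_(masks[name])
    optimizer.step()
```

Because each task touches only its own sparse set of coordinates, task vectors from dissimilar tasks can overlap less, which is the intuition behind why this style of adaptation helps model merging.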