Chain of LoRA (COLA) is an iterative optimization framework, inspired by the Frank-Wolfe algorithm, for efficient fine-tuning of large language models (LLMs). COLA aims to close the gap between Low-Rank Adaptation (LoRA) and full-parameter fine-tuning without incurring additional computational or memory costs. It employs a residual learning procedure: after each training phase, the learned LoRA modules are merged into the pre-trained model parameters, and fresh LoRA modules are initialized for the next phase. The chain thus learns a sequence of low-rank matrix decompositions whose sum approximates a higher-rank update to the frozen weights (ΔW ≈ B₁A₁ + B₂A₂ + …), something a single low-rank module cannot express, which can improve generalization.

On the theoretical side, the analysis shows that COLA converges to stationary points in nonconvex optimization settings. Empirically, COLA consistently outperforms LoRA across OPT and LLaMA-2 models on seven benchmark tasks, with relative test-accuracy gains of up to 6.47% on the WSC task and 4.4% with LLaMA-2-7B. COLA is also more robust, exhibiting smaller standard deviations in test scores than LoRA. The method is flexible as well: rank step-down configurations, which shrink the LoRA rank in later chain iterations, reduce computational cost while maintaining performance. Overall, COLA offers a more efficient and effective approach to parameter-efficient fine-tuning of large language models.
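To make the residual-learning loop concrete, here is a minimal sketch in plain PyTorch. It is an illustration under stated assumptions, not the authors' implementation: it adapts a single nn.Linear layer with a toy regression loss, and the names LoRALinear and chain_of_lora, the optimizer choice, the initialization scale, and the rank schedule are all invented for this example.

```python
# Minimal sketch of the COLA residual-learning loop (illustrative only).
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank residual B @ A."""

    def __init__(self, base: nn.Linear, rank: int):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        # Standard LoRA-style init: A is small random, B is zero,
        # so the residual starts as an exact zero update.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

    @torch.no_grad()
    def merge(self):
        """Fold the learned residual into the frozen weights."""
        self.base.weight += self.B @ self.A
        return self.base


def chain_of_lora(base: nn.Linear, data, ranks, steps_per_link=100, lr=1e-3):
    """Train a chain of LoRA modules, merging each one before the next.

    The accumulated update is a sum of low-rank terms, so the effective
    correction to the pre-trained weights can exceed any single LoRA rank.
    """
    for rank in ranks:  # e.g. [8, 4, 2] for a rank step-down schedule
        lora = LoRALinear(base, rank)
        opt = torch.optim.AdamW([lora.A, lora.B], lr=lr)
        for _, (x, y) in zip(range(steps_per_link), data):
            loss = nn.functional.mse_loss(lora(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        base = lora.merge()  # residual absorbed; next link starts from zero
    return base
```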
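A hypothetical driver for the sketch, using three chain links with a rank step-down schedule; each link trains a fresh low-rank residual on top of everything merged so far:

```python
# Hypothetical usage: three chain links with a rank step-down schedule.
torch.manual_seed(0)
layer = nn.Linear(64, 64)
data = [(torch.randn(16, 64), torch.randn(16, 64)) for _ in range(100)]
layer = chain_of_lora(layer, data, ranks=[8, 4, 2], steps_per_link=100)
```

The property the sketch is meant to preserve is that each merge is exact, folding the residual into the dense weight, so inference cost and trainable-parameter count never grow with the length of the chain.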