Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models

20 Feb 2024 | Yae Jee Cho, Luyang Liu, Zheng Xu, Aldi Fahrezi, Gauri Joshi
The paper "Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models" addresses the challenge of fine-tuning large foundation models (FMs) on devices with limited resources, a scenario known as on-device foundation models (ODFMs). The authors propose a novel method called HETLoRA (Heterogeneous Low-Rank Approximations) to tackle the data and system heterogeneity issues in federated learning (FL). HETLoRA allows for the use of different ranks across client devices, which helps in balancing overfitting and convergence speed. The method involves rank self-pruning locally and sparsity-weighted aggregation at the server, combining the benefits of high and low-rank LoRAs. The paper demonstrates that HETLoRA outperforms homogeneous LoRA in terms of training speed, communication/computation efficiency, and final performance. The authors also provide experimental results showing the effectiveness of HETLoRA on real-world datasets, such as multi-session chat and text summarization tasks. The contributions of the paper include the introduction of HETLoRA, its evaluation on various tasks, and a discussion on the practical implications and future directions.The paper "Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models" addresses the challenge of fine-tuning large foundation models (FMs) on devices with limited resources, a scenario known as on-device foundation models (ODFMs). The authors propose a novel method called HETLoRA (Heterogeneous Low-Rank Approximations) to tackle the data and system heterogeneity issues in federated learning (FL). HETLoRA allows for the use of different ranks across client devices, which helps in balancing overfitting and convergence speed. The method involves rank self-pruning locally and sparsity-weighted aggregation at the server, combining the benefits of high and low-rank LoRAs. The paper demonstrates that HETLoRA outperforms homogeneous LoRA in terms of training speed, communication/computation efficiency, and final performance. The authors also provide experimental results showing the effectiveness of HETLoRA on real-world datasets, such as multi-session chat and text summarization tasks. The contributions of the paper include the introduction of HETLoRA, its evaluation on various tasks, and a discussion on the practical implications and future directions.