PocketLLM: Enabling On-Device Fine-Tuning for Personalized LLMs

1 Jul 2024 | Dan Peng, Zhihui Fu, Jun Wang
PocketLLM enables on-device fine-tuning of large language models (LLMs), producing personalized models on mobile devices while addressing privacy concerns by keeping user data local. Traditional derivative-based optimization (backpropagation) is memory-intensive because it must store gradients and optimizer states, making it impractical on mobile hardware. The paper instead adopts derivative-free optimization, which estimates updates from forward passes alone and therefore avoids storing gradients and optimizer states altogether. In experiments, RoBERTa-large and OPT-1.3B were fine-tuned on an OPPO Reno 6 smartphone using approximately 4 GB and 6.5 GB of memory, respectively, demonstrating that on-device LLM fine-tuning is feasible.

The approach preserves data privacy and enables personalization on resource-constrained devices, but it has limitations: memory requirements remain substantial, and convergence is slower than with derivative-based methods. Future work aims to improve efficiency and make better use of mobile hardware capabilities. Overall, the study highlights the potential of derivative-free optimization for on-device LLM fine-tuning, paving the way for more efficient, privacy-preserving applications on mobile devices.
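This summary does not spell out the exact optimizer, but a widely used derivative-free scheme for LLM fine-tuning (SPSA-style zeroth-order optimization, as in MeZO) estimates the gradient from two forward passes under a random perturbation, then regenerates that perturbation from a stored seed to apply the update, so no gradient tensors or optimizer states ever need to be held in memory. The sketch below illustrates one such step under these assumptions; the function and parameter names are hypothetical, not PocketLLM's actual API:

```python
import torch


@torch.no_grad()
def zo_step(model, loss_fn, batch, lr=1e-6, eps=1e-3, seed=0):
    """One SPSA-style zeroth-order step: two forward passes, no backprop.

    Only the scalar seed is stored between phases; the random direction z
    is regenerated on demand, so memory stays close to inference cost.
    """
    def perturb(scale):
        # Regenerate the same random direction z from the seed and shift
        # every parameter in place by scale * eps * z.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.add_(scale * eps * z)

    perturb(+1)                       # theta + eps * z
    loss_plus = loss_fn(model, batch)
    perturb(-2)                       # theta - eps * z
    loss_minus = loss_fn(model, batch)
    perturb(+1)                       # restore original theta

    # Scalar finite-difference estimate of the directional derivative.
    grad_est = (loss_plus - loss_minus) / (2 * eps)

    # Apply the update by regenerating z once more from the same seed.
    torch.manual_seed(seed)
    for p in model.parameters():
        z = torch.randn_like(p)
        p.add_(-lr * grad_est * z)

    return loss_plus
```

Because the update direction is reconstructed from a seed rather than stored, peak memory is dominated by the model weights and activations of a forward pass, which is what makes the reported 4 GB and 6.5 GB footprints plausible on a smartphone. The trade-off, as the paper notes, is slower convergence than derivative-based training.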