PocketLLM: Enabling On-Device Fine-Tuning for Personalized LLMs

1 Jul 2024 | Dan Peng, Zhihui Fu, Jun Wang
PocketLLM enables on-device fine-tuning of large language models (LLMs), producing personalized models on mobile devices while addressing privacy concerns by keeping user data local. Traditional derivative-based optimization (backpropagation) is memory-intensive because it must store gradients and optimizer states, making it impractical on mobile hardware. The paper instead adopts derivative-free optimization, which estimates updates from forward passes alone and therefore avoids storing gradients and optimizer states altogether. In experiments, RoBERTa-large and OPT-1.3B were fine-tuned on an OPPO Reno 6 smartphone using approximately 4 GB and 6.5 GB of memory, respectively, demonstrating that on-device LLM fine-tuning is feasible.

The approach preserves data privacy and enables personalization on resource-constrained devices, but it has limitations: memory requirements remain substantial, and convergence is slower than with derivative-based methods. Future work aims to improve efficiency and make better use of mobile hardware capabilities. Overall, the study highlights the potential of derivative-free optimization for on-device LLM fine-tuning, paving the way for more efficient, privacy-preserving applications on mobile devices.
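This summary does not spell out the exact optimizer, but a widely used derivative-free scheme for LLM fine-tuning (SPSA-style zeroth-order optimization, as in MeZO) estimates the gradient from two forward passes under a random perturbation, then regenerates that perturbation from a stored seed to apply the update, so no gradient tensors or optimizer states ever need to be held in memory. The sketch below illustrates one such step under these assumptions; the function and parameter names are hypothetical, not PocketLLM's actual API:

```python
import torch


@torch.no_grad()
def zo_step(model, loss_fn, batch, lr=1e-6, eps=1e-3, seed=0):
    """One SPSA-style zeroth-order step: two forward passes, no backprop.

    Only the scalar seed is stored between phases; the random direction z
    is regenerated on demand, so memory stays close to inference cost.
    """
    def perturb(scale):
        # Regenerate the same random direction z from the seed and shift
        # every parameter in place by scale * eps * z.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.add_(scale * eps * z)

    perturb(+1)                       # theta + eps * z
    loss_plus = loss_fn(model, batch)
    perturb(-2)                       # theta - eps * z
    loss_minus = loss_fn(model, batch)
    perturb(+1)                       # restore original theta

    # Scalar finite-difference estimate of the directional derivative.
    grad_est = (loss_plus - loss_minus) / (2 * eps)

    # Apply the update by regenerating z once more from the same seed.
    torch.manual_seed(seed)
    for p in model.parameters():
        z = torch.randn_like(p)
        p.add_(-lr * grad_est * z)

    return loss_plus
```

Because the update direction is reconstructed from a seed rather than stored, peak memory is dominated by the model weights and activations of a forward pass, which is what makes the reported 4 GB and 6.5 GB footprints plausible on a smartphone. The trade-off, as the paper notes, is slower convergence than derivative-based training.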