This paper presents a benchmark study on zeroth-order (ZO) optimization for memory-efficient fine-tuning of large language models (LLMs). As LLMs grow in size, the memory overhead of first-order (FO) gradient computation via back-propagation (BP) becomes a significant bottleneck. To address this, the authors turn to ZO optimization, which estimates gradients from forward passes alone and thereby avoids the memory cost of BP. They evaluate a range of ZO optimization methods across five LLM families, three task complexities, and five fine-tuning schemes, revealing previously overlooked optimization principles such as the importance of task alignment and the role of forward gradients. The study also introduces enhancements, including block-wise descent, hybrid ZO and FO training, and gradient sparsity, to improve ZO-based LLM fine-tuning. The results show that ZO optimization can achieve performance competitive with FO methods while remaining memory efficient, and they expose trade-offs between algorithm complexity and fine-tuning performance. These findings suggest that ZO optimization is a promising approach for memory-efficient LLM fine-tuning, particularly in resource-constrained environments. The authors provide a comprehensive benchmarking framework and code for reproducing their experiments.
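To make the core mechanism concrete, below is a minimal sketch of a two-point ZO gradient estimate in the style of MeZO-like ZO-SGD: the gradient is approximated from two forward passes along a random direction, so no BP graph is stored, and the perturbation is regenerated from a shared seed rather than kept in memory. This is an illustrative sketch, not the benchmark's exact implementation; the names `model`, `loss_fn`, and `batch`, along with the step size and perturbation scale, are assumptions.

```python
import torch

def zo_step(model, loss_fn, batch, lr=1e-6, eps=1e-3):
    """One ZO-SGD step via a two-point (SPSA-style) gradient estimate.

    Uses only forward passes; the random direction z is re-created
    from a shared seed, so memory stays at inference level.
    """
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    def perturb(scale):
        # Regenerate the same random direction z from the seed and
        # shift parameters in place: theta <- theta + scale * eps * z.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.data.add_(scale * eps * z)

    with torch.no_grad():
        perturb(+1)                       # theta + eps * z
        loss_plus = loss_fn(model, batch)
        perturb(-2)                       # theta - eps * z
        loss_minus = loss_fn(model, batch)
        perturb(+1)                       # restore theta

        # Projected directional derivative: (L+ - L-) / (2 * eps)
        grad_scalar = (loss_plus - loss_minus) / (2 * eps)

        # Descend along the same direction z, regenerated from the seed.
        torch.manual_seed(seed)
        for p in model.parameters():
            z = torch.randn_like(p)
            p.data.add_(-lr * grad_scalar * z)

    return loss_plus
```

Each step costs two forward passes instead of a forward-backward pass, which is the source of the memory savings the paper benchmarks; the seed trick avoids materializing the full perturbation vector alongside the parameters.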