Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

4 Jan 2024 | Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao
This paper provides a comprehensive survey of techniques for improving the resource efficiency of Large Language Models (LLMs). The authors categorize these techniques by the resource they target (computational, memory, energy, financial, and network) and by their applicability across the stages of an LLM's lifecycle, including architecture design, pre-training, fine-tuning, and system design. The survey introduces a nuanced taxonomy of resource-efficiency techniques and proposes standardized evaluation metrics and datasets to enable consistent, fair comparisons.

The paper also discusses the central challenges in building resource-efficient LLMs: low parallelism in auto-regressive generation, the quadratic complexity of self-attention layers, scaling laws, generalization, system design, and ethical considerations. The authors highlight efficient architecture design, pre-training strategies, fine-tuning techniques, and system-level optimizations as the main levers for addressing these challenges. The survey concludes with a discussion of current bottlenecks and future research directions, emphasizing the need for more sustainable and efficient LLMs.
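Two of the challenges above are easy to see concretely. The minimal sketch below (an illustrative example written for this summary, not code from the survey; it assumes PyTorch) shows why self-attention cost grows quadratically with sequence length and why auto-regressive generation resists parallelization:

```python
import torch

def naive_self_attention(x):
    # x: (n, d). Identity Q/K/V projections keep the sketch minimal;
    # the point is the (n, n) score matrix, whose compute and memory
    # grow quadratically with sequence length n.
    n, d = x.shape
    scores = x @ x.T / d ** 0.5       # (n, n): O(n^2) time and memory
    weights = torch.softmax(scores, dim=-1)
    return weights @ x                # (n, d)

# Auto-regressive decoding is inherently sequential: token t+1 cannot
# be computed until token t exists, which limits parallelism at inference.
tokens = torch.randn(1, 64)           # a single "prompt" embedding
for _ in range(8):
    out = naive_self_attention(tokens)
    next_token = out[-1:].detach()    # stand-in for sampling a new token
    tokens = torch.cat([tokens, next_token], dim=0)
```

Much of the work the survey organizes, from efficient architecture design to inference-time system optimizations, targets exactly these two costs.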