A Survey on Knowledge Distillation of Large Language Models

8 Mar 2024 | Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, Tianyi Zhou
This paper provides a comprehensive survey of Knowledge Distillation (KD) in the context of Large Language Models (LLMs), highlighting its critical role in enhancing the capabilities of open-source LLMs. The survey is structured around three foundational pillars: algorithm, skill, and verticalization, offering a detailed examination of KD mechanisms, the enhancement of specific cognitive abilities, and their practical implications across diverse fields. The paper emphasizes the interplay between data augmentation (DA) and KD, illustrating how DA enhances LLMs' performance by generating context-rich, skill-specific training data. The benefits of KD include bridging the performance gap between proprietary and open-source LLMs, improving computational efficiency, and fostering a more accessible and equitable AI landscape. The survey also discusses the challenges and future research directions in KD, advocating for ethical and legal compliance in the use of LLMs. The work aims to guide researchers and practitioners in understanding and applying current methodologies in knowledge distillation, ultimately contributing to more robust, versatile, and accessible AI solutions.
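As a rough illustration of the DA-and-KD interplay described above, the sketch below shows one common black-box distillation pattern: a proprietary teacher LLM is prompted to produce skill-specific training examples, and an open-source student model is then fine-tuned on those outputs. The `query_teacher` function, the seed instructions, and the choice of student model are illustrative placeholders, not details taken from the survey.

```python
# Minimal sketch of black-box knowledge distillation via data augmentation:
# a proprietary "teacher" LLM generates skill-specific instruction/response
# pairs, and an open-source "student" model is fine-tuned on them.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def query_teacher(prompt: str) -> str:
    # Placeholder: in practice this would call the proprietary teacher's API
    # (e.g. a chat-completion endpoint) and return its generated response.
    return "TEACHER RESPONSE FOR: " + prompt

# 1) Data augmentation: elicit context-rich, skill-specific examples.
seed_instructions = [
    "Explain overfitting to a beginner.",
    "Summarize the main causes of the 2008 financial crisis.",
]
distill_corpus = [(inst, query_teacher(inst)) for inst in seed_instructions]

# 2) Sequence-level distillation: fine-tune the student on teacher outputs.
student_name = "gpt2"  # illustrative open-source student
tok = AutoTokenizer.from_pretrained(student_name)
tok.pad_token = tok.eos_token
student = AutoModelForCausalLM.from_pretrained(student_name)
optim = torch.optim.AdamW(student.parameters(), lr=1e-5)

student.train()
for instruction, response in distill_corpus:
    batch = tok(instruction + "\n" + response,
                return_tensors="pt", truncation=True, max_length=512)
    # Standard causal-LM loss on the teacher-generated text.
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optim.step()
    optim.zero_grad()
```

The same loop extends naturally to the skill- and domain-specific settings the survey organizes under its "skill" and "verticalization" pillars, by changing what the teacher is prompted to generate.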