25 Nov 2024 | Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, and Hao Wang (Rutgers University, USA); Zifeng Wang and Sayna Ebrahimi (Google Cloud AI Research, USA)
The paper "Continual Learning of Large Language Models: A Comprehensive Survey" by Haizhou Shi et al. provides a comprehensive overview of the challenges and advancements in adapting large language models (LLMs) to evolving data distributions. The authors highlight the significant performance degradation in previously learned knowledge domains, known as "catastrophic forgetting," which is a central issue in continual learning (CL). The survey is structured into four main sections:
1. **Overview of Continual Learning of LLMs**: The paper introduces the concepts of vertical and horizontal continuity in CL: vertical continuity refers to continual adaptation from general to specific capabilities and domains, while horizontal continuity refers to continual adaptation across time and domains.
2. **Learning Stages of Continual LLMs**: This section outlines three key stages of LLM learning within modern CL: Continual Pre-Training (CPT), Domain-Adaptive Pre-Training (DAP), and Continual Fine-Tuning (CFT). It discusses the effectiveness and efficiency of CPT, the challenges of DAP, and the importance of CFT (a toy sketch of such staged, sequential adaptation appears after this list).
3. **Evaluation Protocols and Data Sources**: The paper reviews evaluation protocols for CL with LLMs and lists available data sources, emphasizing the need for practical and accessible benchmarks (a standard forgetting metric used in such evaluations is sketched below).
4. **Discussion and Future Directions**: The authors discuss emerging properties of continual LLMs, changes in the roles of conventional CL types, and the need for further research to address vertical and horizontal forgetting and to enable knowledge transfer.
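To make the staged setup concrete, here is a minimal, illustrative PyTorch sketch (not taken from the survey) of sequential adaptation through CPT-, DAP-, and CFT-like stages with a small experience-replay buffer, one common family of techniques for countering forgetting. The toy model, the synthetic `make_stage_dataset` helper, and the `replay_fraction` value are hypothetical placeholders, not the survey's method.

```python
# Toy sketch: sequential adaptation (e.g., CPT -> DAP -> CFT) with experience replay.
# Model size, data, and replay_fraction are illustrative placeholders.
import random
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, ConcatDataset, Subset

def make_stage_dataset(n=512, dim=32, num_classes=4, seed=0):
    """Random stand-in for one stage's training data (pre-training corpus, domain corpus, task data)."""
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, dim, generator=g)
    y = torch.randint(0, num_classes, (n,), generator=g)
    return TensorDataset(x, y)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))  # stand-in for an LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

stages = [make_stage_dataset(seed=s) for s in range(3)]  # CPT-, DAP-, CFT-like stages
replay_buffer = []     # small samples of earlier stages' data
replay_fraction = 0.1  # fraction of each past stage kept for replay

for stage_idx, stage_data in enumerate(stages):
    # Mix current-stage data with the replay buffer to mitigate forgetting of earlier stages.
    train_data = ConcatDataset([stage_data] + replay_buffer) if replay_buffer else stage_data
    loader = DataLoader(train_data, batch_size=64, shuffle=True)
    for epoch in range(2):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    # Keep a small random subset of this stage for future replay.
    keep = random.sample(range(len(stage_data)), int(replay_fraction * len(stage_data)))
    replay_buffer.append(Subset(stage_data, keep))
```

The same loop structure applies whether the "stages" are successive general corpora, domain corpora, or downstream tasks; only the data and the choice of forgetting-mitigation technique change.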
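Evaluation in continual settings typically re-measures performance on every earlier stage after each new round of training. The sketch below computes the commonly used average-forgetting quantity from such a score matrix; the function name `average_forgetting` and the example numbers are illustrative, and exact definitions vary across papers, so this is a sketch rather than the survey's protocol.

```python
def average_forgetting(scores):
    """Average forgetting over earlier stages, given scores[i][j] = performance
    on stage j's held-out set after finishing training on stage i (a commonly
    used continual-learning metric; details vary across papers)."""
    T = len(scores)            # number of training stages
    if T < 2:
        return 0.0
    drops = []
    for j in range(T - 1):     # every stage except the most recent one
        best_earlier = max(scores[i][j] for i in range(j, T - 1))
        drops.append(best_earlier - scores[T - 1][j])
    return sum(drops) / len(drops)

# Example: accuracy on 3 stages, measured after each of the 3 training phases.
scores = [
    [0.80, 0.10, 0.12],   # after stage 0
    [0.62, 0.85, 0.15],   # after stage 1: stage-0 accuracy has already dropped
    [0.55, 0.70, 0.88],   # after stage 2
]
print(round(average_forgetting(scores), 3))  # 0.2 -> average drop on stages 0 and 1
```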
The survey emphasizes that CPT and DAP remain underexplored, highlighting the need for more sophisticated CL techniques to counter forgetting and improve performance. The full list of papers examined in the survey is available at <https://github.com/Wang-ML-Lab/lm-continual-learning-survey>.