17 Jun 2024 | HAOPENG ZHANG, University of Hawaii, Manoa, USA; PHILIP S. YU, University of Illinois at Chicago, USA; JIAWEI ZHANG, University of California, Davis, USA
This survey provides a comprehensive review of the evolution and advancements in text summarization research, focusing on the transition from statistical methods to large language models (LLMs). The paper is organized into two main parts: (1) an overview of datasets, evaluation metrics, and summarization methods before the LLM era, and (2) an examination of recent advancements in benchmarking, modeling, and evaluating summarization in the LLM era. The authors discuss the problem formulation, evaluation metrics, and commonly used datasets, and categorize summarization methods into four stages: statistical, deep learning, pre-trained language model fine-tuning, and LLMs. They also propose a new taxonomy of LLM-based summarization literature and analyze the unique features of each method. The survey highlights the challenges and open problems in the field, such as factual consistency and coherence, and outlines promising research directions to guide future developments in text summarization.