5 Mar 2024 | Yutong Li, Lu Chen, Aiwei Liu, Kai Yu, Lijie Wen
ChatCite is an LLM-based agent for comparative literature summarization that incorporates human workflow guidance to improve the quality and comprehensiveness of generated summaries. The agent has a two-module structure: a Key Element Extractor and a Reflective Incremental Generator. The Key Element Extractor identifies the essential elements of each cited work, while the Reflective Incremental Generator iteratively refines the summary through comparative analysis and reflective evaluation. This design addresses common failures of directly prompted LLM summaries, such as missing key elements, lack of comparative analysis, and poor organizational structure. ChatCite outperforms other models across multiple dimensions, including quality, coherence, and citation accuracy. The paper also introduces G-Score, an LLM-based automatic evaluation metric that aligns well with human judgments. Experimental results show that ChatCite produces more accurate and comprehensive literature summaries that can be used directly when drafting literature reviews, demonstrating both the value of human workflow guidance in improving the quality and stability of LLM output and the potential of LLMs for complex inferential summarization tasks. The authors note limitations, including the focus on specific topics and the need for further research to improve the stability and accuracy of the generated results.
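To make the two-module workflow concrete, here is a minimal sketch of a ChatCite-style pipeline. The prompts, function names (extract_key_elements, reflective_incremental_generate), model choice, and the OpenAI client wiring are illustrative assumptions, not the paper's actual implementation or prompt wording; any chat-style LLM backend could stand in for call_llm.

```python
# A minimal sketch of a ChatCite-style two-module pipeline (assumptions, not the paper's code).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; swap in any chat LLM backend


def call_llm(prompt: str) -> str:
    """Send a single-turn prompt to the model and return its reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def extract_key_elements(reference_paper: str) -> str:
    """Key Element Extractor: pull the essentials out of one cited work."""
    return call_llm(
        "Extract the research problem, method, and main results from the "
        "following paper text:\n\n" + reference_paper
    )


def reflective_incremental_generate(proposed_work: str, references: list[str]) -> str:
    """Reflective Incremental Generator: extend the summary one cited work at a
    time, comparing it against the proposed work, then reflect on and refine
    the running draft."""
    summary = ""
    for ref in references:
        elements = extract_key_elements(ref)
        draft = call_llm(
            "Current summary:\n" + summary
            + "\n\nKey elements of the next cited work:\n" + elements
            + "\n\nProposed work:\n" + proposed_work
            + "\n\nCompare the cited work with the proposed work and extend "
            "the summary with one well-organized paragraph."
        )
        # Reflective step: critique the draft against the failure modes the
        # paper targets (missing elements, weak comparison, poor organization).
        summary = call_llm(
            "Check this draft for missing key elements, weak comparative "
            "analysis, and poor organization, then return an improved "
            "version:\n\n" + draft
        )
    return summary
```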
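Similarly, the following is a hypothetical sketch of an LLM-as-judge metric in the spirit of G-Score. The dimension names, 1-5 scale, and simple averaging are assumptions for illustration; the paper defines its own scoring protocol. The judge callable could be the call_llm helper from the sketch above.

```python
# A hypothetical LLM-as-judge scorer in the spirit of G-Score (not the paper's exact definition).
from typing import Callable

DIMENSIONS = ["quality", "coherence", "comparative analysis", "citation accuracy"]


def g_score(summary: str, reference_abstracts: list[str],
            judge: Callable[[str], str]) -> float:
    """Ask an LLM judge to rate the summary on each dimension (1-5) and
    return the mean score."""
    scores = []
    for dim in DIMENSIONS:
        verdict = judge(
            f"Rate the following literature summary from 1 to 5 for {dim}. "
            "Reply with a single integer only.\n\nSummary:\n" + summary
            + "\n\nReference abstracts:\n" + "\n---\n".join(reference_abstracts)
        )
        scores.append(float(verdict.strip()))
    return sum(scores) / len(scores)


# Usage (assuming the call_llm helper from the previous sketch):
# print(g_score(generated_summary, abstracts, judge=call_llm))
```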