LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration
13 Mar 2024 | Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang
The paper introduces LONGAGENT, a method for scaling large language models (LLMs) to handle long texts with a context window of 128k tokens. LONGAGENT uses multi-agent collaboration to sidestep the high training cost and inference latency that limit LLMs on long inputs. A leader agent coordinates multiple member agents, each of which reads a chunk of the document; the leader gathers their findings, resolves conflicts caused by member hallucinations, and derives the final answer. Experimental results show that LONGAGENT, built on LLaMA-7B, outperforms GPT-4 on tasks such as 128k-token text retrieval and multi-hop question answering, making it a promising alternative for long-text processing. The paper also introduces a new benchmark, *Needle in a Haystack PLUS*, for a more comprehensive evaluation of LLMs' long-text capabilities.
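The leader-member workflow described above can be sketched in a few lines of Python. This is a minimal simulation, not the paper's implementation: `answer_fn` stands in for a real LLM call (e.g., to LLaMA-7B), the chunk size is arbitrary, and conflict resolution is reduced to a majority vote over non-null answers, a simplification of the paper's inter-member communication step.

```python
from collections import Counter

CHUNK_SIZE = 20  # tokens per member; a toy value, the paper uses far larger chunks


def split_into_chunks(tokens, chunk_size=CHUNK_SIZE):
    """Leader splits the long input so each member agent sees one chunk."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def member_answer(chunk, question, answer_fn):
    """A member agent answers the question from its chunk alone.

    `answer_fn` is a stand-in for an LLM call and is an assumption of this sketch.
    It should return an answer string, or None if the chunk is uninformative.
    """
    return answer_fn(chunk, question)


def leader_resolve(answers):
    """Leader gathers member answers and resolves conflicts.

    Hallucinating members may return spurious answers; here conflicts are
    resolved by majority vote over non-null answers (a simplification).
    """
    informative = [a for a in answers if a is not None]
    if not informative:
        return None
    return Counter(informative).most_common(1)[0][0]


def longagent_query(tokens, question, answer_fn):
    """End-to-end pass: split, query each member, resolve at the leader."""
    chunks = split_into_chunks(tokens)
    answers = [member_answer(c, question, answer_fn) for c in chunks]
    return leader_resolve(answers)
```

A toy "needle in a haystack" run illustrates the flow: only the member holding the needle's chunk answers, and the leader adopts that answer.

```python
tokens = ["hay"] * 50
tokens[33] = "needle:42"

def toy_llm(chunk, question):
    # Hypothetical stand-in for a member's LLM call.
    for t in chunk:
        if t.startswith("needle:"):
            return t.split(":")[1]
    return None

print(longagent_query(tokens, "What is the needle?", toy_llm))  # → 42
```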