InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory


28 May 2024 | Chaojun Xiao1*, Pengle Zhang1*, Xu Han1†, Guangxuan Xiao2, Yankai Lin3, Zhengyan Zhang1, Zhiyuan Liu1†, Maosong Sun1
The paper introduces InfLLM, a training-free method to enhance the context length generalizability of large language models (LLMs). InfLLM addresses the challenge of processing long sequences by incorporating an efficient context memory and a sliding window attention mechanism. The context memory stores distant contexts, allowing LLMs to efficiently process long sequences with limited context windows while capturing long-distance dependencies. The method does not require additional training and can achieve comparable performance to models trained on longer sequences. Experiments on benchmarks like ∞-Bench and LongBench demonstrate that InfLLM enables LLMs to handle sequences up to 1,024K tokens effectively, outperforming other methods that rely on continual training or retrieval-augmented generation. The paper also explores the impact of various parameters in the context memory and provides ablation studies to validate the effectiveness of the proposed approach.
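To make the mechanism concrete, below is a minimal conceptual sketch of sliding-window attention augmented with a block-level context memory: evicted key/value blocks are stored, a cheap representative vector is kept per block, and the most relevant blocks are retrieved and attended to alongside the local window. This is an illustrative assumption of how such a memory could be wired up, not the authors' implementation; all names (ContextMemory, block_size, top_k, the mean-key representative) are hypothetical, and the paper's actual block representatives and lookup are more sophisticated.

```python
# Conceptual sketch (not the InfLLM codebase): sliding-window attention
# plus a block-level context memory for distant key/value pairs.
import torch
import torch.nn.functional as F


class ContextMemory:
    """Stores evicted key/value blocks and retrieves the most relevant ones."""

    def __init__(self, block_size: int = 128, top_k: int = 4):
        self.block_size = block_size
        self.top_k = top_k
        self.key_blocks = []    # list of (block_size, d) tensors
        self.value_blocks = []
        self.reprs = []         # one representative vector per block

    def add_block(self, keys: torch.Tensor, values: torch.Tensor):
        # Assumption: use the mean key as a cheap block representative.
        self.key_blocks.append(keys)
        self.value_blocks.append(values)
        self.reprs.append(keys.mean(dim=0))

    def lookup(self, query: torch.Tensor):
        """Return concatenated keys/values of the top-k blocks for `query`."""
        if not self.reprs:
            empty = torch.empty(0, query.shape[-1])
            return empty, empty
        scores = torch.stack(self.reprs) @ query             # (num_blocks,)
        k = min(self.top_k, len(self.reprs))
        idx = scores.topk(k).indices.tolist()
        keys = torch.cat([self.key_blocks[i] for i in idx], dim=0)
        values = torch.cat([self.value_blocks[i] for i in idx], dim=0)
        return keys, values


def memory_augmented_attention(q, local_k, local_v, memory: ContextMemory):
    """Attend over retrieved memory blocks plus the local sliding window."""
    mem_k, mem_v = memory.lookup(q)
    k = torch.cat([mem_k, local_k], dim=0)
    v = torch.cat([mem_v, local_v], dim=0)
    attn = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
    return attn @ v


# Toy usage: one query vector, a small local window, and two stored blocks.
d = 64
memory = ContextMemory(block_size=8, top_k=1)
memory.add_block(torch.randn(8, d), torch.randn(8, d))
memory.add_block(torch.randn(8, d), torch.randn(8, d))
out = memory_augmented_attention(
    q=torch.randn(d),
    local_k=torch.randn(16, d),
    local_v=torch.randn(16, d),
    memory=memory,
)
print(out.shape)  # torch.Size([64])
```

The key property the sketch captures is that per-step attention cost depends only on the local window size plus a fixed number of retrieved blocks, not on the full sequence length, which is what allows extrapolation far beyond the training context window.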