Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization

3 Jul 2024 | Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister
This paper addresses the "lost-in-the-middle" problem in large language models (LLMs), where models struggle to locate relevant information in the middle of long input contexts. The authors identify that this issue stems from an intrinsic positional attention bias in LLMs, where tokens at the beginning and end of the input receive higher attention regardless of their relevance. To mitigate this bias, they propose a calibration mechanism called "found-in-the-middle," which adjusts the model's attention to reflect the true relevance of the input context rather than its position. The calibration method involves estimating and removing the positional bias from the model's attention scores, allowing the model to attend to relevant contexts more effectively. Experiments show that this approach significantly improves performance in locating relevant information within long contexts and enhances retrieval-augmented generation (RAG) performance across various tasks, outperforming existing methods by up to 15 percentage points. The study also demonstrates that the calibration mechanism can be applied to different LLMs with varying context window lengths, leading to improved performance on tasks such as open-domain question answering. The findings suggest that LLMs are capable of utilizing long contexts effectively when their positional attention bias is mitigated. The research opens new directions for understanding LLM attention biases and their impact on downstream tasks.
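To make the calibration idea concrete, below is a minimal sketch (not the authors' released code) of the general procedure the summary describes: estimate how much attention each context position attracts purely because of where it sits (e.g., by measuring attention to irrelevant placeholder documents at each position), then subtract that positional component from the attention a document actually receives, so the remaining score better reflects relevance. The function names, the averaging scheme, and the toy numbers are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def estimate_positional_bias(attention_to_dummy_docs: np.ndarray) -> np.ndarray:
    """Average attention attributable to position alone.

    attention_to_dummy_docs: (num_trials, num_positions) attention mass the model
    assigns to irrelevant placeholder documents placed at each position.
    """
    return attention_to_dummy_docs.mean(axis=0)


def calibrate_attention(observed_attention: np.ndarray,
                        positional_bias: np.ndarray) -> np.ndarray:
    """Remove the positional component from observed per-document attention.

    observed_attention: (num_positions,) attention each retrieved document
    received at its position in the prompt.
    """
    return observed_attention - positional_bias


if __name__ == "__main__":
    # Toy example: attention is inflated at the start and end of the context
    # ("lost in the middle"), even for irrelevant placeholders.
    dummy_runs = np.array([
        [0.30, 0.10, 0.08, 0.12, 0.28],
        [0.32, 0.09, 0.07, 0.11, 0.30],
    ])
    bias = estimate_positional_bias(dummy_runs)

    # Observed attention when the truly relevant document sits in the middle (index 2).
    observed = np.array([0.29, 0.11, 0.21, 0.12, 0.27])

    scores = calibrate_attention(observed, bias)
    print("calibrated scores:", scores.round(3))
    print("most relevant position:", int(scores.argmax()))  # -> 2, the middle document
```

In this toy run, the raw attention would rank the first and last documents highest, but after subtracting the position-only baseline the middle document scores highest, mirroring how calibrated attention can be used to surface or re-rank relevant contexts.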