**Summary:**
This paper introduces Information-Intensive (IN2) training to address the "lost-in-the-middle" problem in long-context large language models (LLMs): the tendency of LLMs to overlook information located in the middle of a long context. IN2 training is a purely data-driven approach built on a synthesized long-context question-answer dataset in which answering requires (1) fine-grained awareness of information within short segments of a long context and (2) integration of information across multiple segments. Applying IN2 training to Mistral-7B yields FILM-7B, which retrieves information robustly from different positions across its 32K-token context window. FILM-7B also substantially improves performance on real-world long-context tasks such as NarrativeQA while remaining comparable on short-context tasks. The study further introduces VAL Probing, a suite of tasks that evaluates long-context information awareness across varied context styles and retrieval patterns; FILM-7B outperforms the compared models on these probes, indicating that IN2 training genuinely enhances how LLMs utilize long contexts. Finally, the paper examines training strategies, including the use of sliding windows and adjustments to the RoPE base theta, that further improve the effectiveness of IN2 training.
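The core idea of the data construction can be pictured with a short sketch. The Python snippet below is a minimal, hypothetical illustration, not the paper's actual pipeline: the function name `make_in2_example` and all parameters are assumptions, and the QA pair is presumed to have been generated beforehand (e.g., by a stronger LLM) about the target segment. It buries that segment at a random position among unrelated distractor segments, so the model cannot rely on key information sitting near the edges of the context.

```python
import random

def make_in2_example(target_segment, qa_pair, distractor_segments,
                     target_len_tokens=32000, seg_len_tokens=128):
    """Build one IN2-style training example (hypothetical sketch).

    The target segment (the one the QA pair is about) is placed at a
    uniformly random position among distractor segments, so answering
    requires awareness of information anywhere in the long context.
    Assumes len(distractor_segments) >= target_len_tokens // seg_len_tokens - 1.
    """
    n_segments = target_len_tokens // seg_len_tokens
    fillers = random.sample(distractor_segments, n_segments - 1)
    insert_at = random.randrange(n_segments)  # uniform over all slots
    segments = fillers[:insert_at] + [target_segment] + fillers[insert_at:]
    context = "\n".join(segments)
    prompt = f"{context}\n\nQuestion: {qa_pair['question']}"
    return {"prompt": prompt, "answer": qa_pair["answer"]}
```

Sampling the insertion position uniformly is the point of the exercise: during training, relevant information appears at every depth with equal probability, which is what counteracts the lost-in-the-middle bias.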
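The RoPE base theta adjustment mentioned among the training strategies can be illustrated numerically. The snippet below is a back-of-the-envelope sketch; the specific base values are illustrative, not necessarily those used in the paper. Raising the base lowers the per-dimension rotation frequencies, so positional phases remain distinguishable over many more tokens, which is why a larger base is commonly paired with longer context windows.

```python
import math

def rope_inv_freq(dim, base=10000.0):
    """Per-dimension inverse frequencies used by rotary position embeddings."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

# A larger base -> slower rotation per position -> the slowest dimension
# completes a full cycle only after far more tokens.
for base in (10_000.0, 1_000_000.0):
    lowest = rope_inv_freq(128, base)[-1]
    wavelength = 2 * math.pi / lowest  # tokens per full rotation
    print(f"base={base:>12,.0f}  longest wavelength ~ {wavelength:,.0f} tokens")
```

Running this shows the longest wavelength growing by roughly two orders of magnitude when the base increases from 1e4 to 1e6, a rough intuition for why theta adjustments accompany context-window extension in models like FILM-7B.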