17 Apr 2024 | Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela
Generative Representational Instruction Tuning (GRIT) is a method that unifies text embedding and generation tasks into a single large language model (LLM). GRIT trains a large language model to handle both tasks by distinguishing them through instructions. The resulting GRITLM 7B model sets a new state of the art on the Massive Text Embedding Benchmark (MTEB) and outperforms all models up to its size on generative tasks. GRITLM 8x7B further outperforms all open generative language models tried while still being among the best embedding models. Training with GRIT matches training on only embedding or only generative data, so both tasks can be unified without performance loss. This unification speeds up Retrieval-Augmented Generation (RAG) by over 60% for long documents by eliminating the need for separate retrieval and generation models. GRITLM is available at https://github.com/ContextualAI/gritlm.
GRIT unifies representational instruction tuning and generative instruction tuning into a single model. It finetunes a pretrained large language model with embedding and generative instruction data in a consistent format. For embedding data, it uses a contrastive objective with in-batch negatives. For generative data, it uses a language modeling objective. The model learns to differentiate between the two streams via instructions and separate loss functions. GRITLM 7B sets a new state of the art on MTEB among open models and outperforms much larger models on generative tasks. GRITLM 8x7B is the best open generative language model on the evaluated task average while using only about 13B active parameters at inference.
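A minimal sketch of how the two objectives could be combined in one training step, written in PyTorch. The temperature, the loss weights, and the assumption that embeddings arrive already pooled (e.g. mean-pooled final hidden states from the bidirectional pass) are illustrative choices, not the exact recipe from the paper.

```python
import torch
import torch.nn.functional as F

def grit_step(q_emb, d_emb, gen_logits, gen_labels, tau=0.05,
              lambda_rep=1.0, lambda_gen=1.0):
    """One illustrative training step combining both GRIT objectives.

    q_emb:      (B, H) pooled query embeddings from the bidirectional pass
    d_emb:      (B, H) pooled embeddings of each query's positive document
    gen_logits: (B, T, V) logits from the causal pass over a generative batch
    gen_labels: (B, T) target token ids, with -100 marking prompt/padding tokens
    """
    # Representation loss: InfoNCE with in-batch negatives over cosine similarities.
    q = F.normalize(q_emb, dim=-1)
    d = F.normalize(d_emb, dim=-1)
    sim = q @ d.T / tau                        # (B, B) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)
    rep_loss = F.cross_entropy(sim, targets)   # query i should match document i

    # Generative loss: standard next-token prediction (shift logits vs. labels).
    gen_loss = F.cross_entropy(
        gen_logits[:, :-1].reshape(-1, gen_logits.size(-1)),
        gen_labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )
    return lambda_rep * rep_loss + lambda_gen * gen_loss
```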
GRIT leads to three advantages: performance, efficiency, and simplicity. Performance-wise, the unified model matches the performance of embedding-only and generative-only variants, even outperforming them on some tasks. Efficiency-wise, GRITLM allows for faster RAG by halving the number of forward passes. Simplicity-wise, a single model handles both use cases, significantly simplifying infrastructure needs.
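The efficiency gain comes from not re-processing long documents at generation time. The sketch below illustrates the general idea with Hugging Face transformers: forward the retrieved document once, keep its key-value cache, and decode the answer on top of it. This is a generic prefix-caching illustration, assuming a recent transformers version and that the released checkpoint loads as a standard causal LM; GRIT's query/doc caching variants reuse states from the embedding pass and differ in detail.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("GritLM/GritLM-7B")
model = AutoModelForCausalLM.from_pretrained("GritLM/GritLM-7B", torch_dtype="auto")
model.eval()

document = "<long retrieved document goes here>"
question = "\n\nQuestion: What is GRIT?\nAnswer:"

with torch.no_grad():
    # 1) Forward the long document once and keep its key-value cache.
    doc_ids = tok(document, return_tensors="pt").input_ids
    out = model(doc_ids, use_cache=True)
    past = out.past_key_values

    # 2) Decode the answer on top of the cached document states (greedy, 64 tokens).
    ids = tok(question, return_tensors="pt", add_special_tokens=False).input_ids
    new_tokens = []
    for _ in range(64):
        out = model(ids, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1:].argmax(dim=-1)
        new_tokens.append(next_id)
        ids = next_id                      # only the newly generated token is fed next

print(tok.decode(torch.cat(new_tokens, dim=-1)[0]))
```

With a separate retriever and generator, the document would be forwarded through two different models; here it passes through one model once, which is where the reported speedup for long documents comes from.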
GRIT is trained with two objective functions, which requires more compute. However, finetuning is cheap compared to pretraining, so the benefits outweigh the added cost. GRITLM is available at https://github.com/ContextualAI/gritlm.
GRITLM is the first model to perform best-in-class at both text representation and generation tasks simultaneously. For embedding tasks it uses bidirectional attention over the input and pools the final hidden states into a single vector; for generative tasks it uses causal attention and a language modeling head to predict the next tokens. The input format supports conversations with multiple turns.
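How a single model separates the two modes is visible in the input format alone. The helpers below are a hypothetical sketch based on the format described in the paper: embedding inputs end with an <|embed|> marker, after which the model applies bidirectional attention and pools the final hidden states, while generative inputs use the <|user|>/<|assistant|> turn structure and are continued with the language modeling head. The exact whitespace and token placement are approximations; in practice the wrapper in the linked repository handles formatting.

```python
def format_for_embedding(text: str, instruction: str = "") -> str:
    # Tokens after <|embed|> are encoded with bidirectional attention and pooled into one vector.
    prefix = f"<|user|>\n{instruction}\n<|embed|>\n" if instruction else "<|embed|>\n"
    return prefix + text

def format_for_generation(turns: list[str]) -> str:
    # Alternating user/assistant turns; the model continues after the final <|assistant|>
    # marker using causal attention and the language modeling head.
    out = []
    for i, turn in enumerate(turns):
        role = "<|user|>" if i % 2 == 0 else "<|assistant|>"
        out.append(f"{role}\n{turn}\n")
    out.append("<|assistant|>\n")
    return "".join(out)

# Usage: the same model, switched between modes only by the input format.
emb_input = format_for_embedding("Generative Representational Instruction Tuning",
                                 instruction="Retrieve relevant machine learning papers")
gen_input = format_for_generation(["Summarize GRIT in one sentence."])
```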
GRITLM is the only model that can handle both embedding and generation at best-in-class performance. It outperforms other models on both tasks. GRITLM 7B is significantly more costly to run than many other embedding models, and its representations have 4096 dimensions, requiring 4× more storage than the 1024-dimensional embeddings common among smaller embedding models.
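The storage figure follows directly from the embedding width; assuming 4-byte float32 values for illustration:

```python
bytes_per_value = 4                    # float32; float16 halves both numbers
print(4096 * bytes_per_value)          # 16384 bytes (16 KiB) per GRITLM 7B embedding
print(1024 * bytes_per_value)          # 4096 bytes (4 KiB) per 1024-dimensional embedding
```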