This paper explores the limitations of Instruction Tuning (IT), a method used to transform large language models (LLMs) into open-domain conversational agents. While IT has achieved significant success, its shortcomings are underexplored. The study reveals several limitations:
1. **No Knowledge Enhancement**: IT does not add knowledge or skills to LLMs. LoRA fine-tuning learns only response-initiation and style tokens, while full-parameter fine-tuning leads to knowledge degradation (see the sketch after this list).
2. **Pattern Copying**: Copying response patterns from IT datasets derived from knowledgeable sources often leads to a decline in response quality.
3. **Hallucination**: Full-parameter fine-tuning increases hallucination by inaccurately borrowing tokens from conceptually similar instances in the IT dataset.
4. **Improvement Methods**: Popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model.
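To make the contrast concrete, here is a minimal sketch of the two fine-tuning regimes the paper compares, using Hugging Face `transformers` and `peft`. The model name and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch (not the paper's exact setup): LoRA vs. full-parameter
# fine-tuning. Model name and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA: freeze the base weights and train small low-rank adapters only.
# Per the paper's finding, this mostly shifts response initiation and style.
lora_cfg = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling applied to the update
    target_modules=["q_proj", "v_proj"],   # attention projections (a typical choice)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
lora_model = get_peft_model(base, lora_cfg)
lora_model.print_trainable_parameters()    # only a tiny fraction of weights train

# Full-parameter fine-tuning: every weight receives gradients, which the
# paper associates with knowledge degradation and increased hallucination.
full_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
for p in full_model.parameters():
    p.requires_grad = True
```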
The findings suggest that responses generated solely from pre-trained knowledge consistently outperform those learned from IT on open-source datasets. The paper aims to inspire future research in addressing these challenges.