LoFiT: Localized Fine-Tuning on LLM Representations


3 Jun 2024 | Fangcong Yin, Xi Ye, Greg Durrett
This paper introduces LoFiT, a localized fine-tuning method for large language models (LLMs). LoFiT selects a subset of attention heads and learns task-specific offset vectors that are added to the hidden representations of the targeted heads.

LoFiT is compared against both representation intervention methods and other parameter-efficient fine-tuning (PEFT) methods. It achieves performance comparable to PEFT methods such as LoRA and RED while modifying 20x-200x fewer parameters. On tasks involving truthfulness and reasoning, it outperforms representation intervention methods such as Inference-Time Intervention (ITI) and Representation Engineering (RepE). The localization step is important: selecting a task-specific set of attention heads yields higher performance than intervening on heads selected for a different task. LoFiT also generalizes well out of domain, performing well on tasks not seen during training.

The main contributions of this work are (1) the introduction of LoFiT, a localized fine-tuning method that achieves competitive downstream performance on truthfulness and reasoning tasks by modifying the representations of a small number of attention heads, and (2) the demonstration that localization to particular sets of heads, across tasks and across models, can lead to better performance.
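The core mechanism described above, learning one offset vector per selected attention head and adding it to that head's hidden representation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `LoFiTOffsets` class, the `(layer, head)` key scheme, and the toy dimensions are all assumptions made for the example.

```python
import torch

class LoFiTOffsets(torch.nn.Module):
    """Minimal sketch: one learnable offset vector per selected attention head.

    Hypothetical class; the real LoFiT implementation may differ. Only the
    offset vectors are trained, so the parameter count is tiny:
    (number of selected heads) * head_dim.
    """

    def __init__(self, selected_heads, head_dim):
        super().__init__()
        # One offset vector per (layer, head) pair, initialized to zero so the
        # intervention starts as the identity and is learned during fine-tuning.
        self.offsets = torch.nn.ParameterDict({
            f"{layer}_{head}": torch.nn.Parameter(torch.zeros(head_dim))
            for layer, head in selected_heads
        })

    def apply(self, layer, head, head_output):
        # head_output: (batch, seq_len, head_dim) hidden representation of one head.
        key = f"{layer}_{head}"
        if key in self.offsets:
            # Broadcast the learned offset over batch and sequence positions.
            return head_output + self.offsets[key]
        return head_output  # untargeted heads pass through unchanged

# Toy usage: intervene on two heads of a model with head_dim=4.
offsets = LoFiTOffsets(selected_heads=[(0, 1), (2, 3)], head_dim=4)
h = torch.randn(2, 5, 4)
targeted = offsets.apply(0, 1, h)    # offset added (zero at init)
untargeted = offsets.apply(0, 0, h)  # left unchanged
```

In practice such offsets would be applied inside the model's forward pass (e.g. via forward hooks on the attention modules), and only the offset parameters would receive gradients, which is what makes the method modify 20x-200x fewer parameters than methods like LoRA.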