2 Jun 2021 | Tianyu Gao†*, Adam Fisch†*, Danqi Chen†
This paper introduces LM-BFF, a set of simple and effective techniques for fine-tuning language models using only a few examples. The goal is to improve few-shot learning performance by combining prompt-based fine-tuning with automatically generated prompts and by incorporating task demonstrations into the input context. The approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation, and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Evaluated on a range of NLP tasks, including classification and regression, these methods yield significant gains over standard fine-tuning, up to 30% absolute improvement and 11% on average across the tasks. Because the approach makes minimal assumptions about task resources and domain expertise, it serves as a strong task-agnostic method for few-shot learning.

The paper also discusses related work on language model prompting, automatic prompt search, and fine-tuning of language models. It presents a systematic evaluation of few-shot performance on 8 single-sentence and 7 sentence-pair NLP tasks, showing that prompt-based fine-tuning largely outperforms standard fine-tuning and that incorporating demonstrations further boosts few-shot performance. Together, the proposed methods contribute to a dramatic improvement across the tasks evaluated.
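To make the two ingredients concrete, here is a minimal sketch (in Python, assuming a RoBERTa-style masked language model loaded through Hugging Face `transformers`) of how an input can be wrapped in a cloze-style template, concatenated with sampled demonstrations, and classified by comparing masked-LM scores of label words. The template "It was [MASK]." and the label words "great"/"terrible" are illustrative hand-written choices rather than the automatically searched prompts the paper proposes, and `build_prompt` / `predict_label` are hypothetical helper names.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative setup: a RoBERTa masked LM and a hand-written sentiment template.
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
model.eval()

# Hypothetical label-word mapping (LM-BFF searches for these automatically).
LABEL_WORDS = {"positive": " great", "negative": " terrible"}

def build_prompt(x, demonstrations):
    """Wrap the input in a cloze template and append one demonstration per class."""
    parts = [f"{x} It was {tokenizer.mask_token} ."]
    for demo_text, demo_label in demonstrations:
        # Demonstrations are filled in with their label word instead of [MASK].
        parts.append(f"{demo_text} It was{LABEL_WORDS[demo_label]} .")
    return " ".join(parts)

def predict_label(x, demonstrations):
    """Score each label word at the [MASK] position and return the best label."""
    inputs = tokenizer(build_prompt(x, demonstrations),
                       return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    scores = {
        label: logits[tokenizer(word, add_special_tokens=False).input_ids[0]].item()
        for label, word in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)

# Example usage with two (hypothetical) demonstrations, one per class.
demos = [("A gripping, beautifully acted film.", "positive"),
         ("Dull and far too long.", "negative")]
print(predict_label("The plot never quite comes together.", demos))
```

In prompt-based fine-tuning, these same label-word logits at the [MASK] position would serve as the class logits, and the model's parameters would be updated on the few labeled examples rather than used zero-shot as above.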