Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

2 Jun 2024 | Omar Shaikh*, Michelle Lam*, Joey Hejna*, Yijia Shao, Michael Bernstein, Diyi Yang
DITTO is a method for aligning large language models (LLMs) with specific user behaviors using only a small number of demonstrations. Unlike traditional alignment methods that require large preference datasets, DITTO leverages a handful of user-provided examples to generate online comparison data, enabling effective customization of LLMs. Derived from online imitation learning, the method treats user demonstrations as preferred over the model's own outputs and uses the resulting comparisons to update the model.

In both benchmark tests and user studies, DITTO outperforms other alignment approaches such as few-shot prompting, supervised fine-tuning, and self-play methods, achieving higher win rates across tasks including email writing, essays, and articles. DITTO is sample-efficient, requiring only a few demonstrations to achieve strong alignment, and proves effective in real-world settings where users supply demonstrations for tasks like email writing. Because it requires far fewer samples than traditional methods, DITTO is especially useful when data is limited, making it a promising approach for efficient, personalized language model alignment.
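The core idea described above can be sketched in a few lines: every user demonstration is paired against the model's own samples as the preferred response, and those pairs feed a DPO-style pairwise loss. This is an illustrative sketch, not the paper's implementation; the function names and the `beta` parameter are assumptions.

```python
import math

def build_comparisons(demos, model_samples):
    # Treat each user demonstration as preferred over every model sample,
    # yielding (preferred, dispreferred) pairs for a pairwise objective.
    return [(d, s) for d in demos for s in model_samples]

def dpo_loss(logp_w_policy, logp_l_policy, logp_w_ref, logp_l_ref, beta=0.1):
    # DPO-style loss on one (preferred, dispreferred) pair:
    #   -log sigmoid(beta * ((logp_w_pi - logp_w_ref) - (logp_l_pi - logp_l_ref)))
    # where logp_* are sequence log-probabilities under the policy and a
    # frozen reference model.
    margin = beta * ((logp_w_policy - logp_w_ref) - (logp_l_policy - logp_l_ref))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Example: 2 demonstrations against 3 model samples gives 6 comparison pairs.
pairs = build_comparisons(["demo_a", "demo_b"], ["sample_1", "sample_2", "sample_3"])
```

In practice the model is re-sampled during training, so the comparison set grows online rather than being fixed up front; that online regeneration of dispreferred examples is what distinguishes this setup from fitting a static preference dataset.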