The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

5 Jun 2024 | Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe
The paper "The Unreasonable Effectiveness of Easy Training Data for Hard Tasks" explores the phenomenon where pre-trained language models (LMs) can perform well on hard test data even when trained primarily on easy training data. The authors address the scalable oversight problem, which arises from the difficulty of labeling data correctly in specialized domains. They find that LMs can generalize surprisingly well from easy-to-hard data, often achieving performance comparable to or better than models fine-tuned on hard data alone. This is demonstrated using various fine-tuning methods such as in-context learning, linear classifier heads, and QLoRA, across multiple datasets and hardness measures. The study also shows that collecting easy data can be more cost-effective and less noisy than collecting hard data, and that easy-to-hard generalization is robust across different model scales and hardness gaps. The findings suggest that the scalable oversight problem may be more manageable than previously thought, and that LMs can be effectively trained on easier data to handle harder tasks.The paper "The Unreasonable Effectiveness of Easy Training Data for Hard Tasks" explores the phenomenon where pre-trained language models (LMs) can perform well on hard test data even when trained primarily on easy training data. The authors address the scalable oversight problem, which arises from the difficulty of labeling data correctly in specialized domains. They find that LMs can generalize surprisingly well from easy-to-hard data, often achieving performance comparable to or better than models fine-tuned on hard data alone. This is demonstrated using various fine-tuning methods such as in-context learning, linear classifier heads, and QLoRA, across multiple datasets and hardness measures. The study also shows that collecting easy data can be more cost-effective and less noisy than collecting hard data, and that easy-to-hard generalization is robust across different model scales and hardness gaps. The findings suggest that the scalable oversight problem may be more manageable than previously thought, and that LMs can be effectively trained on easier data to handle harder tasks.