MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION

17 Mar 2022 | Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma, Eliza Szczechla, Gunjan Chhablani, Nihal V. Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Matteo Manica, Sheng Shen, Zheng-Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Tali Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush
The paper explores the effectiveness of explicit multitask learning in enabling zero-shot generalization in large language models. It introduces a system that converts natural language tasks into prompted forms, allowing the authors to benchmark the model's ability to perform tasks it has never been explicitly trained on. They fine-tune a pre-trained encoder-decoder model on a multitask mixture of datasets, each with multiple prompts, and evaluate it on held-out tasks. The model achieves strong zero-shot performance on several standard datasets, often outperforming models up to 16 times its size. It also performs well on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6 times its size. The paper further investigates how prompt diversity and the number of prompts per dataset affect the model's robustness to prompt wording. The results demonstrate that multitask prompted training can significantly improve zero-shot generalization, providing an effective alternative to unsupervised pretraining alone.
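To make the central idea concrete, the sketch below illustrates how a supervised example can be recast into a prompted text-to-text pair for multitask fine-tuning. It is a minimal illustration, not the authors' actual tooling: the template strings, field names, and the `apply_prompt` helper are hypothetical, and a single example would typically be rendered with several different templates to increase prompt diversity.

```python
# Hypothetical prompt templates for a natural language inference example.
# Each template pairs an input format string with a function producing the target text.
PROMPT_TEMPLATES = [
    ("Premise: {premise}\nHypothesis: {hypothesis}\nDoes the premise entail the hypothesis?",
     lambda ex: "yes" if ex["label"] == 0 else "no"),
    ("Suppose \"{premise}\" Can we infer that \"{hypothesis}\"?",
     lambda ex: "yes" if ex["label"] == 0 else "no"),
]

def apply_prompt(example, template):
    """Render one dataset example into an (input_text, target_text) pair."""
    input_template, target_fn = template
    return input_template.format(**example), target_fn(example)

# A single labeled example, rendered under every template.
example = {"premise": "A dog is running in the park.",
           "hypothesis": "An animal is outdoors.",
           "label": 0}

for template in PROMPT_TEMPLATES:
    input_text, target_text = apply_prompt(example, template)
    print(input_text, "->", target_text)
```

Rendering each training example under many such templates, across many datasets, yields the multitask prompted mixture on which the encoder-decoder model is fine-tuned; held-out tasks are then evaluated zero-shot using prompts the model has never seen.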