MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION

17 Mar 2022 | Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma, Eliza Szczechla, Gunjan Chhablani, Nihal V. Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Matteo Manica, Sheng Shen, Zheng-Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jos Rozen, Tali Bers, Stella Biderman, Leo Gao, Ryan Teehan, Alexander M. Rush
This paper explores whether explicit multitask learning can improve zero-shot generalization in large language models. The authors propose a system for converting a wide range of natural language tasks into a human-readable prompted format, enabling benchmarking of a model's ability to perform completely held-out tasks. They fine-tune a pretrained encoder-decoder model on a multitask mixture of NLP datasets and achieve strong zero-shot performance on several standard datasets, often outperforming models up to 16× its size. The model also performs well on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6× its size. The authors find that multitask prompted training improves generalization to held-out tasks, and that training on more prompts per dataset and on prompts from a wider range of datasets leads to better and more robust generalization, including robustness to prompt wording. These results suggest that multitask prompted training can endow language models with strong zero-shot generalization abilities, providing an effective complement to unsupervised language model pretraining. The authors release all models trained in this paper along with the collection of prompts they created and their prompt annotation tool.
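To make the core idea concrete, the sketch below shows one way a structured dataset example could be mapped into the kind of human-readable prompted (input, target) text pair the paper describes. This is an illustrative sketch only, not the authors' promptsource toolkit or its API; the template wording and the NLI example fields are hypothetical.

```python
# Minimal sketch: convert a structured NLP example into a natural-language
# (input, target) pair so that many datasets share one text-to-text format.

def apply_prompt(template_input: str, template_target: str, example: dict) -> tuple:
    """Fill human-readable templates with the fields of one dataset example."""
    return template_input.format(**example), template_target.format(**example)

# Hypothetical NLI example and template, for illustration only.
nli_example = {
    "premise": "A man is playing a guitar on stage.",
    "hypothesis": "A musician is performing.",
    "label_text": "yes",
}

prompted_input, prompted_target = apply_prompt(
    'Suppose {premise} Can we infer that "{hypothesis}"? Yes, no, or maybe?',
    "{label_text}",
    nli_example,
)
# prompted_input  -> 'Suppose A man is playing a guitar on stage. Can we infer
#                     that "A musician is performing."? Yes, no, or maybe?'
# prompted_target -> 'yes'
```

Fine-tuning then treats every dataset uniformly as such text-to-text pairs, with multiple templates per dataset; held-out tasks are evaluated by applying their own templates at inference time without any gradient updates.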