Multilingual Instruction Tuning With Just a Pinch of Multilinguality

21 May 2024 | Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
This paper investigates how multilinguality during instruction tuning affects instruction-following across languages in large language models (LLMs). The study shows that even a small amount of multilingual data significantly improves multilingual instruction-following, in both seen and unseen languages. Models tuned on multilingual mixtures perform comparably to or better than monolingually tuned models in multiple languages, despite training on 10x fewer examples in those languages, and diversifying the instruction tuning set with as few as 2-4 languages significantly improves cross-lingual generalization. The results suggest that massively multilingual instruction-tuned models can be built with only a very small set of multilingual instruction-response pairs.

The experiments use an LLM pre-trained on hundreds of languages, tuned on high-quality, open-ended instructions and responses translated into 11 languages. Exploring how monolingual instruction tuning transfers across languages, the study finds that tuning on English, Italian, or Spanish yields the best average multilingual performance. It also shows that replacing just 40 English training examples with multilingual ones significantly improves instruction-following in those languages. Surprisingly, this small set of language-diverse examples also improves performance for languages that are only seen during pre-training and are not represented in the instruction tuning set at all.

The study further examines how increasing the number of languages in the tuning set affects generalization to new languages from the pre-training corpus, finding that tuning on a few languages yields better performance on languages unseen during tuning than monolingual tuning with the same number of examples. It also investigates factors that might influence the degree of cross-lingual transfer, such as language similarity and the amount of language-specific pre-training data, but finds no significant correlations.

Overall, the study concludes that cross-lingual transfer offers a promising avenue for building multilingual instruction-following LLMs. Even monolingual instruction tuning in a single language improves instruction-following capabilities in other languages, and incorporating a small set of a few dozen multilingual examples significantly enhances performance both for the languages the model is tuned on and for languages seen only during pre-training. Training on such multilingual mixtures achieves comparable or even superior performance to monolingual tuning for some languages, highlighting the value of even a small amount of language diversity in the instruction tuning set for cross-lingual generalization.
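As a concrete illustration of how such a language-diverse tuning set might be assembled, the sketch below swaps a small, fixed budget of English examples for translated ones split evenly across the other tuning languages. This is a minimal sketch; the function name and data structures are hypothetical and not taken from the paper's released code.

```python
import random


def build_multilingual_mixture(english_examples, translations, num_multilingual=40, seed=0):
    """Replace `num_multilingual` English examples with translated examples,
    split as evenly as possible across the languages in `translations`.

    english_examples: list of {"instruction": ..., "response": ...} dicts in English
    translations: dict mapping language code -> list of translated example dicts
    """
    rng = random.Random(seed)
    languages = sorted(translations)

    # Free up a fixed budget of slots so the total number of tuning examples stays constant.
    english_pool = list(english_examples)
    rng.shuffle(english_pool)
    mixture = english_pool[: len(english_pool) - num_multilingual]

    # Distribute the freed slots as evenly as possible across the non-English languages.
    base, remainder = divmod(num_multilingual, len(languages))
    for i, lang in enumerate(languages):
        k = base + (1 if i < remainder else 0)
        mixture.extend(rng.sample(translations[lang], k))

    rng.shuffle(mixture)
    return mixture
```

With, say, a 1,000-example English set and ten non-English languages, a budget of 40 multilingual examples works out to four examples per language, matching the "just a pinch" scale of diversity the study describes.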
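The correlation analysis between cross-lingual transfer and language-level factors could be run along the following lines. The per-language numbers below are placeholders for illustration only (the paper reports no significant correlations with its actual statistics), and the variable names are not from its codebase.

```python
from scipy.stats import pearsonr, spearmanr

# Placeholder per-language numbers for illustration only (not values reported in the paper).
transfer_score = {"fr": 0.62, "de": 0.58, "ru": 0.55, "ja": 0.47, "ar": 0.44, "hi": 0.41}
pretrain_fraction = {"fr": 0.031, "de": 0.028, "ru": 0.022, "ja": 0.015, "ar": 0.009, "hi": 0.006}

languages = sorted(transfer_score)
x = [pretrain_fraction[lang] for lang in languages]
y = [transfer_score[lang] for lang in languages]

# Correlate cross-lingual transfer with the amount of language-specific pre-training data.
r, p_pearson = pearsonr(x, y)
rho, p_spearman = spearmanr(x, y)
print(f"Pearson r = {r:.2f} (p = {p_pearson:.3f}), Spearman rho = {rho:.2f} (p = {p_spearman:.3f})")
```

The same test can be repeated with a language-similarity score in place of the pre-training-data fraction to probe the other factor the study considers.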