The paper introduces xCOT (Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning), a framework designed to bridge the gap between high-resource and low-resource languages in large language models (LLMs). xCOT addresses poor generalization to low-resource languages by transferring knowledge from high-resource languages. The key contributions of xCOT include:
1. **xCOT-INSTRUCT Dataset**: A multilingual instruction dataset is created by translating English instructions into other languages, encouraging semantic alignment across multiple languages.
2. **Cross-lingual In-context Few-shot Learning (xICL)**: This technique strengthens multilingual agreement during instruction tuning by mixing source- and target-language tokens within the few-shot exemplars.
3. **Random Online CoT Strategy**: During training, the model is prompted to translate the query into a randomly sampled language and then answer in English, enhancing its multilingual reasoning ability (see the prompt-construction sketch after this list).
4. **Cross-lingual Distillation**: Chain-of-thought outputs in the high-resource language (English) are used to supervise training on low-resource languages, facilitating cross-lingual transfer (a sketch of one such objective also appears after this list).
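To make these techniques concrete, here is a minimal Python sketch of how the mixed-language xICL exemplars and the random online CoT prompt might be assembled. The language list, prompt templates, and function names are illustrative assumptions for this summary, not the paper's exact prompts or data pipeline.

```python
import random

# Illustrative sketch only: the language pool and prompt wording are
# assumptions, not the templates used in the paper.
HIGH_RESOURCE = "English"
TARGET_LANGS = ["Swahili", "Thai", "Bengali", "Chinese", "German"]


def build_xicl_exemplar(question_en: str, question_tgt: str, cot_en: str) -> str:
    """Mix high-resource and target-language text inside one few-shot
    exemplar so the model sees aligned tokens from both languages (xICL)."""
    return (
        f"Question ({HIGH_RESOURCE}): {question_en}\n"
        f"Question (target language): {question_tgt}\n"
        f"Answer (step by step, in {HIGH_RESOURCE}): {cot_en}\n\n"
    )


def build_online_cot_prompt(query: str, exemplars: list) -> str:
    """Random online CoT: sample a pivot language at training time and ask
    the model to translate the query into it before answering in English."""
    pivot = random.choice(TARGET_LANGS)
    instruction = (
        f"First translate the question into {pivot}, "
        f"then reason step by step and give the final answer in {HIGH_RESOURCE}."
    )
    return "".join(exemplars) + instruction + f"\nQuestion: {query}\nAnswer:"
```

Under this reading, the exemplars carry the mixed-language context (xICL) while the randomly sampled pivot language realizes the online CoT instruction; the exact wording would follow the paper's own templates.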
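As a rough formalization of the cross-lingual distillation step, one plausible objective (an assumption of this summary, not the paper's stated loss) is token-level cross-entropy in which a low-resource question $x^{(l)}$ is paired with an English chain-of-thought $y^{(en)}$ as the supervision target:

$$
\mathcal{L}_{\text{distill}} = -\,\mathbb{E}_{(x^{(l)},\, y^{(en)})} \sum_{t=1}^{|y^{(en)}|} \log p_\theta\!\left(y^{(en)}_t \,\middle|\, y^{(en)}_{<t},\, x^{(l)}\right)
$$

so the same English reasoning chain can supervise every language variant of a question.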
Experimental results on benchmarks such as MGSM and MSVAMP demonstrate that xCOT substantially narrows the performance gap between languages, achieving state-of-the-art results across all evaluated languages. Its effectiveness is further validated through ablation studies and analyses of cross-lingual prompting, multilingual representations, and low-resource settings. Overall, xCOT shows promise as a robust approach to reducing the cross-lingual divide in multilingual language models.