xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning


13 Jan 2024 | Linzheng Chai, Jian Yang, Tao Sun, Hongcheng Guo, Jiaheng Liu, Bing Wang, Xinnian Liang, Jiaqi Bai, Tongliang Li, Qiyao Peng, Zhoujun Li
This paper proposes xCoT, a cross-lingual instruction fine-tuning framework that bridges the gap between high-resource and low-resource languages. The framework constructs multilingual instruction data (xCoT-INSTRUCT) to encourage semantic alignment across languages.

xCoT introduces cross-lingual in-context few-shot learning (xICL), in which fragments of the source-language demonstrations are randomly substituted with their translations in target languages, to accelerate multilingual agreement during instruction tuning.

During multilingual instruction tuning, a random online CoT strategy (Random-CoT) strengthens the multilingual reasoning ability of the large language model by prompting it to first translate the query into another language and then answer in English.

To further facilitate language transfer, high-resource chain-of-thought supervision guides the training of low-resource languages through cross-lingual distillation, aligning the representations of different languages with a Kullback–Leibler divergence objective.

xCoT is evaluated on the multilingual benchmarks MGSM (11 languages) and MSVAMP (10 languages). It consistently achieves state-of-the-art performance across all languages, surpassing strong baselines by an average margin of 15%, which highlights its effectiveness in narrowing the cross-lingual gap and its potential as a robust approach to cross-lingual reasoning.

The main contributions are: constructing multilingual instruction data that transfers knowledge from high-resource to low-resource languages, proposing the random online CoT (Random-CoT) strategy, and aligning the representations of different languages through cross-lingual distillation with Kullback–Leibler divergence.
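To make the xICL idea concrete, here is a minimal Python sketch of the code-switching step described above: fragments of a source-language demonstration are randomly replaced with their target-language translations before being used as an in-context example. The helper name translate_fragment, the swap ratio, and the toy dictionary are illustrative assumptions, not the paper's exact procedure.

```python
import random

def code_switch_demonstration(tokens, translate_fragment, swap_ratio=0.3, seed=None):
    """Randomly replace fragments of a source-language demonstration with
    target-language translations, yielding a code-switched in-context example.

    `translate_fragment` is a hypothetical callable mapping a source-language
    span to its translation in a chosen target language."""
    rng = random.Random(seed)
    switched = []
    for token in tokens:
        # With probability `swap_ratio`, substitute this fragment with its
        # counterpart translation; otherwise keep the source-language text.
        if rng.random() < swap_ratio:
            switched.append(translate_fragment(token))
        else:
            switched.append(token)
    return " ".join(switched)

# Example usage with a toy dictionary standing in for a real translator.
toy_dict = {"Janet": "Janet", "has": "tiene", "3": "3", "apples": "manzanas"}
demo = code_switch_demonstration(
    ["Janet", "has", "3", "apples"],
    translate_fragment=lambda w: toy_dict.get(w, w),
    swap_ratio=0.5,
    seed=0,
)
print(demo)  # "Janet has 3 manzanas" — some fragments swapped at random
```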
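The Random-CoT strategy can be illustrated with a simple prompt-construction sketch: the question is routed through a randomly chosen pivot language and the model is asked to reason and answer in English. The template wording and language list below are assumptions for illustration only, not the paper's actual prompts.

```python
import random

# Hypothetical prompt template illustrating the random online CoT (Random-CoT)
# idea: ask the model to first translate the question into a randomly chosen
# language, then solve it step by step and answer in English.
LANGUAGES = ["Chinese", "French", "German", "Swahili", "Thai", "Japanese"]

def build_random_cot_prompt(query, rng=random):
    pivot = rng.choice(LANGUAGES)  # pivot language chosen at random per example
    return (
        f"Question: {query}\n"
        f"First, translate the question into {pivot}.\n"
        "Then solve it step by step and give the final answer in English.\n"
    )

print(build_random_cot_prompt("Lin has 12 pencils and gives away 5. How many are left?"))
```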
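For the cross-lingual distillation term, the summary mentions aligning representations across languages with a Kullback–Leibler divergence. The PyTorch sketch below shows one common way such a loss can be written, with the high-resource distribution acting as the teacher; the tensor shapes, temperature, and detaching of the teacher side are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_lingual_kl_loss(low_resource_logits, high_resource_logits, temperature=1.0):
    """KL term encouraging the model's distribution on a low-resource query to
    match its distribution on the high-resource (e.g. English) counterpart.
    Both logit tensors are assumed to have shape (batch, seq_len, vocab)."""
    student = F.log_softmax(low_resource_logits / temperature, dim=-1)
    # The high-resource side acts as the teacher and is not back-propagated through.
    teacher = F.softmax(high_resource_logits.detach() / temperature, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * temperature ** 2

# Toy check: identical logits give (near-)zero divergence.
logits = torch.randn(2, 4, 10)
print(cross_lingual_kl_loss(logits, logits.clone()))
```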