This technical report introduces the Qwen2 series, the latest addition to Alibaba's large language models and large multimodal models. The report presents a comprehensive suite of foundational and instruction-tuned language models, ranging from 0.5 to 72 billion parameters, including dense models and a Mixture-of-Experts (MoE) model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across various benchmarks in language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.
The flagship model, Qwen2-72B, demonstrates strong performance on multiple benchmarks, including 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, achieves 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Qwen2 also shows robust multilingual capabilities, supporting approximately 30 languages.
To foster community innovation and accessibility, the Qwen2 model weights are publicly available on Hugging Face and ModelScope, along with supplementary materials on GitHub. These platforms provide resources for quantization, fine-tuning, and deployment, enabling a wide range of applications and research.
The Qwen2 series includes models of different sizes, with the largest being Qwen2-72B and the smallest being Qwen2-0.5B. The models are pre-trained on a large-scale dataset of over 7 trillion tokens, covering a wide range of domains and languages. Post-training involves supervised fine-tuning (SFT) and direct preference optimization (DPO), which align the models with human preferences by learning from human feedback.
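To make the DPO step concrete, here is a minimal sketch of the standard DPO objective (as introduced by Rafailov et al.) for a single preference pair. This is an illustrative toy implementation, not the report's actual training code; the function name, the toy log-probabilities, and the choice of beta are assumptions for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of the chosen or
    rejected response under the trainable policy or the frozen
    reference model. beta scales how strongly deviations from the
    reference are penalized.
    """
    # Implicit rewards: how much the policy has moved away from the
    # reference on each response.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): shrinks as the policy learns to prefer
    # the chosen response over the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy numbers: the policy already slightly favors the chosen response.
loss = dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1)
```

With a zero margin the loss is log 2; it decreases monotonically as the policy's preference for the chosen response grows, which is what drives the alignment update.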
Qwen2 has been evaluated on various benchmarks, demonstrating strong performance in language understanding, coding, mathematics, reasoning, and multilingual capabilities. The instruction-tuned variant, Qwen2-72B-Instruct, outperforms comparable open-weight models on multiple benchmarks, including MT-Bench, Arena-Hard, and LiveCodeBench. Qwen2 also performs strongly in multilingual evaluations and safety assessments, surpassing proprietary models in some cases.
The report concludes that Qwen2 is a versatile and powerful series of language models, with strong performance across a wide range of tasks and benchmarks. The models are designed to be accessible and adaptable for a wide range of applications and research.