ChatGLM is a family of large language models developed by Zhipu AI and Tsinghua University, including GLM-4, GLM-4-Air, and GLM-4-9B. The models are trained on ten trillion tokens, primarily in Chinese and English, and are aligned for both languages through a multi-stage post-training process involving supervised fine-tuning and learning from human feedback. GLM-4 outperforms GPT-4 on general benchmarks such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, and matches GPT-4 Turbo on instruction following and long-context tasks. The GLM-4 All Tools model is further aligned to understand user intent and to decide autonomously when, and which, tools to use, such as web browsers, Python interpreters, and text-to-image models; it matches or surpasses GPT-4 All Tools on tasks like accessing online information and solving math problems. The open-sourced models attracted over 10 million downloads on Hugging Face in 2023. The GLM family spans language, code, vision, and agent models, with the latest versions showing strong performance on Chinese-language tasks and standing alongside state-of-the-art models such as GPT-4 Turbo and Claude 3 Opus. Across evaluations, the models perform strongly on academic benchmarks, instruction following, alignment, long-context handling, coding, function calling, and agent tasks. They are designed to be safe and responsible, with careful data cleaning and risk-mitigation strategies. The team continues to develop more capable models, aiming to democratize cutting-edge LLM technology through open sourcing.
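The All Tools behavior described above, understanding a request and autonomously routing it to a tool, can be sketched as a simple dispatch loop. Everything below (the tool registry, the mocked model, the JSON call format) is a hypothetical illustration under stated assumptions, not GLM-4's actual interface; in a real deployment the model itself would emit the tool choice and arguments.

```python
# Hypothetical sketch of an "All Tools"-style dispatch loop.
# The model is mocked; a real GLM-4 endpoint would produce the tool call.

import json

# Hypothetical tool registry: tool name -> callable.
TOOLS = {
    "python_interpreter": lambda args: str(eval(args["expression"])),  # toy math solver
    "web_browser": lambda args: f"[results for: {args['query']}]",     # stubbed search
}

def mock_model(user_request: str) -> str:
    """Stand-in for the model: returns a JSON-encoded tool call."""
    if any(ch.isdigit() for ch in user_request):
        # Arithmetic-looking requests go to the interpreter tool.
        return json.dumps({"tool": "python_interpreter",
                           "args": {"expression": user_request}})
    # Everything else is routed to the (stubbed) web browser.
    return json.dumps({"tool": "web_browser", "args": {"query": user_request}})

def run(user_request: str) -> str:
    """One round of intent understanding -> tool selection -> tool execution."""
    call = json.loads(mock_model(user_request))
    tool = TOOLS[call["tool"]]
    return tool(call["args"])

print(run("2 + 3 * 4"))              # dispatched to the Python interpreter tool
print(run("who created ChatGLM?"))   # dispatched to the web browser tool
```

The key design point this sketch captures is that tool choice is the model's decision, expressed as structured output, while execution stays on the framework side.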