12 Mar 2024 | Dan Zhang, Ziniu Hu, Sining Zhoubian, Zhengxiao Du, Kaiyu Yang, Zihan Wang, Yisong Yue, Yuxiao Dong, Jie Tang
SciGLM is a scientific language model designed to enhance scientific reasoning through self-reflective instruction annotation and tuning. It is trained on SciInstruct, a comprehensive instruction dataset spanning physics, chemistry, math, and formal proofs. The dataset was built with a self-reflective annotation framework that generates step-by-step reasoning for unlabelled scientific questions and then critiques and revises its own outputs. Fine-tuning on this data improves the model's ability to solve complex scientific problems: SciGLM improves the base model (ChatGLM3-6B-Base) by 4.87% and the 32B-scale model by 2.67%, while retaining strong general language understanding, making it suitable for a variety of scientific discovery tasks. SciInstruct, SciGLM, the self-reflective annotation framework, and the fine-tuning code are released for the research community.

The study highlights the importance of diverse training data and self-annotation for strengthening general reasoning capabilities, even in challenging domains like science. Its analysis of data scaling and quality shows that high-quality, diverse reasoning data leads to better results. Evaluations on a range of scientific and mathematical benchmarks demonstrate the model's effectiveness on complex problems, and the work contributes a new approach to instruction-dataset construction and training for scientific language models.
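To make the self-reflective annotation idea concrete, here is a minimal Python sketch of a generate-critique-revise loop in the spirit of SciInstruct. The helper names (`call_llm`, `annotate`), the prompts, and the fixed revision budget are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of self-reflective annotation: generate step-by-step
# reasoning, critique it, and revise until the critique passes.
# The function names and prompts below are illustrative, not from the paper.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat/completion API (e.g., a locally served LLM)."""
    raise NotImplementedError


def annotate(question: str, max_revisions: int = 2) -> str:
    # Step 1: generate step-by-step reasoning for the unlabelled question.
    answer = call_llm(
        f"Solve the following scientific problem step by step.\n\nQuestion: {question}"
    )
    for _ in range(max_revisions):
        # Step 2: self-critique — ask the model to check its own reasoning.
        critique = call_llm(
            "Review the solution below for errors in reasoning or calculation. "
            "Reply 'OK' if it is correct, otherwise list the problems.\n\n"
            f"Question: {question}\n\nSolution: {answer}"
        )
        if critique.strip().upper().startswith("OK"):
            break
        # Step 3: revise the answer using the critique as feedback.
        answer = call_llm(
            "Revise the solution to address the critique.\n\n"
            f"Question: {question}\n\nSolution: {answer}\n\nCritique: {critique}"
        )
    return answer
```

In such a pipeline, only annotations that survive the critique step (or an additional filtering stage) would be kept as instruction-tuning examples, which is one way to trade raw quantity of generated data for higher-quality reasoning traces.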