GLM: General Language Model Pretraining with Autoregressive Blank Infilling

May 22-27, 2022 | Zhengxiao Du*, Yujie Qian*, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang
The paper introduces the General Language Model (GLM), a pretraining framework based on autoregressive blank infilling that addresses the limitations of existing frameworks such as BERT, GPT, and T5. GLM improves on blank infilling by adding 2D positional encodings and allowing spans to be predicted in arbitrary order, which yields better performance on natural language understanding (NLU) tasks. By varying the number and lengths of the blanks, GLM can be pretrained for different types of tasks. Empirically, GLM outperforms BERT, T5, and RoBERTa on a wide range of tasks spanning NLU and conditional and unconditional generation, while using fewer parameters and less data. The paper also describes a multi-task pretraining setup in which GLM is trained with both the blank-infilling objective and a document-level or sentence-level objective, and it evaluates the model on various downstream tasks, demonstrating its effectiveness across different NLP settings.
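To make the objective concrete, the sketch below shows how a GLM-style training example could be assembled: sampled spans are collapsed into [MASK] tokens in Part A, then appended in shuffled order in Part B for autoregressive prediction, with the first positional dimension pointing back at the corresponding [MASK] and the second counting positions within each span. This is a minimal illustration, not the authors' implementation; the function name build_glm_example, the toy string tokens, and the exact return format are assumptions made here for clarity.

```python
import random

# Special tokens follow the paper's notation; everything else is illustrative.
MASK, START, END = "[MASK]", "[S]", "[E]"

def build_glm_example(tokens, spans, shuffle=True):
    """Build a GLM-style blank-infilling example (illustrative sketch).

    tokens : list of input tokens
    spans  : list of (start, end) index pairs to mask, end exclusive
    Returns (input_tokens, position_ids, targets), where position_ids are
    2D pairs (inter-span position, intra-span position) and targets are
    None for Part A tokens (not predicted).
    """
    spans = sorted(spans)

    # Part A: the corrupted text, with each span replaced by a single [MASK].
    part_a, mask_positions, cursor = [], [], 0
    for start, end in spans:
        part_a.extend(tokens[cursor:start])
        mask_positions.append(len(part_a))
        part_a.append(MASK)
        cursor = end
    part_a.extend(tokens[cursor:])

    input_tokens = list(part_a)
    pos1 = list(range(len(part_a)))      # position in the corrupted text
    pos2 = [0] * len(part_a)             # intra-span position is 0 in Part A
    targets = [None] * len(part_a)       # Part A tokens are not predicted

    # Part B: the masked spans in (optionally shuffled) order, each written as
    # [S] x1 ... xn, with targets shifted so the model predicts x1 ... xn [E].
    order = list(range(len(spans)))
    if shuffle:
        random.shuffle(order)
    for i in order:
        start, end = spans[i]
        span_tokens = tokens[start:end]
        input_tokens += [START] + span_tokens
        targets += span_tokens + [END]
        # First dimension repeats the position of the corresponding [MASK];
        # second dimension counts 1, 2, ... within the span.
        pos1 += [mask_positions[i]] * (len(span_tokens) + 1)
        pos2 += list(range(1, len(span_tokens) + 2))

    return input_tokens, list(zip(pos1, pos2)), targets


# Example: mask x3 and the span x5 x6 from a six-token input.
tokens = "x1 x2 x3 x4 x5 x6".split()
inputs, positions, targets = build_glm_example(tokens, [(2, 3), (4, 6)])
```

During pretraining, Part A tokens attend to each other bidirectionally while Part B tokens attend only to Part A and to earlier Part B tokens; that attention mask is omitted here, since the snippet only illustrates how the input, 2D positions, and targets are laid out.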