15 Oct 2019 | Li Dong* Nan Yang* Wenhui Wang* Furu Wei*† Xiaodong Liu Yu Wang Jianfeng Gao Ming Zhou Hsiao-Wuen Hon
This paper introduces UNILM, a unified pre-trained language model that can be fine-tuned for both natural language understanding (NLU) and natural language generation (NLG) tasks. UNILM is pre-trained on large-scale text with three types of language modeling objectives: unidirectional, bidirectional, and sequence-to-sequence prediction. All three objectives share a single multi-layer Transformer network; task-specific self-attention masks control which context each token may condition on when making predictions. On NLU benchmarks, UNILM performs competitively on GLUE, SQuAD 2.0, and CoQA. It also achieves state-of-the-art results on five NLG datasets, covering abstractive summarization, question generation, generative question answering, and dialog response generation. The unified pre-training thus yields a single model that can be fine-tuned for a wide range of downstream NLU and NLG tasks, outperforming previous models on several benchmarks and demonstrating its effectiveness as a versatile language model.
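The central idea, a single shared Transformer whose pre-training objective is switched by changing only the self-attention mask, can be illustrated with a short sketch. The helper below is an illustrative assumption, not the authors' released code: the function names `build_attention_mask` and `apply_mask` are hypothetical, and the mask layout follows the description above (bidirectional for the BERT-style objective, left-to-right for the unidirectional objective, and a mixed mask for sequence-to-sequence prediction in which source tokens attend bidirectionally and target tokens attend to the source plus their own left context).

```python
import torch


def build_attention_mask(seq_len: int, mode: str, src_len: int = None) -> torch.Tensor:
    """Return a [seq_len, seq_len] boolean mask where mask[i, j] = True means
    token i may attend to token j. Illustrative sketch of the masking scheme,
    not the official UNILM implementation."""
    if mode == "bidirectional":
        # Every token sees the full sequence (BERT-style cloze objective).
        return torch.ones(seq_len, seq_len, dtype=torch.bool)
    if mode == "unidirectional":
        # Token i sees only tokens j <= i (left-to-right LM).
        return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    if mode == "seq2seq":
        assert src_len is not None, "seq2seq mode needs the source segment length"
        mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
        # All tokens may attend to the source segment.
        mask[:, :src_len] = True
        # Target tokens additionally attend to target tokens to their left.
        tgt_len = seq_len - src_len
        mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.bool))
        return mask
    raise ValueError(f"unknown mode: {mode}")


def apply_mask(attention_scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Block disallowed positions by setting their scores to -inf before softmax."""
    return attention_scores.masked_fill(~mask, float("-inf"))
```

Because only the mask changes between objectives, the same Transformer parameters are updated by all three pre-training tasks, which is what allows the resulting model to be fine-tuned for both NLU tasks (using the bidirectional configuration) and NLG tasks (using the sequence-to-sequence configuration).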