15 Oct 2019 | Li Dong* Nan Yang* Wenhui Wang* Furu Wei*† Xiaodong Liu Yu Wang Jianfeng Gao Ming Zhou Hsiao-Wuen Hon
This paper introduces UNILM, a unified pre-trained language model that can be fine-tuned for both natural language understanding (NLU) and natural language generation (NLG) tasks. UNILM is pre-trained on large-scale text with three types of language modeling objectives: unidirectional, bidirectional, and sequence-to-sequence prediction. All three objectives share a single multi-layer Transformer network; task-specific self-attention masks control which context each token may condition on when making predictions. On NLU benchmarks, UNILM performs competitively on GLUE, SQuAD 2.0, and CoQA. It also achieves state-of-the-art results on five NLG datasets, covering abstractive summarization, question generation, generative question answering, and dialog response generation. The unified pre-training thus yields a single model that can be fine-tuned for a wide range of downstream NLU and NLG tasks, outperforming previous models on several benchmarks and demonstrating its effectiveness as a versatile language model.
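The central idea, a single shared Transformer whose pre-training objective is switched by changing only the self-attention mask, can be illustrated with a short sketch. The helper below is an illustrative assumption, not the authors' released code: the function names `build_attention_mask` and `apply_mask` are hypothetical, and the mask layout follows the description above (bidirectional for the BERT-style objective, left-to-right for the unidirectional objective, and a mixed mask for sequence-to-sequence prediction in which source tokens attend bidirectionally and target tokens attend to the source plus their own left context).

```python
import torch


def build_attention_mask(seq_len: int, mode: str, src_len: int = None) -> torch.Tensor:
    """Return a [seq_len, seq_len] boolean mask where mask[i, j] = True means
    token i may attend to token j. Illustrative sketch of the masking scheme,
    not the official UNILM implementation."""
    if mode == "bidirectional":
        # Every token sees the full sequence (BERT-style cloze objective).
        return torch.ones(seq_len, seq_len, dtype=torch.bool)
    if mode == "unidirectional":
        # Token i sees only tokens j <= i (left-to-right LM).
        return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    if mode == "seq2seq":
        assert src_len is not None, "seq2seq mode needs the source segment length"
        mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
        # All tokens may attend to the source segment.
        mask[:, :src_len] = True
        # Target tokens additionally attend to target tokens to their left.
        tgt_len = seq_len - src_len
        mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.bool))
        return mask
    raise ValueError(f"unknown mode: {mode}")


def apply_mask(attention_scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Block disallowed positions by setting their scores to -inf before softmax."""
    return attention_scores.masked_fill(~mask, float("-inf"))
```

Because only the mask changes between objectives, the same Transformer parameters are updated by all three pre-training tasks, which is what allows the resulting model to be fine-tuned for both NLU tasks (using the bidirectional configuration) and NLG tasks (using the sequence-to-sequence configuration).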