Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey

1 Nov 2021 | Bonan Min*, Hayley Ross*, Elior Sulem*, Amir Pouran Ben Veyseh*, Thien Huu Nguyen*, Oscar Sainz*, Eneko Agirre*, Ilana Heintz*, and Dan Roth*
This survey reviews recent advances in Natural Language Processing (NLP) built on large pre-trained language models (PLMs). Large PLMs such as BERT and GPT have transformed NLP by enabling the efficient learning of rich, complex representations. The survey organizes this work into three main paradigms: pre-training then fine-tuning, prompt-based learning, and NLP as text generation; it also covers the use of PLMs to generate data for NLP tasks.

Pre-training then fine-tuning first trains a large model on a general objective (e.g., language modeling) and then adapts it to a specific task. Prompt-based learning instead guides a PLM with prompts that align the target task with its pre-training objective. NLP as text generation reformulates tasks as text generation problems, leveraging the generative capabilities of models such as GPT-2 and T5. PLMs can also be used to generate training data, such as silver-labeled or auxiliary data.

The survey highlights the importance of pre-training corpus size and quality: larger and more diverse corpora often lead to better downstream performance, but data quality can matter just as much, especially when the pre-training data's domain matches the task. Fine-tuning adapts a PLM's parameters to a specific task, either by updating the entire model or only selected layers. Parameter-efficient approaches such as adapter modules and BitFit (which updates only the bias terms) aim to minimize computational cost while maintaining performance, allowing effective weight sharing and efficient training; a minimal sketch of this idea follows below.
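To make the parameter-efficient idea concrete, here is a minimal sketch of BitFit-style fine-tuning using PyTorch and the HuggingFace transformers library: all weights are frozen and only the bias terms (plus the task-specific head) receive gradient updates. The model name, learning rate, and toy batch are illustrative assumptions, not details taken from the survey.

```python
# Minimal sketch of BitFit-style fine-tuning: update only bias terms (and the task head).
# Model name, hyperparameters, and the toy batch below are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"  # any encoder-style PLM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Freeze every parameter, then re-enable gradients only for biases and the classifier head.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One illustrative training step on a toy sentiment batch.
model.train()
batch = tokenizer(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because only a small fraction of parameters is updated, the same frozen PLM weights can be shared across many tasks, with a lightweight set of biases (or adapters) stored per task.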
Prompt-based learning offers advantages such as reduced computational requirements and closer alignment with the pre-training objective, and it enables few-shot learning for tasks with limited labeled data. Template-based learning reformulates a task as a language modeling problem, reducing the need for task-specific training data; templates can be crafted manually or generated automatically, often via paraphrasing or gradient-based search. The survey also discusses multi-prompt learning, in which multiple prompts are combined to improve performance through prompt ensembling, augmentation, and decomposition/composition, as well as knowledge distillation, which transfers knowledge from multiple models into a single model, typically by exploiting unlabeled data (a template-based classification sketch with a small prompt ensemble appears at the end of this summary).

In conclusion, large pre-trained language models have revolutionized NLP by enabling efficient learning and adaptation across a wide range of tasks. The survey underscores the importance of pre-training corpus size and quality, as well as the effectiveness of the various fine-tuning and prompt-based approaches in achieving state-of-the-art results. Future research directions include improving model efficiency, enhancing few-shot learning capabilities, and exploring new paradigms for NLP tasks.
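To illustrate the template-based paradigm summarized above, the following is a minimal sketch of cloze-style classification with a masked language model, combined with a simple two-template prompt ensemble. The model name, templates, and verbalizer (label-to-word mapping) are illustrative assumptions, not prompts from the survey.

```python
# Minimal sketch of template-based (cloze-style) classification with a masked LM,
# averaging predictions over a small prompt ensemble. Templates and verbalizer are illustrative.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

templates = [
    "{text} It was {mask}.",
    "All in all, the movie was {mask}. {text}",
]
verbalizer = {"positive": "great", "negative": "terrible"}  # label -> answer word

def label_scores(text: str) -> dict:
    """Average each label's mask-fill probability over all templates."""
    scores = {label: 0.0 for label in verbalizer}
    for template in templates:
        prompt = template.format(text=text, mask=tokenizer.mask_token)
        inputs = tokenizer(prompt, return_tensors="pt")
        # Position of the [MASK] token in this prompt.
        mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]
        probs = logits.softmax(dim=-1)
        for label, word in verbalizer.items():
            word_id = tokenizer.convert_tokens_to_ids(word)
            scores[label] += probs[word_id].item() / len(templates)
    return scores

print(label_scores("The plot was gripping and the acting superb."))
```

In a few-shot setting, the same templates could additionally be used to fine-tune the PLM on a handful of labeled examples rather than applied zero-shot, and an ensemble of prompt models could label unlabeled data for knowledge distillation into a single model.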