12 Jun 2024 | Zhibo Chu¹,², Shiwen Ni¹, Zichong Wang³, Xi Feng¹,², Min Yang¹, and Wenbin Zhang³
Large Language Models (LLMs) have evolved significantly over decades, progressing from statistical language models (SLMs) to neural language models (NLMs), pre-trained language models (PLMs), and finally to large language models (LLMs). SLMs use simple probability distributions to model word sequences, while NLMs employ neural networks to capture complex language patterns. PLMs leverage large-scale data and self-supervised learning to capture general linguistic knowledge, and LLMs extend this by using massive data, computation, and algorithms to create more expressive and adaptable models. LLMs, such as GPT, have achieved human-level text generation and are now widely used in various fields. However, their complexity and specialized language pose challenges for practitioners without relevant background knowledge. This survey aims to provide a comprehensive overview of LLMs, covering their history, development, principles, applications, limitations, and future directions. It emphasizes the importance of understanding LLMs for their effective utilization in both research and daily tasks. The survey highlights the rapid growth of LLMs driven by data diversity, computational advancements, and algorithmic innovations. It also discusses the challenges and ethical considerations associated with LLMs, including fairness, safety, and intellectual property issues. The survey concludes that while LLMs offer significant benefits, addressing their limitations is crucial for their responsible and effective use.
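To make the SLM-to-LLM progression concrete, the "simple probability distributions" that statistical language models use can be sketched as a bigram model: estimate P(next word | current word) directly from corpus counts. This is a minimal illustrative sketch; the tiny corpus and function names are our own, not from the survey.

```python
# Minimal sketch of a statistical language model (SLM): a bigram model
# estimating P(w2 | w1) by maximum likelihood from raw counts.
# The toy corpus below is purely illustrative.
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()

# Count each adjacent word pair, and each word's occurrences as a context.
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

def bigram_prob(w1, w2):
    """P(w2 | w1) = count(w1, w2) / count(w1); 0 if w1 was never a context."""
    if context_counts[w1] == 0:
        return 0.0
    return bigram_counts[(w1, w2)] / context_counts[w1]

# In this corpus, "the" is followed by "cat" twice and "mat" once,
# so P(cat | the) = 2/3 and P(mat | the) = 1/3.
print(bigram_prob("the", "cat"))
print(bigram_prob("the", "mat"))
```

Where an SLM stores such explicit count-based tables, an NLM replaces them with a learned neural network over word embeddings, and PLMs/LLMs scale that idea up with self-supervised pre-training on massive corpora.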