Understanding A Survey of Large Language Models in Finance (FinLLMs)

This survey provides a comprehensive overview of Financial Large Language Models (FinLLMs), covering their history, techniques, performance, opportunities, and challenges. The paper first presents a chronological overview from general-domain Pretrained Language Models (PLMs) to current FinLLMs, including the GPT series, selected open-source LLMs, and financial LMs. It then compares five techniques used across financial PLMs and FinLLMs, covering training methods, training data, and fine-tuning methods. The paper summarizes performance evaluations on six benchmark tasks and datasets, and presents eight advanced financial NLP tasks and datasets for developing more sophisticated FinLLMs. Finally, it discusses the opportunities and challenges facing FinLLMs, such as hallucination, privacy, and efficiency. To support AI research in finance, the authors compile a collection of accessible datasets and evaluation benchmarks on GitHub.

In more detail, the paper traces the evolution of FinLLMs from general-domain LMs, highlighting the development of four financial PLMs (FinPLMs) and four financial LLMs (FinLLMs). The five techniques it reviews are continual pre-training, domain-specific pre-training, mixed-domain pre-training, mixed-domain LLM with prompt engineering, and instruction fine-tuned LLM with prompt engineering. The six financial NLP benchmark tasks it evaluates are sentiment analysis, text classification, named entity recognition, question answering, stock movement prediction, and text summarization. Among the opportunities and challenges, the paper points to the need for high-quality financial data, the difficulty of using internal data without privacy breaches, and the need for appropriate financial evaluation metrics.
The paper concludes that FinLLMs have significant potential in finance but require further research to address the challenges and opportunities in this domain.

A Survey of Large Language Models in Finance (FinLLMs)

4 Feb 2024 | Jean Lee, Nicholas Stevens, Soyeon Caren Han, Minseok Song