26 Jan 2024 | Pau Rodriguez Inserte, Mariam Nakhle, Raheel Qader, Gaëtan Caillaud, Jingshu Liu
This paper presents a study on adapting large language models (LLMs) to financial sentiment analysis. It focuses on two foundation models with fewer than 1.5B parameters, which are adapted with several strategies: continued training on financial documents, fine-tuning on financial instructions, and augmenting the instruction dataset with artificial instructions generated by LLMs. The adapted models are evaluated on several financial NLP tasks, including sentiment analysis, news headline classification, and named entity recognition, where they achieve similar or higher performance than much larger models while using far fewer parameters and less data. The paper also introduces a curated dataset of financial documents and instructions for financial tasks, along with a strategy for generating instruction samples, and describes both datasets in enough detail to allow reproduction. The findings indicate that domain adaptation is crucial for LLM performance on financial tasks, that data augmentation further improves results, and that properly fine-tuned smaller models can be as effective as larger ones. Overall, the fine-tuned models outperform most LLMs on financial tasks, with the exception of the FinMA models, while being significantly more parameter-efficient.
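To make the instruction-dataset idea concrete, here is a minimal sketch of wrapping labeled financial sentences into instruction/response pairs, the general shape of data used for instruction fine-tuning. The templates, field names, and labels below are illustrative assumptions, not the paper's actual format or generation method.

```python
# Hedged sketch: turning labeled financial sentences into
# instruction-style training samples. Templates and the
# instruction/input/output schema are assumptions for illustration.
import random

INSTRUCTION_TEMPLATES = [
    "Classify the sentiment of this financial text as positive, negative, or neutral.",
    "What is the sentiment (positive/negative/neutral) of the following statement?",
]

def make_instruction_sample(text, label, seed=None):
    """Wrap one labeled sentence into an instruction/response pair."""
    rng = random.Random(seed)
    return {
        "instruction": rng.choice(INSTRUCTION_TEMPLATES),
        "input": text,
        "output": label,
    }

sample = make_instruction_sample(
    "Company X beats quarterly earnings expectations.", "positive", seed=0
)
```

Varying the instruction wording across samples is one simple form of the augmentation the paper describes; the paper itself goes further by having an LLM generate new instructions.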
The paper also provides a comprehensive review of the state of the art in financial sentiment analysis, from traditional dictionary-based methods to recent LLM-based approaches. It discusses the study's limitations, including the models' generative capabilities, the focus on a limited set of tasks, and the potential for further research with larger models. It concludes that the proposed methods are effective for financial sentiment analysis and other financial NLP tasks, and that the fine-tuned models are a promising solution for financial applications.