This paper presents a design framework for financial sentiment analysis (FSA) using heterogeneous large language models (LLMs) without fine-tuning. The framework, called Heterogeneous Agent Discussion (HAD), is based on Minsky's theory of mind and emotions, which posits that emotional states are the result of specific resource activations. The design involves specialized LLM agents that focus on different types of FSA errors, such as mood, rhetoric, dependency, aspect, and reference. These agents work together to analyze financial text and produce a collective sentiment prediction.
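The agent arrangement described above can be illustrated with a minimal sketch. It assumes a generic `llm` callable that maps a prompt string to a response string. The prompt wording, the `AGENT_FOCUS` descriptions, the three-way label set, and the final summarizing call are all hypothetical choices made for illustration; the text here does not specify how HAD phrases its prompts or how the agents' outputs are combined.

```python
from typing import Callable, Dict

# Hypothetical focus descriptions, one per error type named in the text.
AGENT_FOCUS: Dict[str, str] = {
    "mood": "whether the statement is hypothetical, conditional, or factual",
    "rhetoric": "irony, sarcasm, or other rhetorical devices",
    "dependency": "which words modify which, including the scope of negation",
    "aspect": "which entity or aspect each opinion targets",
    "reference": "external references such as prices, times, or named events",
}

# Assumed label set; the datasets in the paper may use a different scheme.
LABELS = ("positive", "negative", "neutral")


def agent_prompt(focus: str, text: str) -> str:
    """Build the prompt for one specialized agent (wording is an assumption)."""
    return (
        f"You analyze financial text. Pay particular attention to {focus}.\n"
        f"Text: {text}\n"
        "Give a brief analysis of its sentiment."
    )


def discuss(text: str, llm: Callable[[str], str]) -> str:
    """Collect one analysis per agent, then ask for a single collective label."""
    opinions = {
        name: llm(agent_prompt(focus, text))
        for name, focus in AGENT_FOCUS.items()
    }
    summary = "\n".join(f"[{name}] {opinion}" for name, opinion in opinions.items())
    final = llm(
        f"Text: {text}\n"
        f"Agent analyses:\n{summary}\n"
        "Answer with one word: positive, negative, or neutral."
    )
    answer = final.strip().lower()
    for label in LABELS:
        if label in answer:
            return label
    return "neutral"  # fallback when the reply contains no known label


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs without any API access."""
    if "Agent analyses:" in prompt:
        return "Positive"
    return "The text reports a gain."


if __name__ == "__main__":
    print(discuss("Shares rose 5% after earnings.", stub_llm))
```

Because no fine-tuning is involved, the only moving parts in this sketch are the prompts; replacing `stub_llm` with a call to an actual model leaves the rest unchanged.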
The HAD framework is evaluated on five FSA datasets, showing improved accuracy and F-1 scores compared to naive prompting and fine-tuning approaches. The results indicate that the framework can improve FSA performance by up to 9.46% in accuracy and 13.72% in F-1 scores. The framework also demonstrates the effectiveness of using error types to guide agent design, with mood, rhetoric, and aspect agents being the most important contributors to performance.
The study contributes to the design science literature by presenting a kernel theory-informed design artifact. It also has implications for emotion theory, LLM collaboration research, and financial decision-making practices. The findings suggest that HAD can be used to build more effective FSA systems, and that the framework can be further optimized with more empirical evidence. The study also highlights the importance of considering the unique challenges of FSA, such as external references and world knowledge, and calls for more research on this important task.