5 Jan 2024 | Gabriel Lino Garcia, Pedro Henrique Paiola, Luis Henrique Morelli, Giovani Candido, Arnaldo Cândido Júnior, Danilo Samuel Jodas, Luis C. S. Afonso, Ivan Rizzo Guilherme, Bruno Elias Penteado, João Paulo Papa
This paper introduces Bode, a fine-tuned large language model (LLM) based on the LLaMA 2 architecture for Portuguese prompt-based tasks. The authors address the challenge of developing LLMs for low-resource languages like Portuguese: models trained on multilingual datasets often struggle with these languages, showing code switching and other issues. Bode is available in two versions, 7B and 13B, and is evaluated on classification tasks using zero-shot and in-context learning approaches. The main contribution of the work is an LLM that performs well in Portuguese and is freely available for research and commercial use. The paper also reviews related work, including Sabiá and openCabrita, and compares Bode with other state-of-the-art LLMs. Experimental results show that Bode matches or outperforms the other models on sentiment analysis, news category classification, and fake news detection. The authors conclude by highlighting Bode's potential to advance NLP applications in Portuguese and the need for further development and refinement.
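To make the difference between the zero-shot and in-context settings mentioned above concrete, the following is a minimal sketch of how a prompt for a Portuguese sentiment classification task might be assembled. It is an illustration only: the instruction wording, the label set, and the helper names `build_prompt` and `parse_label` are assumptions, not the prompts or code used in the paper, and no model is called.

```python
from typing import Sequence, Tuple

# Hypothetical label set and instruction wording; not taken from the paper.
LABELS = ("positivo", "negativo")
INSTRUCTION = "Classifique o sentimento do texto como positivo ou negativo."


def build_prompt(text: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a zero-shot prompt if `examples` is empty, a few-shot prompt otherwise."""
    parts = [INSTRUCTION]
    # In-context learning: prepend labeled demonstrations before the query.
    for example_text, label in examples:
        parts.append(f"Texto: {example_text}\nSentimento: {label}")
    # The query ends with an open slot for the model to complete.
    parts.append(f"Texto: {text}\nSentimento:")
    return "\n\n".join(parts)


def parse_label(completion: str) -> str:
    """Map a raw model completion to a known label, or 'desconhecido' if none matches."""
    words = completion.strip().lower().split()
    first = words[0].strip(".,!") if words else ""
    return first if first in LABELS else "desconhecido"
```

Calling `build_prompt("Adorei o filme.")` yields a zero-shot prompt, while passing a list such as `[("Péssimo serviço.", "negativo")]` as the second argument yields an in-context prompt with one demonstration. The resulting string would then be sent to whichever model is being evaluated, and `parse_label` applied to its completion.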