2024 | Sanad Malaysha, Mo El-Haj, Saad Ezzini, Mohammed Khalilia, Mustafa Jarrar, Sultan Almujaiwel, Ismail Berrada, Houda Bouamor
The AraFinNLP2024 shared task aims to advance Arabic financial natural language processing (NLP) by addressing two challenges: multi-dialect intent detection, and cross-dialect translation with intent preservation. The task uses the ArBanking77 dataset, which contains 39,000 parallel queries in Modern Standard Arabic (MSA) and four dialects, each labeled with one or more of 77 banking intents. The task attracted 45 teams, 11 of which actively participated in the test phase. Eleven teams took part in Subtask 1 (multi-dialect intent detection), while only one team took part in Subtask 2 (cross-dialect translation and intent preservation). The winning team in Subtask 1 achieved an F1 score of 0.8773, and the single team in Subtask 2 achieved a BLEU score of 1.667.
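As a rough illustration of how an F1 score can be computed when each query carries one or more intents, the sketch below uses micro-averaging over label sets. The summary does not state which F1 averaging the organizers used, so micro-averaging is an assumption here. The function name `micro_f1` and the intent labels are illustrative only.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-query sets of intent labels.

    Assumption: micro-averaging; the task's official averaging
    scheme is not stated in this summary.
    """
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, pred):
        gold_labels, pred_labels = set(gold_labels), set(pred_labels)
        tp += len(gold_labels & pred_labels)   # intents predicted correctly
        fp += len(pred_labels - gold_labels)   # intents predicted but not gold
        fn += len(gold_labels - pred_labels)   # gold intents that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only; not taken from the task's data files.
gold = [{"card_arrival"}, {"lost_or_stolen_card"}, {"exchange_rate", "top_up_failed"}]
pred = [{"card_arrival"}, {"card_arrival"}, {"exchange_rate"}]
score = micro_f1(gold, pred)
print(round(score, 4))
```

With two correct intents, one spurious prediction, and two missed intents, precision is 2/3 and recall is 1/2, giving an F1 of 4/7.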
The task aims to support the development of robust Arabic financial NLP tools, particularly for machine translation and banking chatbots. The dataset was augmented with three additional dialects (Moroccan, Saudi, and Tunisian) through careful translation and annotation. The results highlighted the effectiveness of various NLP models, including fine-tuned BERT-based models and ensemble methods. The MA team achieved the highest performance in Subtask 1, demonstrating the importance of model architecture and data augmentation in handling the complexities of Arabic dialects.
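One simple way to combine several intent classifiers is hard (majority) voting, sketched below. The summary does not describe how participating teams built their ensembles, so this is an assumed, generic scheme rather than any team's method. The function name `majority_vote` and the sample predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model intent predictions by hard voting.

    predictions_per_model: one list of predicted labels per model,
    all aligned to the same queries. Assumption: a generic voting
    scheme, not the approach of any specific team.
    """
    n_queries = len(predictions_per_model[0])
    voted = []
    for i in range(n_queries):
        # Count how many models chose each label for query i.
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical outputs of three fine-tuned models on three queries.
model_outputs = [
    ["card_arrival", "exchange_rate", "top_up_failed"],
    ["card_arrival", "top_up_failed", "top_up_failed"],
    ["exchange_rate", "exchange_rate", "top_up_failed"],
]
combined = majority_vote(model_outputs)
print(combined)
```

Each query receives the label that most models agree on, which can smooth over errors that only a single model makes.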
The shared task contributed to the broader goal of enhancing financial services through NLP, promoting inclusivity and efficiency in Arabic-speaking communities. It also highlighted the need for further research in Arabic financial NLP, particularly in addressing the challenges of dialectal variations and the nuances of financial communication. The task emphasized the importance of collaboration and innovation in advancing NLP technologies for financial applications.