6 Jul 2024 | Muhammad Abdul-Mageed, Amr Keleg, AbdelRahim Elmadany, Chiyu Zhang, Injy Hamed, Walid Magdy, Houda Bouamor, Nizar Habash
The fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024) aimed to advance Arabic NLP by providing datasets, modeling opportunities, and standardized evaluation conditions. The task included three subtasks: multi-label dialect identification (Subtask 1), Arabic level of dialectness estimation (Subtask 2), and dialect-to-Modern Standard Arabic (MSA) machine translation (Subtask 3). A total of 51 teams registered, with 12 participating in the test phase. The winning teams achieved 50.57 F1 on Subtask 1, 0.1403 RMSE on Subtask 2, and 20.44 BLEU on Subtask 3. The results highlight the challenges in processing dialectal Arabic, particularly in dialect identification and machine translation. The paper discusses the methods used by participating teams and provides an outlook for future NADI efforts.