13 Jul 2024 | Mustafa Jarrar, Nagham Hamad, Mohammed Khalilia, Bashar Talafha, AbdelRahim Elmadany, Muhammad Abdul-Mageed
The WojoodNER 2024 shared task focused on fine-grained Arabic Named Entity Recognition (NER) and introduced a new dataset, Wojood_Fine, annotated with entity subtypes. It comprised three subtasks: Closed-Track Flat Fine-Grained NER, Closed-Track Nested Fine-Grained NER, and Open-Track NER for the Israeli War on Gaza. A total of 43 teams registered; five participated in the Flat Fine-Grained subtask, two in the Nested Fine-Grained subtask, and one in the Open-Track NER subtask. The winning teams achieved F1 scores of 91% and 92% in the Flat and Nested subtasks, respectively, while the sole Open-Track team scored 73.7%.
The task aimed to advance Arabic NER research by introducing more granular entity types and supporting nested, fine-grained annotations. The Wojood_Fine dataset contains 550k tokens and 51 entity types, with annotations covering 47k subtype mentions. The Open-Track subtask used a dataset related to the Israeli War on Gaza and allowed teams to use external resources and generative models, whereas the two closed-track subtasks strictly prohibited external data in the interest of fairness and transparency. Teams submitted their outputs in CoNLL format, and results were evaluated using micro-averaged F1 scores. The shared task highlighted the effectiveness of language models in NER, with several teams achieving high performance. Limitations included the focus on Modern Standard Arabic and limited dialect coverage; the task also aims to expand the Wojood_Fine corpus to include more dialects. Overall, the task provided insights into fine-grained Arabic NER and encouraged further research in this area.
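The micro-averaged F1 used for scoring can be illustrated with a minimal sketch. It assumes entity mentions have already been extracted from the CoNLL files as `(start, end, label)` spans, one set per sentence. The function name `micro_f1`, the span representation, and the example labels are illustrative assumptions, not the organizers' official scorer. Because spans are compared as sets, overlapping (nested) mentions are handled the same way as flat ones.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over entity spans.

    gold, pred: lists with one set of (start, end, label) tuples per sentence.
    Counts are pooled over all sentences and labels before computing
    precision and recall, which is what "micro-averaged" means here.
    """
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, pred):
        tp += len(gold_spans & pred_spans)   # exact span and label match
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two sentences, one mislabeled span in the first.
gold = [{(0, 2, "PERS"), (3, 4, "GPE")}, {(1, 2, "ORG")}]
pred = [{(0, 2, "PERS"), (3, 4, "LOC")}, {(1, 2, "ORG")}]
score = micro_f1(gold, pred)
print(f"{score:.3f}")  # 0.667
```

In this example two of three predicted spans match the gold spans exactly, so precision and recall are both 2/3 and the micro-averaged F1 is also 2/3.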