22 Feb 2024 | Junjie Ye, Nuo Xu, Yikun Wang, Jie Zhou, Qi Zhang, Tao Gui, Xuanjing Huang
LLM-DA is a novel data augmentation method for few-shot Named Entity Recognition (NER) that leverages the rewriting capabilities and world knowledge of large language models (LLMs). It addresses the limitations of existing augmentation techniques by generating semantically coherent sentences while preserving the diversity and quality of the data. Concretely, it augments at both the context level and the entity level, so that the generated data aligns with the inherent characteristics of the NER task: it applies 14 contextual rewriting strategies spanning four dimensions, replaces entity mentions with other entities of the same type, and injects noise to enhance robustness.

Extensive experiments on four NER datasets (CoNLL'03, OntoNotes 5.0, MIT-Movie, and Few-NERD) demonstrate that LLM-DA significantly improves NER model performance with limited data, consistently producing higher-quality augmented data than existing methods. The gains are most pronounced when training data is scarce, and the method remains effective in both low-resource and domain-specific settings.

The paper also discusses the approach's limitations, including the token-length restrictions of LLMs and the need for further research to expand entity generation. Overall, LLM-DA provides a valuable contribution to improving model performance in few-shot NER tasks by leveraging the strengths of LLMs while addressing their limitations.
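To make the three augmentation operations concrete, here is a minimal Python sketch. It is illustrative only: the function names, the toy span-based data format, and the token-dropout noise scheme are assumptions made for this example, and the rewriting prompt is generic stand-in wording rather than one of the paper's actual 14 strategies.

```python
import random
from collections import defaultdict

# Toy labeled sentence: a token list paired with entity spans.
# Spans are (start, end, type) over the tokens, end exclusive.
Sentence = tuple[list[str], list[tuple[int, int, str]]]

def build_entity_pool(data: list[Sentence]) -> dict[str, list[list[str]]]:
    """Index every entity mention in the seed data by its type."""
    pool = defaultdict(list)
    for tokens, spans in data:
        for start, end, etype in spans:
            pool[etype].append(tokens[start:end])
    return pool

def replace_entities(sent: Sentence, pool, rng: random.Random) -> Sentence:
    """Entity-level augmentation: swap each mention for another mention
    of the same type, recomputing span offsets so labels stay valid."""
    tokens, spans = sent
    new_tokens, new_spans, cursor = [], [], 0
    for start, end, etype in sorted(spans):
        new_tokens.extend(tokens[cursor:start])
        mention = rng.choice(pool[etype])  # same-type replacement
        new_spans.append((len(new_tokens), len(new_tokens) + len(mention), etype))
        new_tokens.extend(mention)
        cursor = end
    new_tokens.extend(tokens[cursor:])
    return new_tokens, new_spans

def inject_noise(sent: Sentence, rng: random.Random, p: float = 0.1) -> Sentence:
    """Noise injection (assumed variant): randomly drop non-entity
    context tokens so the tagger is robust to imperfect context."""
    tokens, spans = sent
    inside = {i for s, e, _ in spans for i in range(s, e)}
    keep = [i for i in range(len(tokens)) if i in inside or rng.random() > p]
    remap = {old: new for new, old in enumerate(keep)}
    new_spans = [(remap[s], remap[e - 1] + 1, t) for s, e, t in spans]
    return [tokens[i] for i in keep], new_spans

def rewrite_prompt(sent: Sentence, strategy: str) -> str:
    """Context-level augmentation: build an LLM rewriting prompt that
    pins the entity mentions so the gold labels survive the rewrite."""
    tokens, spans = sent
    mentions = ", ".join(" ".join(tokens[s:e]) for s, e, _ in spans)
    return (f"Rewrite the sentence using the strategy '{strategy}', "
            f"keeping these entity mentions verbatim: {mentions}.\n"
            f"Sentence: {' '.join(tokens)}")

if __name__ == "__main__":
    rng = random.Random(0)
    seed = [(["Obama", "visited", "Paris", "last", "June"],
             [(0, 1, "PER"), (2, 3, "LOC")]),
            (["Merkel", "met", "him", "in", "Berlin"],
             [(0, 1, "PER"), (4, 5, "LOC")])]
    pool = build_entity_pool(seed)
    print(replace_entities(seed[0], pool, rng))
    print(inject_noise(seed[0], rng, p=0.3))
    print(rewrite_prompt(seed[0], "change the tense"))
```

Two details of the sketch are worth noting. Keeping entity mentions verbatim during contextual rewriting is what lets the original labels carry over to the generated sentence. And for simplicity the replacement pool here is built from the seed data itself, whereas LLM-DA draws on the world knowledge of LLMs to supply same-type entities, which is what gives it entity diversity beyond the few-shot training set.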