Understanding Automatic Labeling of Semantic Roles

The paper presents a system for identifying semantic roles in sentences, which are crucial for shallow semantic analysis and various natural language processing tasks. The system uses statistical classifiers trained on hand-annotated data from the FrameNet database, which defines a tagset of semantic roles called frame elements. The authors describe the FrameNet project, which proposes roles at the level of semantic frames, and the FrameNet corpus, which includes 50,000 sentences and 99,232 annotated frame elements. The methodology involves two subtasks: identifying the boundaries of frame elements and labeling each element with the correct role. Features used include phrase type, grammatical function, position, voice, and head word. The system achieves 80.4% accuracy on a development set and 79.6% on automatically identified frame elements. Lexical statistics, particularly those for head words, are found to be the most important features. The paper also discusses the importance of combining features and the potential for improving performance through larger and more representative data sets.The paper presents a system for identifying semantic roles in sentences, which are crucial for shallow semantic analysis and various natural language processing tasks. The system uses statistical classifiers trained on hand-annotated data from the FrameNet database, which defines a tagset of semantic roles called frame elements. The authors describe the FrameNet project, which proposes roles at the level of semantic frames, and the FrameNet corpus, which includes 50,000 sentences and 99,232 annotated frame elements. The methodology involves two subtasks: identifying the boundaries of frame elements and labeling each element with the correct role. Features used include phrase type, grammatical function, position, voice, and head word. The system achieves 80.4% accuracy on a development set and 79.6% on automatically identified frame elements. Lexical statistics, particularly those for head words, are found to be the most important features. The paper also discusses the importance of combining features and the potential for improving performance through larger and more representative data sets.

Automatic Labeling of Semantic Roles

| Daniel Gildea, Daniel Jurafsky