A Simple Rule-Based Part of Speech Tagger

Eric Brill
This paper presents a simple rule-based part-of-speech (POS) tagger that achieves accuracy comparable to that of stochastic taggers. The rule-based tagger has several advantages over stochastic taggers, including reduced storage requirements, a more transparent set of rules, ease of improvement, and better portability across tag sets, genres, and languages. The tagger acquires its rules automatically: it recognizes its own weaknesses and corrects them through a patch acquisition process. The initial tagger assigns each word its most likely tag, as estimated from a large tagged corpus, and then applies patches to improve performance. These patches are derived from the training corpus and are designed to reduce tagging errors. Evaluated on the Brown Corpus, the tagger achieves an error rate of 5.1% with 71 patches. The results demonstrate that a simple rule-based tagger can perform as well as stochastic taggers, which suggests that rule-based methods in POS tagging merit further exploration.
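
The two-stage idea described above (assign each word its most likely tag, then apply corrective patches) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus, the function names, the default tag for unknown words, and the single hand-written patch "change NN to VB when the previous tag is TO". It does not reproduce the paper's patch templates or its patch acquisition procedure, in which patches are learned from tagging errors rather than written by hand.

```python
from collections import Counter, defaultdict

def train_initial_tagger(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def initial_tags(words, lexicon, default_tag="NN"):
    """Tag each word with its most likely tag; default_tag is an assumed fallback."""
    return [lexicon.get(word, default_tag) for word in words]

def apply_patch(tags, from_tag, to_tag, prev_tag):
    """Change from_tag to to_tag wherever the preceding input tag is prev_tag."""
    patched = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            patched[i] = to_tag
    return patched

# Hypothetical toy corpus using Brown-style tags (not data from the paper).
training = [
    [("the", "AT"), ("run", "NN")],
    [("a", "AT"), ("run", "NN")],
    [("to", "TO"), ("run", "VB")],
    [("the", "AT"), ("dog", "NN")],
]

lexicon = train_initial_tagger(training)
before = initial_tags(["to", "run"], lexicon)        # "run" is most often NN
after = apply_patch(before, "NN", "VB", "TO")        # the patch corrects it to VB
print(before, after)
```

In this sketch the initial tagger mislabels "run" after "to" because NN is its most frequent tag in the toy data, and the patch repairs that error using only the preceding tag as context.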