A Hierarchical Phrase-Based Model for Statistical Machine Translation

June 2005 | David Chiang
This paper presents a hierarchical phrase-based model for statistical machine translation (SMT) that improves on traditional phrase-based systems by using hierarchical phrases, that is, phrases that contain subphrases. The model is formally a synchronous context-free grammar (CFG), but it is learned from bilingual text without any syntactic annotation. Because a hierarchical phrase can reorder the subphrases it contains, the model captures reorderings of whole phrases rather than only of words inside a phrase, which is essential for handling complex syntactic structures. It outperforms Pharaoh, a state-of-the-art phrase-based system, by 7.5% relative in BLEU score.
The model uses a weighted synchronous CFG in which hierarchical phrases are represented as productions of the grammar. These productions let the model capture syntactic relationships between phrases, such as the way prepositional phrases that modify a verb phrase are placed differently in Chinese and English. The model is log-linear, with features including phrase translation probabilities, lexical weights, and penalties for phrase length. Training extracts initial phrase pairs from word-aligned corpora and generates a large set of rules from them. These rules are then filtered to balance grammar size against performance on the development set.
The decoder is a CKY parser with beam search, followed by a postprocessor that maps source-side derivations (called "French" derivations in the paper, by convention) to English derivations. It prunes the search space to improve efficiency and limits the span covered by a phrase to 10 words, matching the maximum length of the initial phrase pairs used during training.
In experiments on Mandarin-to-English translation, the 7.5% relative improvement over Pharaoh is statistically significant. The model also demonstrates the potential to incorporate syntactic information, although this did not lead to significant improvements in test performance.
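To make the idea of a hierarchical phrase as a synchronous CFG production more concrete, here is a minimal Python sketch. It is not the paper's implementation: the `SyncRule` class, the `apply_rule` helper, and the specific Chinese-English rule and subphrase pairs are illustrative assumptions chosen for this example. Integers stand for linked nonterminals, so the same subphrase pair fills the same index on both sides.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A symbol is either a terminal word (str) or the index of a linked
# nonterminal X (int). Equal indices on both sides are rewritten together.
Symbol = Union[str, int]


@dataclass(frozen=True)
class SyncRule:
    """One production X -> <source, target> (hypothetical helper type)."""
    source: Tuple[Symbol, ...]
    target: Tuple[Symbol, ...]


def expand(side: Tuple[Symbol, ...], fillers: Dict[int, List[str]]) -> List[str]:
    """Replace each nonterminal index with the words of its subphrase."""
    words: List[str] = []
    for sym in side:
        if isinstance(sym, int):
            words.extend(fillers[sym])
        else:
            words.append(sym)
    return words


def apply_rule(
    rule: SyncRule, sub_pairs: Dict[int, Tuple[List[str], List[str]]]
) -> Tuple[str, str]:
    """Fill the rule's gaps with (source, target) subphrase pairs."""
    src_fill = {i: pair[0] for i, pair in sub_pairs.items()}
    tgt_fill = {i: pair[1] for i, pair in sub_pairs.items()}
    source_words = expand(rule.source, src_fill)
    target_words = expand(rule.target, tgt_fill)
    return " ".join(source_words), " ".join(target_words)


# Illustrative rule: the two subphrases swap order between the languages.
rule = SyncRule(source=("yu", 1, "you", 2), target=("have", 2, "with", 1))
sub_pairs = {
    1: (["Beihan"], ["North", "Korea"]),
    2: (["bangjiao"], ["diplomatic", "relations"]),
}
source_text, target_text = apply_rule(rule, sub_pairs)
# target_text is "have diplomatic relations with North Korea"
```

The single rule moves the prepositional material from before the verb on the source side to after it on the target side, which is the kind of phrase-level reordering that a flat phrase pair cannot generalize.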
The paper concludes that hierarchical phrase-based models can significantly improve translation accuracy without syntactic annotations, and that future work should focus on more syntactically motivated grammars and efficient training methods. The model's design philosophy emphasizes incorporating syntax into statistical translation without compromising the strengths of the phrase-based approach.
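The log-linear weighting mentioned in the summary can be sketched as follows. In a log-linear model, a rule's weight is the product of its feature values, each raised to a tuned exponent, so the log of the weight is a weighted sum of log features. The feature names, feature values, and exponents below are hypothetical placeholders, not numbers from the paper, and the sketch leaves out any derivation-level features (such as a language model score) that a full system would typically include.

```python
import math
from typing import Dict, List


def rule_log_weight(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return log w(rule) = sum_i lambda_i * log(phi_i)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def derivation_log_score(
    rules: List[Dict[str, float]], weights: Dict[str, float]
) -> float:
    """A derivation's log score is the sum of the log weights of its rules."""
    return sum(rule_log_weight(features, weights) for features in rules)


# Hypothetical exponents (lambdas); in practice these are tuned on a
# development set.
weights = {"p_tgt_given_src": 2.0, "p_src_given_tgt": 0.5}

# Hypothetical feature values (phis) for two rules in one derivation.
rule_a = {"p_tgt_given_src": 0.5, "p_src_given_tgt": 0.25}
rule_b = {"p_tgt_given_src": 1.0, "p_src_given_tgt": 1.0}

log_w_a = rule_log_weight(rule_a, weights)  # log(0.5**2 * 0.25**0.5)
total = derivation_log_score([rule_a, rule_b], weights)
```

Working in log space keeps the score a simple sum and avoids numerical underflow when many small probabilities are multiplied together.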