A Systematic Comparison of Various Statistical Alignment Models

2003 | Franz Josef Och, Hermann Ney
The paper presents and compares methods for computing word alignments with statistical or heuristic models. It covers the five alignment models of Brown et al. (1993) and the hidden Markov alignment model, together with smoothing techniques and refinements, and compares them with two heuristic models based on the Dice coefficient. It also discusses methods for combining word alignments in order to symmetrize directed statistical alignment models. The quality of the resulting Viterbi alignment is measured against a manually produced reference alignment, on the German-English Verbmobil task and the French-English Hansards task. The paper analyzes the design decisions of the statistical alignment system and evaluates them on training corpora of various sizes. A key finding is that refined alignment models with first-order dependence and a fertility model yield significantly better results than simple heuristic models. The appendix includes an efficient training algorithm for the alignment models.
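To make the heuristic baseline concrete, the following is a minimal sketch of a Dice-coefficient aligner, not the paper's implementation. It scores a word pair as dice(f, e) = 2·C(f, e) / (C(f) + C(e)) and links each source word to its best-scoring target word. The function names, the toy corpus, and the choice to count each word once per sentence are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def dice_scores(bitext):
    """Dice coefficient for every co-occurring (source, target) word pair.

    bitext: list of (source_tokens, target_tokens) sentence pairs.
    Assumption: a word is counted once per sentence in which it occurs.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_types, tgt_types = set(src), set(tgt)
        src_count.update(src_types)
        tgt_count.update(tgt_types)
        pair_count.update(product(src_types, tgt_types))
    return {
        (f, e): 2.0 * c / (src_count[f] + tgt_count[e])
        for (f, e), c in pair_count.items()
    }


def align(src, tgt, scores):
    """Link each source position j to the target position i with the highest score."""
    links = set()
    for j, f in enumerate(src):
        # Pairs never seen together score 0.0; ties go to the leftmost target word.
        i_best = max(range(len(tgt)), key=lambda i: scores.get((f, tgt[i]), 0.0))
        links.add((j, i_best))
    return links


# Hypothetical toy corpus, for illustration only.
bitext = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
scores = dice_scores(bitext)
alignment = align(["das", "haus"], ["the", "house"], scores)
```

Because each link is chosen independently from co-occurrence counts alone, a heuristic of this kind has no notion of word position or fertility, which is the kind of dependence the refined statistical models add.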