This paper proposes a method to improve the efficiency of Minimum Bayes Risk (MBR) decoding in neural machine translation (NMT). MBR decoding is a text generation technique that improves translation quality but is computationally expensive due to its quadratic complexity in the number of sampled sequences. The authors propose to approximate pairwise metric scores with scores calculated against aggregated reference representations, reducing the complexity from O(n²) to O(n), while preserving most of the quality gains of MBR decoding.
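To make the quadratic cost concrete, standard sampling-based MBR can be sketched as follows. This is a minimal sketch, with a toy token-overlap (Jaccard) utility standing in for a real metric such as chrF or COMET; every hypothesis is scored against every sample used as a pseudo-reference, so the number of utility calls is n².

```python
def jaccard_utility(hyp: str, ref: str) -> float:
    """Toy stand-in for a real utility metric (e.g. chrF): token-set overlap."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r)

def mbr_decode(samples, utility=jaccard_utility):
    """Standard MBR: each hypothesis is scored against all pseudo-references,
    so the number of utility calls grows quadratically in len(samples)."""
    def expected_utility(hyp):
        return sum(utility(hyp, ref) for ref in samples) / len(samples)
    return max(samples, key=expected_utility)
```

With 1024 samples this sketch would make over a million utility calls, which is what makes neural metrics like COMET expensive in this setting.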
The method works by combining the representations of multiple references into a single aggregate reference representation, which is then used for utility estimation. This approach is applied to two common metrics: chrF, which is based on character n-gram overlap, and COMET, a neural metric trained on human judgments of translation quality. For chrF, reference aggregation reduces the time needed for computing the utility of 1024 samples by 99.5% without affecting translation quality. For COMET, computation time is reduced by 95–99%, making reference aggregation an efficient method for hypothesis pruning.
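For an embedding-based metric, the aggregation idea can be sketched as follows: instead of scoring each hypothesis against n reference embeddings, the reference embeddings are averaged into one vector first, so each hypothesis is scored only once. This is a simplified illustration under stated assumptions; real COMET applies a learned scoring network on top of sentence embeddings, and cosine similarity here is only a stand-in.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Stand-in scoring function (real COMET uses a trained network)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def aggregate_mbr(hyp_embs: np.ndarray, ref_embs: np.ndarray) -> int:
    """O(n) utility estimation: average the reference embeddings once, then
    score each hypothesis a single time against the aggregate."""
    agg_ref = ref_embs.mean(axis=0)           # aggregate reference representation
    scores = [cosine(h, agg_ref) for h in hyp_embs]
    return int(np.argmax(scores))             # index of the selected hypothesis
```

In sampling-based MBR the same set of sentences serves as both hypotheses and pseudo-references, so `hyp_embs` and `ref_embs` would typically be the same array.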
The paper evaluates the effectiveness of reference aggregation on four translation directions and two utility metrics. The results show that reference aggregation significantly reduces the computational complexity of MBR decoding while maintaining high translation quality. The method is particularly effective for COMET, where it outperforms other efficiency techniques like N-by-S MBR. The authors also propose an aggregate-to-fine MBR approach, which first prunes the number of hypotheses using an aggregate reference and then selects the best hypothesis using standard MBR.
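The two-stage aggregate-to-fine procedure can be sketched as below. The helper names, the pruning size `k`, and the crude "pooled token set" aggregate are illustrative assumptions (for chrF the paper aggregates n-gram statistics; a toy token-overlap utility stands in for the real metric).

```python
def token_overlap(hyp: str, ref: str) -> float:
    """Toy stand-in utility: token-set Jaccard overlap."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r)

def aggregate_to_fine_mbr(samples, k=2):
    """Stage 1: cheap O(n) pruning against an aggregate 'reference' (here, the
    pooled tokens of all samples). Stage 2: standard pairwise MBR over the
    surviving top-k hypotheses, costing O(k * n) instead of O(n^2)."""
    pooled = " ".join(samples)  # crude aggregate: union of all sample tokens
    pruned = sorted(samples, key=lambda h: token_overlap(h, pooled),
                    reverse=True)[:k]
    return max(pruned,
               key=lambda h: sum(token_overlap(h, r) for r in samples))
```

Because the fine stage still scores the pruned hypotheses against all samples with the exact metric, the final selection closely matches standard MBR while most of the pairwise computation is avoided.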
The study demonstrates that reference aggregation is a successful strategy to overcome the quadratic complexity of MBR. However, it is still slower than beam search, as the cost of sampling is now the dominant factor. Future work could focus on improving sampling efficiency, such as using fewer hypotheses, improved caching, or speculative sampling approaches. The paper also discusses limitations, including the requirement for utility metrics based on averageable representations and the need for empirical evaluation of aggregation effectiveness for trained metrics.