Recurrent Continuous Translation Models


18-21 October 2013 | Nal Kalchbrenner, Phil Blunsom
This paper introduces Recurrent Continuous Translation Models (RCTMs), a class of probabilistic continuous translation models that operate on continuous representations of words, phrases, and sentences and do not rely on alignments or phrasal translation units. Each model has a generation aspect and a conditioning aspect: generation of the translation is modeled with a target Recurrent Language Model (RLM), while conditioning on the source sentence is modeled with a Convolutional Sentence Model (CSM). The RCTMs achieve large improvements in perplexity over state-of-the-art alignment-based translation models, with a perplexity more than 43% lower than that of a state-of-the-art variant of IBM Model 2. They are also highly sensitive to the word order, syntax, and meaning of the source sentence, and they match the performance of a state-of-the-art translation system when rescoring n-best lists of translations.
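To make the generation/conditioning split concrete, the sketch below shows one way a recurrent language model can generate target words while being conditioned on a fixed source-sentence vector. The matrix names (R, I, S, O), the dimensions, and the greedy decoding loop are illustrative assumptions, not the paper's exact parameterization; the random source vector merely stands in for the output of the CSM.

```python
import numpy as np

rng = np.random.default_rng(0)

d_hid, d_emb, vocab = 128, 128, 1000             # hidden, embedding, target vocab sizes (arbitrary)
R = rng.normal(scale=0.1, size=(d_hid, d_hid))   # recurrent weights
I = rng.normal(scale=0.1, size=(d_hid, d_emb))   # previous-target-word weights
S = rng.normal(scale=0.1, size=(d_hid, d_emb))   # source-conditioning weights
O = rng.normal(scale=0.1, size=(vocab, d_hid))   # output projection
E = rng.normal(scale=0.1, size=(vocab, d_emb))   # target word embeddings

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def generate(source_vec, max_len=10, bos=0):
    """Greedily emit target word ids, conditioned on a source sentence vector."""
    h = np.zeros(d_hid)
    word = bos
    out = []
    for _ in range(max_len):
        # the hidden state depends on the previous state, the previous target word,
        # and the fixed source representation (in the RCTMs, produced by the CSM)
        h = np.tanh(R @ h + I @ E[word] + S @ source_vec)
        probs = softmax(O @ h)
        word = int(probs.argmax())
        out.append(word)
    return out

# stand-in for the sentence model's output; in the real model this is computed
# from the source sentence rather than sampled at random
source_vec = rng.normal(size=d_emb)
print(generate(source_vec))
```

Greedy decoding is used only to keep the example short; the paper trains and evaluates the models probabilistically rather than by greedy generation.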
The RCTMs are based on a general modeling framework that estimates the probability of a target sentence being a translation of a given source sentence. The RCTM I uses a CSM to build a single representation of the source sentence, which then conditions the target language model. The RCTM II introduces an intermediate representation: a truncated variant of the CSM first transforms the source word representations into representations for the target words, which in turn constrain the generation of the target sentence. Both models operate on continuous representations of the constituents and are trained as a single joint architecture; a minimal sketch of the CSM's compositional step is given after the experiment summary below.

The RCTMs are evaluated in four experiments. The first shows that they achieve significantly lower perplexity than IBM Model 1 and a state-of-the-art variant of IBM Model 2. The second and third probe the RCTM II's sensitivity to linguistic information in the source sentence: the second shows that the model is highly sensitive to word position and order, and the third shows that its generated translations exhibit notable morphological, syntactic, and semantic agreement with the source. The fourth tests rescoring performance, showing that the RCTMs match a state-of-the-art translation system when rescoring n-best lists of translations. Together, the results indicate that the RCTMs capture significant syntactic and semantic information from the source sentence and successfully transfer it to the target language.
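As a rough illustration of how a convolutional sentence model can compose word vectors into a single sentence vector, the sketch below repeatedly merges adjacent vectors with narrow one-dimensional convolutions until one vector remains. The fixed kernel width of two, the randomly initialized weights, and the tanh nonlinearity are simplifying assumptions; the paper's CSM uses a hierarchy of learned, level-specific kernels rather than these placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 128  # representation dimensionality (arbitrary)

def conv_layer(vectors, kernel):
    """Merge each window of adjacent vectors into one vector.
    `kernel` has shape (width, d, d): one weight matrix per position in the window."""
    width = kernel.shape[0]
    merged = []
    for i in range(len(vectors) - width + 1):
        window = vectors[i:i + width]
        merged.append(np.tanh(sum(kernel[j] @ window[j] for j in range(width))))
    return merged

def csm(word_vectors):
    """Compose a list of word vectors into a single sentence vector."""
    vectors = list(word_vectors)
    while len(vectors) > 1:
        # each layer shortens the sequence by one until a single vector is left
        kernel = rng.normal(scale=0.1, size=(2, d, d))
        vectors = conv_layer(vectors, kernel)
    return vectors[0]

sentence = [rng.normal(size=d) for _ in range(6)]  # stand-in source word embeddings
print(csm(sentence).shape)  # (128,)
```

In the RCTM I, a vector produced this way conditions the target language model; in the RCTM II, a truncated variant of the same compositional idea yields intermediate representations for the target words instead of a single sentence vector.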