Look Harder: A Neural Machine Translation Model with Hard Attention
Published
Association for Computational Linguistics (ACL)
Abstract
Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. However, they are not effective on long sequences. In this work, we propose a hard-attention based NMT model which selects a subset of source tokens for each target token to effectively handle long-sequence translation. Due to the discrete nature of the hard-attention mechanism, we design a reinforcement learning algorithm coupled with a reward shaping strategy to train it efficiently. We demonstrate the effectiveness of the proposed model on English-German (EN-DE) and English-French (EN-FR) machine translation tasks. The proposed model sets a new state-of-the-art performance on the EN-DE and EN-FR tasks by obtaining 31.75 (3.35 ↑) and 42.26 (1.26 ↑) BLEU points, respectively.
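Because the subset selection in hard attention is a discrete sampling step, gradients cannot flow through it directly, which is why a reinforcement learning algorithm with reward shaping is needed for training. Below is a minimal, self-contained PyTorch sketch of this general idea; the function names (`hard_attention_step`, `reinforce_loss`), the mean-pooling of selected tokens, the simplified subset log-probability, and the toy reward are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def hard_attention_step(query, keys, values, k=3):
    # Score each source token against the decoder query state.
    scores = keys @ query                    # (S,)
    probs = F.softmax(scores, dim=-1)        # selection distribution over source tokens
    # Hard attention: sample a discrete subset of k tokens (no gradient flows here).
    idx = torch.multinomial(probs, k)        # k indices, sampled without replacement
    # Approximate log-probability of the sampled subset (illustrative simplification;
    # the exact without-replacement likelihood is more involved).
    log_prob = torch.log(probs[idx]).sum()
    # Pool the selected source states into a single context vector.
    context = values[idx].mean(dim=0)
    return context, log_prob

def reinforce_loss(log_probs, rewards, baseline):
    # Score-function (REINFORCE) estimator: increase the log-probability of
    # selections that led to above-baseline reward. Reward shaping would replace
    # a single end-of-sequence reward with shaped per-step rewards.
    advantages = rewards - baseline
    return -(log_probs * advantages).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    d, S = 16, 10
    query, keys, values = torch.randn(d), torch.randn(S, d), torch.randn(S, d)
    ctx, lp = hard_attention_step(query, keys, values, k=3)
    # Toy reward standing in for, e.g., sentence-level BLEU of the decoded output.
    loss = reinforce_loss(lp.unsqueeze(0), torch.tensor([0.4]), baseline=0.3)
    print(ctx.shape, loss.item())
```

In a full model this step would run once per target token inside the decoder loop, with the per-step log-probabilities accumulated and weighted by the shaped rewards at update time.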