Context-Aware Monolingual Repair for Neural Machine Translation

E. Voita; R. Sennrich; I. Titov

doi:https://doi.org/10.18653/v1/D19-1081


Authors	E. Voita; R. Sennrich; I. Titov
Publication date	2019
Host editors	K. Inui; J. Jiang; V. Ng; X. Wan
Book title	2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing
Book subtitle	EMNLP-IJCNLP 2019 : proceedings of the conference : November 3-7, 2019, Hong Kong, China
ISBN (electronic)	9781950737901
Event	2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing
Pages (from-to)	877-886
Publisher	Stroudsburg, PA: The Association for Computational Linguistics
Organisations	Faculty of Science (FNWI) Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining translations of sentences in the context of each other. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates the inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show a strong preference of the annotators for corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.18653/v1/D19-1081
Other links	https://github.com/lena-voita/good-translation-wrong-in-context
Downloads	D19-1081 (Final published version)
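
The data construction described in the abstract (consistent groups taken from monolingual documents, inconsistent groups obtained by round-tripping each sentence in isolation) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names make_repair_pair and toy_round_trip, the separator string, and the English toy sentences are all assumptions introduced here. A real setup would replace toy_round_trip with sampling from two sentence-level MT models (target to source, then source to target).

```python
from typing import Callable, List, Tuple

# Hypothetical separator between sentences in a group; not taken from the paper.
SEP = " <sep> "


def make_repair_pair(
    group: List[str],
    round_trip: Callable[[str], str],
) -> Tuple[str, str]:
    """Build one (inconsistent, consistent) training pair.

    `group` is a list of consecutive target-language sentences from a
    monolingual document. `round_trip` is any callable that maps a single
    sentence to its round-trip translation.
    """
    # Each sentence is round-tripped on its own, so the translator never
    # sees the neighbouring sentences. That is what makes the result
    # potentially inconsistent across the group.
    noisy = [round_trip(sentence) for sentence in group]
    # Model input: the round-tripped group. Model output: the original group.
    return SEP.join(noisy), SEP.join(group)


def toy_round_trip(sentence: str) -> str:
    """Stand-in for real round-trip translation (assumption for the demo).

    It deterministically swaps a pronoun to mimic a context-dependent error.
    """
    return sentence.replace("She", "It")


group = ["Maria bought a lamp.", "She likes it."]
source_side, target_side = make_repair_pair(group, toy_round_trip)
print(source_side)
print(target_side)
```

Keeping the round-trip step as a plain callable means the same pairing logic works with any translation backend, and the monolingual repair model only ever sees target-language text on both sides.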
