Context-aware neural machine translation learns anaphora resolution

Open Access
Authors
  • E. Voita
  • P. Serdyukov
  • R. Sennrich
  • I. Titov
Publication date 2018
Host editors
  • I. Gurevych
  • Y. Miyao
Book title ACL 2018 : The 56th Annual Meeting of the Association for Computational Linguistics
Book subtitle proceedings of the conference : July 15-20, 2018, Melbourne, Australia
ISBN (electronic)
  • 9781948087322
Event 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018
Volume 1
Pages (from-to) 1264-1274
Number of pages 11
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ambiguous cases and improve translation coherence. We introduce a context-aware neural machine translation model designed in such a way that the flow of information from the extended context to the translation model can be controlled and analyzed. We experiment with an English-Russian subtitles dataset and observe that much of what the model captures relates to improving pronoun translation. We measure correspondences between induced attention distributions and coreference relations and observe that the model implicitly captures anaphora. This is consistent with gains for sentences where pronouns need to be gendered in translation. Besides improvements in anaphoric cases, the model also improves overall BLEU, both over its context-agnostic version (+0.7) and over simple concatenation of the context and source sentences (+0.6).
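The controllable information flow the abstract describes can be sketched as a gated combination of two attention outputs: one over the current source sentence and one over the context sentence. The following is a minimal PyTorch sketch under stated assumptions; the class name, tensor shapes, and single-layer gate are illustrative, not the authors' exact implementation.

    import torch
    import torch.nn as nn

    class GatedContextIntegration(nn.Module):
        """Merge source attention and context attention with a learned gate.

        Hypothetical sketch: the gate value g can be logged per target token
        to analyze how much extra-sentential information the model uses.
        """

        def __init__(self, d_model: int) -> None:
            super().__init__()
            # Gate computed from both attention outputs (assumed design).
            self.gate = nn.Linear(2 * d_model, d_model)

        def forward(self, c_src: torch.Tensor, c_ctx: torch.Tensor) -> torch.Tensor:
            # c_src / c_ctx: (batch, tgt_len, d_model) attention outputs over
            # the current source sentence and the context sentence.
            g = torch.sigmoid(self.gate(torch.cat([c_src, c_ctx], dim=-1)))
            # g near 1 favors the source sentence; g near 0 admits context.
            return g * c_src + (1.0 - g) * c_ctx

Averaging g over tokens, or inspecting the context-attention distributions directly, is the kind of analysis that makes it possible to test whether attention from a pronoun lands on its coreferent antecedent, as the abstract reports.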

Document type Conference contribution
Note With supplementary note, presentation and video
Language English
DOI https://doi.org/10.18653/v1/p18-1117
Other links https://www.scopus.com/pages/publications/85063090647
Downloads
P18-1117 (Final published version)
Supplementary materials