Measuring the Effect of Conversational Aspects on Machine Translation Quality

M. van der Wees; A. Bisazza; C. Monz

Authors	M. van der Wees; A. Bisazza; C. Monz
Publication date	2016
Host editors	Y. Matsumoto; R. Prasad
Book title	Proceedings of COLING 2016: technical papers
Book subtitle	the 26th International Conference on Computational Linguistics : Osaka, Japan, December 11-17 2016
ISBN (electronic)	9784879747020
Event	The 26th International Conference on Computational Linguistics
Pages (from-to)	2571-2581
Number of pages	11
Publisher	The COLING 2016 Organizing Committee
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI); Faculty of Science (FNWI)
Abstract	Research in statistical machine translation (SMT) is largely driven by formal translation tasks, while translating informal text is much more challenging. In this paper we focus on SMT for the informal genre of dialogues, which has rarely been addressed to date. Concretely, we investigate the effect of dialogue acts, speakers, gender, and text register on SMT quality when translating fictional dialogues. We first create and release a corpus of multilingual movie dialogues annotated with these four dialogue-specific aspects. When measuring translation performance for each of these variables, we find that BLEU fluctuations between their categories are often significantly larger than randomly expected. Following this finding, we hypothesize and show that SMT of fictional dialogues benefits from adaptation towards dialogue acts and registers. Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.
Document type	Conference contribution
Language	English
Published at	http://aclweb.org/anthology/C16-1242
Downloads	C16-1242 (Final published version)