Evaluation of Machine Translation Performance Across Multiple Genres and Languages

Open Access
Authors
Publication date 2018
Host editors
  • N. Calzolari
  • K. Choukri
  • C. Cieri
  • T. Declerck
  • S. Goggi
  • K. Hasida
  • H. Isahara
  • B. Maegaard
  • J. Mariani
  • H. Mazo
  • A. Moreno
  • J. Odijk
  • S. Piperidis
  • T. Tokunaga
Book title LREC 2018 : Eleventh International Conference on Language Resources and Evaluation
Book subtitle May 7-12, 2018, Miyazaki, Japan
ISBN (electronic)
  • 9791095546009
Event 11th Language Resources and Evaluation Conference
Pages (from-to) 3822-3827
Publisher Paris: European Language Resources Association (ELRA)
Organisations
  • Faculty of Science (FNWI)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this paper, we present evaluation corpora covering four genres for four language pairs that we harvested from the web in an automated fashion. We use these multi-genre benchmarks to evaluate the impact of genre differences on machine translation (MT). We observe that BLEU score differences between genres can be large and that, for all genres and all language pairs, translation quality improves when using four genre-optimized systems rather than a single genre-agnostic system. Finally, we train and use genre classifiers to route test documents to the most appropriate genre systems. The results of these experiments show that our multi-genre benchmarks can serve to advance research on text genre adaptation for MT.
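The routing step described in the abstract can be sketched as follows. This is a minimal illustration only: the genre labels, the keyword-overlap classifier, and the per-genre "systems" below are hypothetical stand-ins, not the trained classifiers or MT systems used in the paper.

```python
# Sketch of routing test documents to genre-specific MT systems.
# Genres and keyword profiles are illustrative assumptions; the paper
# trains real genre classifiers on harvested web corpora.
GENRE_KEYWORDS = {
    "news": {"minister", "reported", "government", "elections"},
    "medical": {"patient", "dose", "symptoms", "treatment"},
    "legal": {"hereby", "pursuant", "clause", "liability"},
    "subtitles": {"hey", "gonna", "okay", "yeah"},
}

def classify_genre(document: str) -> str:
    """Assign the genre whose keyword set overlaps most with the document."""
    tokens = set(document.lower().split())
    scores = {g: len(tokens & kw) for g, kw in GENRE_KEYWORDS.items()}
    return max(scores, key=scores.get)

def make_system(genre: str):
    """Stand-in for a genre-optimized MT system (here: a tagging stub)."""
    return lambda doc: f"[{genre}-MT] {doc}"

SYSTEMS = {g: make_system(g) for g in GENRE_KEYWORDS}

def route_and_translate(document: str) -> str:
    """Route a test document to the most appropriate genre system."""
    return SYSTEMS[classify_genre(document)](document)

print(route_and_translate("The minister reported new government policy."))
```

In the paper's setup, the classifier's genre decision selects one of four genre-optimized systems; the sketch mirrors that control flow with a trivial classifier in place of a learned one.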
Document type Conference contribution
Language English
Published at
  • http://www.lrec-conf.org/proceedings/lrec2018/summaries/853.html
  • https://staff.science.uva.nl/c.monz/html/publications/lrec2018eval.pdf
Other links http://www.lrec-conf.org/proceedings/lrec2018/index.html