Alternative objective functions for training MT evaluation metrics

Authors
Publication date 2017
Host editors
  • R. Barzilay
  • M.-Y. Kan
Book title The 55th Annual Meeting of the Association for Computational Linguistics
Book subtitle Proceedings of the Conference: July 30-August 4, 2017, Vancouver, Canada
ISBN (electronic)
  • 9781945626760
Event 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Volume | Issue number 2
Pages (from-to) 20-25
Number of pages 6
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

MT evaluation metrics are tested for correlation with human judgments either at the sentence level or at the corpus level. Trained metrics ignore corpus-level judgments and are trained only for high sentence-level correlation. We show that training for a single objective (sentence level or corpus level) can not only harm performance on the other objective, but can also be suboptimal for the objective being optimized. To this end we present a metric trained for the corpus-level objective and compare it empirically against a metric trained for the sentence-level objective, showing how their performance varies per language pair and per type and level of judgment. We then propose a model trained to optimize both objectives simultaneously and show that it is far more stable than either single-objective model, and on average outperforms both on both objectives.
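The abstract contrasts a sentence-level objective (how well metric scores track human judgments of individual translations) with a corpus-level objective (how well aggregated metric scores track corpus-level judgments), and proposes optimizing both at once. The paper's actual losses are not given here; the following is a minimal illustrative sketch, assuming simple squared-error surrogates for both objectives and a weighted combination (the names `sentence_loss`, `corpus_loss`, `combined_loss`, and the weight `alpha` are hypothetical, not from the paper):

```python
def sentence_loss(scores, human):
    """Hypothetical sentence-level surrogate: mean squared error
    between per-sentence metric scores and human judgments."""
    return sum((s - h) ** 2 for s, h in zip(scores, human)) / len(scores)


def corpus_loss(scores, human):
    """Hypothetical corpus-level surrogate: squared error between
    corpus-level aggregates (here, simple means) of metric and human scores."""
    metric_mean = sum(scores) / len(scores)
    human_mean = sum(human) / len(human)
    return (metric_mean - human_mean) ** 2


def combined_loss(scores, human, alpha=0.5):
    """Weighted combination of both objectives, as a stand-in for
    training on sentence- and corpus-level signals simultaneously."""
    return alpha * sentence_loss(scores, human) + (1 - alpha) * corpus_loss(scores, human)
```

With `alpha=1` this reduces to sentence-level-only training and with `alpha=0` to corpus-level-only training, mirroring the two single-objective baselines the abstract compares; intermediate values trade the two objectives off against each other.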

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/P17-2004
Other links https://www.scopus.com/pages/publications/85040568754