Alternative objective functions for training MT evaluation metrics

Authors
Publication date 2017
Host editors
  • R. Barzilay
  • M.-Y. Kan
Book title The 55th Annual Meeting of the Association for Computational Linguistics
Book subtitle Proceedings of the Conference: July 30-August 4, 2017, Vancouver, Canada
ISBN (electronic)
  • 9781945626760
Event 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Volume | Issue number 2
Pages (from-to) 20-25
Number of pages 6
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

MT evaluation metrics are tested for correlation with human judgments either at the sentence level or at the corpus level. Trained metrics ignore corpus-level judgments and are trained only for high sentence-level correlation. We show that training for a single objective (sentence level or corpus level) can not only harm performance on the other objective, but can also be suboptimal for the objective being optimized. To this end we present a metric trained for the corpus-level objective and compare it empirically against a metric trained for the sentence-level objective, showing how their performance varies per language pair and per type and level of judgment. We then propose a model trained to optimize both objectives simultaneously and show that it is far more stable than either single-objective model, and on average outperforms both on both objectives.
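The abstract contrasts a sentence-level objective (how well metric scores track human judgments of individual translations) with a corpus-level objective (how well aggregated metric scores track corpus-level judgments), and proposes optimizing both at once. The paper's actual losses are not given here; the following is a minimal illustrative sketch, assuming simple squared-error surrogates for both objectives and a weighted combination (the names `sentence_loss`, `corpus_loss`, `combined_loss`, and the weight `alpha` are hypothetical, not from the paper):

```python
def sentence_loss(scores, human):
    """Hypothetical sentence-level surrogate: mean squared error
    between per-sentence metric scores and human judgments."""
    return sum((s - h) ** 2 for s, h in zip(scores, human)) / len(scores)


def corpus_loss(scores, human):
    """Hypothetical corpus-level surrogate: squared error between
    corpus-level aggregates (here, simple means) of metric and human scores."""
    metric_mean = sum(scores) / len(scores)
    human_mean = sum(human) / len(human)
    return (metric_mean - human_mean) ** 2


def combined_loss(scores, human, alpha=0.5):
    """Weighted combination of both objectives, as a stand-in for
    training on sentence- and corpus-level signals simultaneously."""
    return alpha * sentence_loss(scores, human) + (1 - alpha) * corpus_loss(scores, human)
```

With `alpha=1` this reduces to sentence-level-only training and with `alpha=0` to corpus-level-only training, mirroring the two single-objective baselines the abstract compares; intermediate values trade the two objectives off against each other.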

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/P17-2004
Other links https://www.scopus.com/pages/publications/85040568754