Detecting word sense disambiguation biases in machine translation for model-agnostic adversarial attacks

Open Access
Authors
Publication date 2020
Host editors
  • B. Webber
  • T. Cohn
  • Y. He
  • Y. Liu
Book title 2020 Conference on Empirical Methods in Natural Language Processing
Book subtitle EMNLP 2020 : proceedings of the conference : November 16-20, 2020
ISBN (electronic)
  • 9781952148606
Event 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020
Pages (from-to) 7635-7653
Number of pages 19
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

Word sense disambiguation is a well-known source of translation errors in neural machine translation (NMT). We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically superficial word co-occurrences, rather than a deeper understanding of the source text. We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Moreover, we develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models. Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
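The "superficial word co-occurrence" artifact described in the abstract can be illustrated with a minimal sketch. The toy corpus, sense labels, and scoring below are hypothetical assumptions for illustration only, not the paper's actual statistics or method:

```python
from collections import Counter

# Hypothetical toy parallel data: each source sentence is paired with the
# sense label its (imagined) reference translation assigns to "bank".
corpus = [
    ("the bank approved the loan", "finance"),
    ("she deposited cash at the bank", "finance"),
    ("the bank raised interest rates", "finance"),
    ("they fished from the river bank", "river"),
]

# Count how often each context word co-occurs with each sense of "bank".
cooc = {}
for sent, sense in corpus:
    for word in sent.split():
        if word != "bank":
            cooc.setdefault(word, Counter())[sense] += 1

def sense_bias(sentence, target="bank"):
    """Score each sense by the summed co-occurrence counts of the
    sentence's context words; a heavily skewed score suggests the
    training data biases a model toward one disambiguation choice."""
    scores = Counter()
    for word in sentence.split():
        if word != target and word in cooc:
            scores += cooc[word]
    return scores

# Frequent function words like "the" drag the score toward the majority
# sense even when "river" is present as disambiguating evidence.
print(sense_bias("the bank near the river"))
```

Under this picture, an attack of the kind the abstract mentions would minimally perturb a sentence by swapping in a context word strongly associated with the wrong sense, nudging the model toward the biased disambiguation; the actual attack and bias statistics are detailed in the paper itself.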

Document type Conference contribution
Language English
DOI https://doi.org/10.18653/v1/2020.emnlp-main.616
Other links
  • https://github.com/demelin/detecting_wsd_biases_for_nmt (code)
  • https://slideslive.com/38939052/ (talk recording)
  • https://www.scopus.com/pages/publications/85101693661 (Scopus record)