UvA-MT’s Participation in the WMT24 General Translation Shared Task

Open Access
Authors
Publication date 2024
Host editors
  • B. Haddow
  • T. Kocmi
  • P. Koehn
  • C. Monz
Book title Ninth Conference on Machine Translation: Proceedings of the Conference
Book subtitle WMT 2024: November 15-16, 2024
ISBN (electronic)
  • 9798891761797
Event 9th Conference on Machine Translation
Pages (from-to) 176-184
Number of pages 9
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Fine-tuning Large Language Models (FT-LLMs) with parallel data has emerged as a promising paradigm in recent machine translation research. In this paper, we explore the effectiveness of FT-LLMs and compare them to traditional encoder-decoder Neural Machine Translation (NMT) systems in the WMT24 General MT shared task for the English-to-Chinese direction. We implement several techniques, including Quality Estimation (QE) data filtering, supervised fine-tuning, and post-editing that integrates NMT systems with LLMs. We demonstrate that fine-tuning LLaMA2 on a high-quality but relatively small bitext dataset (100K) yields COMET results comparable to much smaller encoder-decoder NMT systems trained on over 22 million bitexts. However, this approach largely underperforms on surface-level metrics like BLEU and ChrF. We further control data quality using a COMET-based quality estimation method. Our experiments show that 1) filtering out sentence pairs with low COMET scores largely improves encoder-decoder systems, but 2) no clear gains are observed for LLMs when further refining the fine-tuning set. Finally, we show that combining NMT systems with LLMs via post-editing generally yields the best performance on the WMT24 official test set.
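The COMET-based QE filtering step mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' released code: the reference-free QE checkpoint (Unbabel/wmt22-cometkiwi-da), the 0.8 score threshold, and the file names are assumptions chosen for illustration; it requires the unbabel-comet package and access to the (gated) model on the Hugging Face Hub.

```python
# Sketch of COMET-QE bitext filtering (assumed details: model name, threshold,
# file layout). Install with `pip install unbabel-comet`.
from comet import download_model, load_from_checkpoint


def filter_bitext(src_path, tgt_path, out_prefix, threshold=0.8, batch_size=64):
    """Keep only sentence pairs whose reference-free COMET-QE score is >= threshold."""
    with open(src_path, encoding="utf-8") as f:
        src_lines = [line.rstrip("\n") for line in f]
    with open(tgt_path, encoding="utf-8") as f:
        tgt_lines = [line.rstrip("\n") for line in f]
    assert len(src_lines) == len(tgt_lines), "source/target files must be parallel"

    # CometKiwi is reference-free: it scores (source, translation) pairs directly,
    # so it can be used to estimate the quality of existing bitext.
    model = load_from_checkpoint(download_model("Unbabel/wmt22-cometkiwi-da"))
    data = [{"src": s, "mt": t} for s, t in zip(src_lines, tgt_lines)]
    scores = model.predict(data, batch_size=batch_size, gpus=1).scores

    kept = [(s, t) for s, t, sc in zip(src_lines, tgt_lines, scores) if sc >= threshold]
    with open(out_prefix + ".en", "w", encoding="utf-8") as fs, \
         open(out_prefix + ".zh", "w", encoding="utf-8") as ft:
        for s, t in kept:
            fs.write(s + "\n")
            ft.write(t + "\n")
    return len(kept), len(src_lines)


if __name__ == "__main__":
    kept, total = filter_bitext("train.en", "train.zh", "train.filtered")
    print(f"kept {kept}/{total} sentence pairs")
```

In this sketch, the retained high-scoring pairs would form the filtered training or fine-tuning set; the abstract reports that such filtering clearly helps the encoder-decoder systems but brings no clear gains when refining the LLM fine-tuning data.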
Document type Conference contribution
Language English
Related publication UvA-MT’s Participation in the WMT 2023 General Translation Shared Task
Published at https://doi.org/10.18653/v1/2024.wmt-1.11
Downloads
2024.wmt-1.11 (Final published version)