Exploring solutions for low-resource neural machine translation

Open Access
Award date 09-12-2024
ISBN
  • 9789465066684
Number of pages 124
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Neural machine translation (NMT) has advanced the state of machine translation through deep learning, yet its reliance on large parallel datasets limits performance for low-resource languages. This thesis addresses that limitation by optimizing model design, data processing, and attention mechanisms to improve NMT systems under data-scarce conditions, without relying on additional data. First, we tune Transformer hyperparameters, such as the feed-forward dimension, the number of attention heads, and dropout rates, achieving notable improvements for low-resource language pairs. We then examine byte pair encoding (BPE) as a pre-processing step, showing its effectiveness in managing rare words across linguistically similar languages, though with limitations in handling certain out-of-vocabulary (OOV) cases. To improve generalization, we introduce Joint Dropout (JD), a data-centric approach inspired by phrase-based machine translation that replaces equivalent phrase pairs with joint variables, making the model more robust to variation in its input. Finally, we propose Entropy- and Distance-Regularized Attention (EaDRA), which regularizes attention to focus on key input elements, emulating the attention patterns found in high-resource models. Together, these approaches offer practical advances in low-resource NMT by addressing challenges related to data scarcity, OOV words, and attention. This work helps bridge the gap between high-resource and low-resource machine translation and supports the development of more accessible language technologies.
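To make the BPE pre-processing step concrete, the Python sketch below shows merge learning in the spirit of the original byte pair encoding algorithm for subword segmentation (Sennrich et al., 2016). It is an illustration only: the toy corpus and merge count are invented here, and the thesis's experiments would use standard tooling rather than this code.

# Minimal byte pair encoding (BPE) merge learning: repeatedly merge the
# most frequent adjacent symbol pair. Corpus and merge count are toy values.
from collections import Counter

def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in vocab.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts

def apply_merge(vocab, pair):
    """Replace each occurrence of `pair` with its concatenation."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Words are pre-split into characters, with an end-of-word marker.
vocab = {
    ("l", "o", "w", "</w>"): 5,
    ("l", "o", "w", "e", "r", "</w>"): 2,
    ("n", "e", "w", "e", "s", "t", "</w>"): 6,
    ("w", "i", "d", "e", "s", "t", "</w>"): 3,
}

for _ in range(10):  # the number of merges is the main BPE hyperparameter
    counts = pair_counts(vocab)
    if not counts:
        break
    best = max(counts, key=counts.get)
    print("merge:", best)
    vocab = apply_merge(vocab, best)

Frequent subwords survive these merges as single units, while rare or unseen words decompose into smaller known pieces; this is what makes BPE effective for rare words yet still imperfect for some OOV cases, as noted in the abstract.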
Document type PhD thesis
Language English