Exploring solutions for low-resource neural machine translation

Open Access
Award date 09-12-2024
ISBN
  • 9789465066684
Number of pages 124
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Neural machine translation (NMT) has advanced the state of machine translation through deep learning, yet its reliance on large parallel datasets limits performance for low-resource languages. This thesis addresses that limitation by optimizing model design, data processing, and attention mechanisms to improve NMT systems under data-scarce conditions, without relying on additional data. First, we tune Transformer hyperparameters, such as the feed-forward dimension, the number of attention heads, and dropout rates, achieving notable improvements for low-resource language pairs. We then examine byte pair encoding (BPE) as a pre-processing step, showing its effectiveness in managing rare words across linguistically similar languages, though with limitations in handling certain out-of-vocabulary (OOV) cases. To improve generalization, we introduce Joint Dropout (JD), a data-centric approach inspired by phrase-based machine translation that replaces equivalent phrase pairs with joint variables, making the model more robust to variation in its input. Finally, we propose Entropy- and Distance-Regularized Attention (EaDRA), which regularizes attention to focus on key input elements, emulating the attention patterns found in high-resource models. Together, these approaches offer practical advances in low-resource NMT by addressing challenges related to data scarcity, OOV words, and attention. This work helps bridge the gap between high-resource and low-resource machine translation and supports the development of more accessible language technologies.
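To make the BPE pre-processing step concrete, the Python sketch below shows merge learning in the spirit of the original byte pair encoding algorithm for subword segmentation (Sennrich et al., 2016). It is an illustration only: the toy corpus and merge count are invented here, and the thesis's experiments would use standard tooling rather than this code.

# Minimal byte pair encoding (BPE) merge learning: repeatedly merge the
# most frequent adjacent symbol pair. Corpus and merge count are toy values.
from collections import Counter

def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in vocab.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts

def apply_merge(vocab, pair):
    """Replace each occurrence of `pair` with its concatenation."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Words are pre-split into characters, with an end-of-word marker.
vocab = {
    ("l", "o", "w", "</w>"): 5,
    ("l", "o", "w", "e", "r", "</w>"): 2,
    ("n", "e", "w", "e", "s", "t", "</w>"): 6,
    ("w", "i", "d", "e", "s", "t", "</w>"): 3,
}

for _ in range(10):  # the number of merges is the main BPE hyperparameter
    counts = pair_counts(vocab)
    if not counts:
        break
    best = max(counts, key=counts.get)
    print("merge:", best)
    vocab = apply_merge(vocab, best)

Frequent subwords survive these merges as single units, while rare or unseen words decompose into smaller known pieces; this is what makes BPE effective for rare words yet still imperfect for some OOV cases, as noted in the abstract.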
Document type PhD thesis
Language English