ChatGPT is not a good indigenous translator
| Authors | |
|---|---|
| Publication date | 2023 |
| Host editors | |
| Book title | Third Workshop on Natural Language Processing for Indigenous Languages of the Americas |
| Book subtitle | Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP) : July 14, 2023 |
| ISBN (electronic) | |
| Event | 3rd Workshop on Natural Language Processing for Indigenous Languages of the Americas |
| Pages (from-to) | 163–167 |
| Number of pages | 5 |
| Publisher | Stroudsburg, PA: Association for Computational Linguistics |
| Organisations | |
| Abstract | This report investigates the continuous challenges of Machine Translation (MT) systems on indigenous and extremely low-resource language pairs. Despite the notable achievements of Large Language Models (LLMs) that excel in various tasks, their applicability to low-resource languages remains questionable. In this study, we leveraged the AmericasNLP competition to evaluate the translation performance of different systems for Spanish to 11 indigenous languages from South America. Our team, LTLAmsterdam, submitted a total of four systems including GPT-4, a bilingual model, fine-tuned M2M100, and a combination of fine-tuned M2M100 with kNN-MT. We found that even large language models like GPT-4 are not well-suited for extremely low-resource languages. Our results suggest that fine-tuning M2M100 models can offer significantly better performance for extremely low-resource translation. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.18653/v1/2023.americasnlp-1.17 |
| Downloads | 2023.americasnlp-1.17 (Final published version) |
| Permalink to this page | |