Multilinguality and multiculturalism

R.M.V. Choenni

Multilinguality and multiculturalism: Towards effective and inclusive neural language models

Authors	R.M.V. Choenni
Supervisors	R.A.M. van Rooij
Cosupervisors	E.V. Shutova, D. Garrette
Award date	22-01-2025
ISBN	9789464736526
Series	ILLC Dissertation series, DS-2024-14
Number of pages	230
Organisations	Faculty of Science (FNWI) Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	Language models require vast amounts of data during training. This limits their use to languages for which such data requirements can be met. To extend access to language technology to more linguistic communities, researchers have developed multilingual language models (MLMs) that are trained on data from multiple languages. The idea is that languages can support each other as they share common patterns, making the models useful across more languages. However, this approach brings new challenges from both a technical and a social perspective. When a model is trained on multiple languages, these languages start competing for limited model capacity, which can lead to negative interference and reduce effectiveness. In addition, to deploy MLMs in culturally diverse communities, their output needs to be sensitive to the sociocultural norms and biases of those communities. This requires MLMs to become inherently multicultural as well. In this thesis, we investigate how to build more effective MLMs that mitigate negative cross-language interference and study the effect that multilingual training has on the social biases and cultural values that they encode.
Document type	PhD thesis
Language	English
UvA-DARE

Digital Academic Repository