Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks
| Authors | |
|---|---|
| Publication date | 2024 |
| Host editors | |
| Book title | Findings of the Association for Computational Linguistics: NAACL 2024 |
| Book subtitle | Findings: June 16-21, 2024 |
| ISBN (electronic) | |
| Event | 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Findings |
| Pages (from-to) | 287-301 |
| Publisher | Kerrville, TX: Association for Computational Linguistics |
| Organisations | |
| Abstract | Recent work has proposed explicitly inducing language-wise modularity in multilingual LMs via sparse fine-tuning (SFT) on per-language subnetworks as a means of better guiding cross-lingual sharing. In this paper, we investigate (1) the degree to which language-wise modularity *naturally* arises within models with no special modularity interventions, and (2) how cross-lingual sharing and interference differ between such models and those with explicit SFT-guided subnetwork modularity. To do so, we use XLM-R as our multilingual LM, and we quantify language specialization and cross-lingual interaction with a training data attribution (TDA) method that estimates the degree to which a model's predictions are influenced by in-language or cross-language training examples. Our results show that language-specialized subnetworks do naturally arise, and that SFT, rather than always increasing modularity, can decrease the language specialization of subnetworks in favor of more cross-lingual sharing. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.18653/v1/2024.findings-naacl.21 |
| Downloads | 2024.findings-naacl.21 (Final published version) |
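
The abstract refers to sparse fine-tuning (SFT) on per-language subnetworks. Below is a minimal PyTorch sketch of the core mechanism: a fixed binary mask restricts gradient updates to a language-specific subnetwork. The mask-selection heuristic (top-magnitude weights, Lottery-Ticket style), the `keep_ratio` parameter, and the toy linear model are illustrative assumptions, not the authors' exact setup, which would operate on XLM-R.

```python
import torch
import torch.nn as nn

def make_language_mask(model: nn.Module, keep_ratio: float = 0.1) -> dict:
    """Illustrative subnetwork selection: keep the top `keep_ratio` fraction of
    weights by magnitude (a common Lottery-Ticket-style heuristic; the paper's
    actual selection procedure may differ)."""
    masks = {}
    for name, p in model.named_parameters():
        k = max(1, int(keep_ratio * p.numel()))
        # k-th largest magnitude = (numel - k + 1)-th smallest
        threshold = p.detach().abs().flatten().kthvalue(p.numel() - k + 1).values
        masks[name] = (p.detach().abs() >= threshold).float()
    return masks

def sparse_fine_tune_step(model: nn.Module, masks: dict, loss: torch.Tensor,
                          lr: float = 1e-4) -> None:
    """One SFT update: compute gradients, zero them outside the
    language-specific subnetwork, then apply a plain SGD step."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(masks[name])   # restrict the update to the subnetwork
                p.add_(p.grad, alpha=-lr)  # SGD step on the masked gradient

# Toy usage with a stand-in classifier (the paper's setting uses XLM-R):
model = nn.Linear(8, 2)
masks = make_language_mask(model, keep_ratio=0.25)
x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(model(x), y)
sparse_fine_tune_step(model, masks, loss)
```

With one such mask per language, each language's fine-tuning updates touch only its own subnetwork; overlap between masks is where cross-lingual sharing (or interference) can occur.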
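The abstract also relies on a training data attribution (TDA) method to score how much individual training examples influence a prediction. This record does not name the method; TracIn (Pruthi et al., 2020), which scores influence as a learning-rate-weighted sum of gradient dot products across saved checkpoints, is one common choice and is sketched below. All names and the toy checkpoints are hypothetical.

```python
import torch
import torch.nn as nn

def _flat_grad(model: nn.Module, loss: torch.Tensor) -> torch.Tensor:
    """Flatten dLoss/dtheta over all parameters into a single vector."""
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def tracin_influence(checkpoints, lrs, loss_fn, z_train, z_test) -> float:
    """TracIn-style influence of z_train on z_test:
        sum_t lr_t * <grad L(z_train; theta_t), grad L(z_test; theta_t)>
    summed over saved checkpoints theta_t. A positive score means the
    training example pushed the model toward the test prediction."""
    score = 0.0
    for model, lr in zip(checkpoints, lrs):
        g_train = _flat_grad(model, loss_fn(model, z_train))
        g_test = _flat_grad(model, loss_fn(model, z_test))
        score += lr * torch.dot(g_train, g_test).item()
    return score

# Toy usage: two checkpoints of a linear classifier, cross-entropy loss.
def loss_fn(model, z):
    x, y = z
    return nn.functional.cross_entropy(model(x), y)

ckpts = [nn.Linear(8, 2), nn.Linear(8, 2)]  # stand-ins for saved checkpoints
z_tr = (torch.randn(1, 8), torch.tensor([0]))
z_te = (torch.randn(1, 8), torch.tensor([1]))
print(tracin_influence(ckpts, lrs=[1e-3, 1e-3],
                       loss_fn=loss_fn, z_train=z_tr, z_test=z_te))
```

In the paper's setting, such scores would presumably be aggregated over many training examples and grouped by whether each example shares the test example's language, yielding the in-language versus cross-language influence comparison the abstract describes.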