Evaluating Large Language Models on Lithuanian Grammatical Cases

Open Access
Authors
Publication date 2026
Host editors
  • Hansi Hettiarachchi
  • Tharindu Ranasinghe
  • Alistair Plum
  • Paul Rayson
  • Ruslan Mitkov
  • Mohamed Gaber
  • Damith Premasiri
  • Fiona Anting Tan
  • Lasitha Uyangodage
Book title The Second Workshop on Language Models for Low-Resource Languages : proceedings of the workshop
Book subtitle LoResLM 2026 : March 29, 2026
ISBN (electronic)
  • 9798891763777
Event 2nd Workshop on Language Models for Low-Resource Languages
Pages (from-to) 371-377
Number of pages 7
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
We present a systematic evaluation of large language models (LLMs) on Lithuanian grammatical case marking, a task that has received little prior attention. Lithuanian is a relatively low-resource language, with rich morphology and explicit marking. To enable fine-grained syntactic and morphological assessment, we introduce a novel dataset of 305 minimal sentence pairs contrasting correct and incorrect case usage. Our results show that case marking is challenging for current models, with overall accuracy ranging from 0.662 to 0.852. A monolingual Lithuanian LLM consistently outperforms multilingual counterparts, highlighting the value of language-specific training over model size. Performance varies across cases: genitive and locative forms are generally better handled, while rarer constructions and subtle functional distinctions remain difficult. The dataset and analysis provide a resource for future work, supporting the development of more robust LLMs and targeted evaluation benchmarks for morphologically rich, low-resource languages.
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2026.loreslm-1.32
Downloads
2026.loreslm-1.32 (Final published version)