Overview and Joint Report of the Robustness and Consistency Task of the ELOQUENT 2025 Lab for Evaluating Generative Language Model Quality

Notebook for the ELOQUENT Lab at CLEF 2025

Authors	Jussi Karlgren, Marie Isabel Engels, Maria Barrett, Rohit Raj Gunti, Mohanna Hoveyda, Bruno Nadalic Sotic, Jaap Kamps, Mika Koistinen, Elaine Zosa
Publication date	2025
Host editors	G. Faggioli, N. Ferro, P. Rosso, D. Spina
Book title	Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2025)
Book subtitle	Madrid, Spain, 9-12 September 2025
Series	CEUR Workshop Proceedings
Event	26th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF 2025
Pages (from-to)	1306-1319
Number of pages	14
Publisher	Aachen: CEUR-WS
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI); Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	Generative language models are intended to be creative and responsive to the style of the conversation they engage in. The experimental Robustness and Consistency task is designed to explore how variation between content-wise equivalent inputs influences the output of a generative language model, and in this year’s edition the task focuses on how linguistic variation makes a difference for value-oriented questions. This paper is a joint report by all participants in the task.
Document type	Conference contribution
Language	English
Published at	https://ceur-ws.org/Vol-4038/paper_104.pdf (Final published version)
Other links	https://ceur-ws.org/Vol-4038/ https://www.scopus.com/pages/publications/105019040432