UBOnlp Report at the SimpleText lab of CLEF 2025

Authors	Benjamin Vendeville, Liana Ermakova, Pierre De Loor, Jaap Kamps
Publication date	2025
Host editors	G. Faggioli, N. Ferro, P. Rosso, D. Spina
Book title	Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2025)
Book subtitle	Madrid, Spain, 9-12 September 2025
Series	CEUR Workshop Proceedings
Event	26th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF 2025
Pages (from-to)	4363-4375
Number of pages	13
Publisher	Aachen: CEUR-WS
Organisations	Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	This paper presents the UBOnlp team’s participation in the SimpleText lab at CLEF 2025, focusing on scientific text simplification and controlled creativity tasks. We evaluate the performance of GPT-4o using simple prompt-based approaches across multiple subtasks without specialized training or fine-tuning. For Task 1 (Text Simplification), we applied GPT-4o to both sentence-level and document-level simplification of scientific abstracts from the Cochrane-Auto corpus. Our system achieved competitive SARI scores (42.20 for sentence-level, 43.37 for document-level) while maintaining low complexity metrics, demonstrating effective simplification through content reduction rather than lexical substitution. For Task 2 (Controlled Creativity), we addressed spurious generation detection and error classification in simplified texts. Our approach showed strong performance in fluency error detection (F1 = 0.322, ranking first) and alignment error detection (F1 = 0.381, ranking third), but struggled with general spurious content detection, particularly in post-hoc scenarios without source documents. These results highlight both the potential and limitations of large language models for specialized text simplification tasks. While GPT-4o demonstrates capabilities in linguistic quality assessment, task-specific architectures remain superior for comprehensive error detection and generation control. Our findings contribute to understanding the practical applicability of general-purpose language models in scientific text processing workflows.
Document type	Conference contribution
Language	English
Published at	https://ceur-ws.org/Vol-4038/paper_360.pdf
Other links	https://ceur-ws.org/Vol-4038/ https://www.scopus.com/pages/publications/105019055759
Downloads	paper_360 (Final published version)