Overview of the CLEF 2025 SimpleText Task 2: Identify and Avoid Hallucination
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2025) |
| Book subtitle | Madrid, Spain, 9-12 September 2025 |
| Series | CEUR Workshop Proceedings |
| Event | 26th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF 2025 |
| Pages (from-to) | 4186-4204 |
| Number of pages | 19 |
| Publisher | Aachen: CEUR-WS |
| Organisations | |
| Abstract | This paper presents an overview of the CLEF 2025 SimpleText Task 2 on Controlled Creativity, which aims to identify and avoid hallucination. We discuss the data and benchmarks provided for these tasks, along with preliminary insights and anticipated challenges. Our main findings are the following. First, we used aligned sources, predictions, and references in text simplification to detect and quantify hallucinations (spurious content introduced by generative models), highlighting a critical limitation of current evaluation metrics. Second, we found that overgeneration and information distortion in model outputs can be detected with high accuracy, even without access to the original source text, suggesting that automatic detection is a promising strategy. Third, while automatic methods show promise, the detailed classification of distortions remains difficult to replicate without human expertise, underscoring the continued importance of expert human evaluation and the research challenge of building classification models that match it. More generally, we hope and expect that the constructed corpora and evaluation data will be used by researchers to further advance information distortion detection and classification approaches, both in general and specifically for scientific text simplification models. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://ceur-ws.org/Vol-4038/paper_345.pdf |
| Other links | https://ceur-ws.org/Vol-4038/ https://www.scopus.com/pages/publications/105019058790 |
| Downloads | paper_345 (Final published version) |
