Do Images Clarify? A Study on the Effect of Images on Clarifying Questions in Conversational Search
| Authors | |
|---|---|
| Publication date | 2025 |
| Book title | CHIIR'25 |
| Book subtitle | Proceedings of the 2025 Conference on Human Information Interaction and Retrieval : March 24-28, 2025, Naarm/Melbourne, Australia |
| ISBN (electronic) | |
| Event | 2025 ACM SIGIR Conference on Human Information Interaction and Retrieval, CHIIR 2025 |
| Pages (from-to) | 273-291 |
| Number of pages | 19 |
| Publisher | New York, New York: Association for Computing Machinery |
| Organisations | |
| Abstract | Conversational search (CS) systems increasingly employ clarifying questions to refine user queries and improve the search experience. Previous studies have demonstrated the usefulness of text-based clarifying questions in enhancing both retrieval performance and user experience. While images have been shown to improve retrieval performance in various contexts, their impact on user performance, when incorporated into clarifying questions, remains largely unexplored. We conduct a user study with 73 participants to investigate the role of images in CS, specifically examining their effects on two search-related tasks: (i) answering clarifying questions, and (ii) query reformulation. We compare the effect of multimodal and text-only clarifying questions in both tasks within a CS context from various perspectives. Our findings reveal that while participants showed a strong preference for multimodal questions when answering clarifying questions, preferences were more balanced in the query reformulation task. The impact of images varied with both task type and user expertise: in answering clarifying questions, images helped maintain engagement across different expertise levels, while in query reformulation, they led to more precise queries and improved retrieval performance. Interestingly, for clarifying question answers, text-only setups demonstrated better user performance as they provided more comprehensive textual information in the absence of images. These results provide valuable insights for designing effective multimodal CS systems, highlighting that the benefits of visual augmentation are task-dependent and should be strategically implemented based on the specific search context and user characteristics. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/3698204.3716464 |
| Other links | https://www.scopus.com/pages/publications/105005276315 |
| Downloads | 3698204.3716464 (Final published version) |
| Permalink to this page | |
