Naming, Describing, and Quantifying Visual Objects in Humans and LLMs

Open Access
Authors
Publication date 2024
Host editors
  • L.-W. Ku
  • A. Martins
  • V. Srikumar
Book title The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) : proceedings of the conference
Book subtitle ACL 2024 : August 11-16, 2024
ISBN (electronic)
  • 9798891760950
Event 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Volume | Issue number 2
Pages (from-to) 547-557
Number of pages 11
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

While human speakers use a variety of different expressions when describing the same object in an image, giving rise to a distribution of plausible labels driven by pragmatic constraints, the extent to which current Vision & Language Large Language Models (VLLMs) can mimic this crucial feature of language use is an open question. This applies to common, everyday objects, but it is particularly interesting for uncommon or novel objects for which a category label may be lacking or fuzzy. Furthermore, similar patterns of variation are observed among human speakers for highly context-sensitive expressions, such as the quantifiers ‘few’ or ‘most’. In our work, we evaluate VLLMs (FROMAGe, BLIP-2, LLaVA) on three categories (nouns, attributes, and quantifiers) where humans show great subjective variability concerning the distribution over plausible labels, using datasets and resources mostly under-explored in previous work. Our results reveal mixed evidence on the ability of VLLMs to capture human naming preferences at generation time: while some models are good at mimicking human distributions for nouns and attributes, all of them fail to assign quantifiers, a task that requires more accurate, high-level reasoning.

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2024.acl-short.50
Other links https://www.scopus.com/pages/publications/85203825376
Downloads
2024.acl-short.50 (Final published version)