Probing Cross-Modal Representations in Multi-Step Relational Reasoning

DOI: https://doi.org/10.18653/v1/2021.repl4nlp-1.16

Authors	I. Parfenova, D. Elliott, R. Fernández, S. Pezzelle
Publication date	2021
Host editors	A. Rogers, I. Calixto, I. Vulić, N. Saphra, N. Kassner, O.-M. Camburu, T. Bansal, V. Shwartz
Book title	The 6th Workshop on Representation Learning for NLP
Book subtitle	RepL4NLP 2021 : proceedings of the workshop : August 6, 2021, Bangkok, Thailand (online)
ISBN (electronic)	9781954085725
Event	Representation Learning for NLP 2021 Workshop
Pages (from-to)	152–162
Number of pages	11
Publisher	Stroudsburg, PA: The Association for Computational Linguistics
Organisations	Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	We investigate the representations learned by vision and language models in tasks that require relational reasoning. Focusing on the problem of assessing the relative size of objects in abstract visual contexts, we analyse both one-step and two-step reasoning. For the latter, we construct a new dataset of three-image scenes and define a task that requires reasoning at the level of the individual images and across images in a scene. We probe the learned model representations using diagnostic classifiers. Our experiments show that pretrained multimodal transformer-based architectures can perform higher-level relational reasoning, and are able to learn representations for novel tasks and data that are very different from what was seen in pretraining.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.18653/v1/2021.repl4nlp-1.16
Downloads	2021.repl4nlp-1.16 (Final published version)