CEQE to SQET: A study of contextualized embeddings for query expansion

Open Access
Publication date June 2022
Journal Information Retrieval Journal
Event EUROPEAN CONFERENCE ON INFORMATION RETRIEVAL (ECIR) 2021
Volume 25, Issue 2
Pages (from-to) 184–208
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this work, we study recent advances in context-sensitive language models for the task of query expansion. We examine existing and new approaches for lexical, word-based expansion in both unsupervised and supervised settings. For unsupervised models, we study the behavior of the Contextualized Embeddings for Query Expansion (CEQE) model. We introduce a new model, Supervised Contextualized Query Expansion with Transformers (SQET), that performs expansion as a supervised classification task and leverages context in pseudo-relevant results. We study the behavior of these expansion approaches for ad-hoc document and passage retrieval, conducting experiments that combine expansion with both probabilistic retrieval models and neural document ranking models. We evaluate expansion effectiveness on three standard TREC collections: Robust, Complex Answer Retrieval, and Deep Learning. We analyze extrinsic retrieval effectiveness and the intrinsic ability to rank expansion terms, and perform a qualitative analysis of the differences between the methods. We find that CEQE statistically significantly outperforms static embedding-based methods across all three datasets in Recall@1000. Moreover, CEQE outperforms static embedding-based expansion methods on multiple collections (by up to 18% on Robust and 31% on Deep Learning in average precision) and also improves over proven probabilistic pseudo-relevance feedback (PRF) models. SQET outperforms CEQE by 6% in P@20 on the intrinsic term ranking evaluation and is approximately as effective in retrieval performance. Models incorporating both neural and CEQE-based expansion scores achieve gains of up to 5% in P@20 and 2% in AP on Robust over the state-of-the-art transformer-based re-ranking model, Birch.
Document type Article
Note In Special Issue on ECIR 2021.
Language English
Published at https://doi.org/10.1007/s10791-022-09405-y