Dense Retrieval with Entity Views

Open Access
Authors
Publication date 2022
Book title CIKM '22
Book subtitle proceedings of the 31st ACM International Conference on Information & Knowledge Management : October 17-21, 2022, Atlanta, GA, USA
ISBN (electronic)
  • 9781450392365
Event 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Pages (from-to) 1955–1964
Publisher New York, NY: The Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Pre-trained language models like BERT have been demonstrated to be both effective and efficient ranking methods when combined with approximate nearest neighbor search, which can quickly match dense representations of queries and documents. However, pre-trained language models alone do not fully capture information about uncommon entities. In this work, we investigate methods for enriching dense query and document representations with entity information from an external source. Our proposed method identifies groups of entities in a text and encodes them into a dense vector representation, which is then used to enrich BERT's vector representation of the text. To handle documents that contain many loosely related entities, we devise a strategy for creating multiple entity representations that reflect different views of a document. For example, a document about a scientist may cover aspects of her personal life and recent work, which correspond to different views of the entity. In an evaluation on MS MARCO benchmarks, we find that enriching query and document representations in this way yields substantial increases in effectiveness.
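The idea of scoring a query against several entity views of one document can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how entities are grouped, how the entity vector is combined with BERT's vector, or how views are scored, so the mean pooling, the concatenation, and the max-over-views scoring below are all assumptions made for illustration. The names enrich and score, the entity table, and its toy two-dimensional embeddings are hypothetical, and short lists of floats stand in for BERT's text vectors.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, the usual dense-retrieval similarity."""
    return sum(x * y for x, y in zip(a, b))


def enrich(text_vec: Vector, entity_group: Sequence[str],
           entity_table: Dict[str, Vector]) -> Vector:
    """Append a pooled entity vector to a text vector.

    Assumption: mean pooling plus concatenation. Entities missing from
    the table are skipped; an empty group contributes a zero vector.
    """
    known = [entity_table[e] for e in entity_group if e in entity_table]
    dim = len(next(iter(entity_table.values())))
    entity_vec = mean_vector(known) if known else [0.0] * dim
    return list(text_vec) + entity_vec


def score(query_vec: Vector, doc_views: Sequence[Vector]) -> float:
    """Assumption: a document scores as its best-matching view."""
    return max(dot(query_vec, view) for view in doc_views)


# Toy entity embeddings (made-up values standing in for an external source).
entity_table = {
    "Marie Curie": [1.0, 0.0],
    "radium": [0.8, 0.2],
    "Pierre Curie": [0.0, 1.0],
    "Warsaw": [0.2, 0.8],
}

doc_text_vec = [0.5, 0.5]  # stand-in for BERT's vector of the document
doc_views = [
    enrich(doc_text_vec, ["Marie Curie", "radium"], entity_table),   # work view
    enrich(doc_text_vec, ["Pierre Curie", "Warsaw"], entity_table),  # personal view
]

query_vec = enrich([0.5, 0.5], ["radium"], entity_table)
best = score(query_vec, doc_views)
```

In this toy setup the query about radium matches the work-related view more strongly than the personal-life view, so the document is scored by the former. Each view is an ordinary dense vector, so a document with several views could be stored as several entries in an approximate nearest neighbor index.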
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3511808.3557285
Downloads
3511808.3557285 (Final published version)