Impact of Tokenization, Pretraining Task, and Transformer Depth on Text Ranking

Open Access
Authors
Publication date 2021
Host editors
  • E.M. Voorhees
  • A. Ellis
Book title The Twenty-Ninth Text REtrieval Conference (TREC 2020) Proceedings
Series NIST Special Publication, SP 1266
Event 29th Text REtrieval Conference, TREC 2020
Number of pages 8
Publisher Gaithersburg, MD: National Institute of Standards and Technology
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR)
Abstract
This paper documents the University of Amsterdam's participation in the TREC 2020 Deep Learning Track. Rather than aiming to engineer the best-scoring system, our work is motivated by our interest in analysis, informing our understanding of the opportunities and challenges of transformers for text ranking. Specifically, we focus on the passage retrieval task, where we try to answer three sets of questions.
First, transformers use different tokenization than traditional IR approaches such as stemming and lemmatizing, leading to different document representations. What is the effect of modern preprocessing techniques on traditional retrieval algorithms? Our main observation is that the limited vocabulary of the BERT tokenizer affects many long-tail tokens, which leads to large gains in efficiency at the cost of a small decrease in effectiveness.
Second, the effectiveness of transformers results from a self-supervised pre-training task that promotes general language understanding, ignorant of the specific demands of ranking tasks. Can we further correlate queries and relevant passages in the pre-training task? Our main observation is that there is a whole continuum between the original self-supervised training task of BERT and the final interaction ranker, and that isolating ranking-aware pre-training tasks may lead to gains in efficiency (as these pretrained models can be reused for many tasks) as well as gains in effectiveness (in particular when limited data on the target task is available).
Third, transformers combine large sequence lengths with many layers, and it is unclear what this deep semantic modeling adds in the context of ranking. How complex do the models need to be in order to perform well on this task? Our main observation is that the deep layers of BERT lead to some, but relatively modest, gains in performance, and that the exact role of the presumed superior language understanding for search is far from clear.
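The following is a minimal illustrative sketch (not from the paper) of the tokenization contrast discussed in the first question: classical stemming-based preprocessing versus BERT's WordPiece tokenizer, whose fixed vocabulary breaks long-tail terms into subword pieces. The example passage, and the use of the Hugging Face transformers and NLTK libraries, are assumptions for illustration only.

```python
# Sketch (not from the paper): contrast BERT WordPiece tokenization
# with classical stemming-based preprocessing for retrieval.
# Assumes the `transformers` and `nltk` packages are installed.
from transformers import BertTokenizer
from nltk.stem import PorterStemmer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
stemmer = PorterStemmer()

# Hypothetical example passage, chosen to contain a long-tail term.
passage = "Telescoping retrieval pipelines rerank candidate passages efficiently."

# Classical IR-style preprocessing: lowercase, split on whitespace, stem.
stemmed = [stemmer.stem(w.strip(".,").lower()) for w in passage.split()]

# BERT preprocessing: WordPiece tokenization over a fixed ~30k vocabulary;
# rare ("long-tail") terms are split into '##'-prefixed subword pieces.
wordpieces = tokenizer.tokenize(passage)

print("Stemmed terms:   ", stemmed)
print("WordPiece tokens:", wordpieces)
```

Running this shows how a term outside BERT's vocabulary is represented by several subword pieces rather than a single stemmed token, which is the representational difference the paper's first set of questions examines.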
Document type Conference contribution
Language English
Published at https://trec.nist.gov/pubs/trec29/papers/UAmsterdam.DL.pdf
Other links https://trec.nist.gov/pubs/trec29/trec2020.html
Downloads
UAmsterdam.DL (Final published version)