Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model

C. Li; M. de Rijke

doi:https://doi.org/10.24963/ijcai.2019/396

Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model

Authors	C. Li M. de Rijke
Publication date	2019
Host editors	S. Kraus
Book title	Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Book subtitle	IJCAI-19 : Macao, 10-16 August 2019
ISBN (electronic)	9780999241141
Event	International Joint Conference on Artificial Intelligence (IJCAI) 2019
Pages (from-to)	2859-2865
Number of pages	7
Publisher	International Joint Conferences on Artificial Intelligence
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Non-stationarity appears in many online applications such as web search and advertising. In this paper, we study the online learning to rank problem in a non-stationary environment where user preferences change abruptly at an unknown moment in time. We consider the problem of identifying the K most attractive items and propose cascading non-stationary bandits, an online learning variant of the cascading model, where a user browses a ranked list from top to bottom and clicks on the first attractive item. We propose two algorithms for solving this non-stationary problem: CascadeDUCB and CascadeSWUCB. We analyze their performance and derive gap-dependent upper bounds on the n-step regret of these algorithms. We also establish a lower bound on the regret for cascading non-stationary bandits and show that both algorithms match the lower bound up to a logarithmic factor. Finally, we evaluate their performance on a real-world web search click dataset.
Document type	Conference contribution
Language	English
Related publication	Cascading Non-stationary Bandits: Online Learning to Rank in the Non-stationary Cascade Model
Published at	https://doi.org/10.24963/ijcai.2019/396 (Final published version)
Downloads	1905.12370 (Accepted author manuscript)
Permalink to this page

Back

UvA-DARE

Digital Academic Repository

Cascading non-stationary bandits: Online learning to rank in the non-stationary cascade model