ParClick: A Scalable Algorithm for EM-based Click Models

Open Access
Authors
Publication date 2022
Book title WWW'22
Book subtitle proceedings of the ACM Web Conference 2022 : April 25-29, 2022, Virtual Event, Lyon, France
ISBN (electronic)
  • 9781450390965
Event ACM Web Conference
Pages (from-to) 392-400
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Research on click models usually focuses on developing effective approaches to reduce biases in user clicks. However, one of the major drawbacks of existing click models is the lack of scalability. In this work, we tackle the scalability of Expectation-Maximization (EM)-based click models by introducing ParClick, a new parallel algorithm designed by following the Partitioning-Communication-Aggregation-Mapping (PCAM) method. To this end, we first provide a generic formulation of EM-based click models. Then, we design an efficient parallel version of this generic click model following the PCAM approach: we partition user click logs and model parameters into separate tasks, analyze communication among them, and aggregate these tasks to reduce communication overhead. Finally, we provide a scalable, parallel implementation of the proposed design, which maps well on a multi-core machine. Our experiments on the Yandex relevance prediction dataset show that ParClick scales well when increasing the amount of training data and computational resources. In particular, ParClick is 24.7 times faster to train with 40 million search sessions and 40 threads compared to the standard sequential version of the Click Chain Model (CCM) without any degradation in effectiveness.
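The partition / aggregate pattern described in the abstract can be illustrated with a minimal sketch. This is not the ParClick implementation and not the Click Chain Model: for brevity it assumes the simpler position-based model (click probability = attractiveness of a query-document pair times examination probability of a rank), a hypothetical session format of (query, documents by rank, clicks by rank), and a Python thread pool standing in for the multi-core mapping. The names e_step, train, and PRIOR are illustrative only. Each worker runs the E-step on its own partition of the click log and returns local sufficient statistics; the main thread aggregates them once per iteration and performs the M-step.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical session format: (query, docs_by_rank, clicks_by_rank).
PRIOR = 0.5  # assumed initial value for every parameter


def e_step(sessions, alpha, gamma):
    """E-step on one partition: accumulate posterior sums and counts."""
    stats = defaultdict(lambda: [0.0, 0])  # key -> [posterior sum, count]
    for query, docs, clicks in sessions:
        for rank, (doc, click) in enumerate(zip(docs, clicks)):
            a = alpha.get((query, doc), PRIOR)  # attractiveness
            g = gamma.get(rank, PRIOR)          # examination
            if click:
                # A click implies the result was both examined and attractive.
                post_a = post_g = 1.0
            else:
                denom = max(1.0 - a * g, 1e-12)  # guard against division by zero
                post_a = a * (1.0 - g) / denom
                post_g = g * (1.0 - a) / denom
            for key, post in ((("a", query, doc), post_a), (("g", rank), post_g)):
                stats[key][0] += post
                stats[key][1] += 1
    return stats


def train(sessions, n_workers=4, n_iters=20):
    """EM with the click log partitioned across n_workers tasks."""
    # Partitioning: each worker gets a disjoint slice of the sessions.
    parts = [sessions[i::n_workers] for i in range(n_workers)]
    alpha, gamma = {}, {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iters):
            # Workers only read the shared parameters during the E-step.
            local = list(pool.map(lambda p: e_step(p, alpha, gamma), parts))
            # Aggregation: one reduction of local statistics per iteration.
            total = defaultdict(lambda: [0.0, 0])
            for stats in local:
                for key, (s, n) in stats.items():
                    total[key][0] += s
                    total[key][1] += n
            # M-step: new parameters are averages of the posteriors.
            alpha = {k[1:]: s / n for k, (s, n) in total.items() if k[0] == "a"}
            gamma = {k[1]: s / n for k, (s, n) in total.items() if k[0] == "g"}
    return alpha, gamma
```

Because workers exchange only summed statistics, the result should match a single-worker run up to floating-point rounding. Under CPython's global interpreter lock this thread pool shows the structure rather than a real speedup; an implementation aiming at the scaling reported in the abstract would need native threads or processes.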
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3485447.3511967
Other links https://github.com/markovi/PyClick
Downloads
3485447.3511967 (Final published version)