MassiveClicks: A Massively-Parallel Framework for Efficient Click Models Training

Open Access
Authors
Publication date 2024
Host editors
  • D. Zeinalipour
  • D. Blanco Heras
  • G. Pallis
  • H. Herodotou
  • D. Trihinas
  • D. Balouek
  • P. Diehl
  • T. Cojean
  • K. Fürlinger
  • M.H. Kirkeby
  • M. Nardelli
  • P. Di Sanzo
Book title Euro-Par 2023: Parallel Processing Workshops
Book subtitle Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28 - September 1, 2023: Revised Selected Papers
ISBN
  • 9783031506833
ISBN (electronic)
  • 9783031506840
Series Lecture Notes in Computer Science
Event Euro-Par 2023: Parallel Processing Workshops
Volume | Issue number I
Pages (from-to) 232–245
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

Click logs record user interactions with information retrieval systems (e.g., search engines). Clicks therefore serve as implicit feedback for such systems, and are further used to train click models, which in turn improve the quality of search and recommendation results. Click models based on expectation maximization (EM) are known to be effective and robust against various biases.

Training EM-based models is challenging due to the size of click logs, and can take many hours when using sequential tools like PyClick. Alternatives, such as ParClick, employ parallelism and show significant speed-ups. However, ParClick only works on single-node multi-core systems. To further scale up and out, in this work we introduce MassiveClicks, the first massively parallel, distributed, multi-GPU framework for training EM-based click models. MassiveClicks relies on efficient GPU kernels, balanced data-partitioning policies, and distributed computing to improve the performance of EM-based model training, outperforming ParClick by orders of magnitude when using GPUs and/or multiple nodes. Additionally, the framework supports heterogeneous GPU architectures and variable numbers of GPUs per node, and allows for multi-node multi-core CPU-based training when no GPUs are available.
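For context, the per-session EM updates that tools like PyClick, ParClick, and MassiveClicks parallelize can be sketched for the simplest EM-based click model, the position-based model (PBM). The sketch below is a minimal, simplified illustration of one EM iteration, not MassiveClicks' actual implementation; the session tuple layout and function name are assumptions for the example.

```python
from collections import defaultdict

def pbm_em_iteration(sessions, alpha, gamma):
    """One EM iteration for the position-based model (PBM).

    sessions: list of (query, [doc ids by rank], [0/1 clicks by rank])
    alpha:    dict (query, doc) -> attractiveness estimate
    gamma:    list of examination probabilities, one per rank
    Returns updated (alpha, gamma). Layout is illustrative only.
    """
    attr_sum, attr_cnt = defaultdict(float), defaultdict(int)
    exam_sum = [0.0] * len(gamma)
    exam_cnt = [0] * len(gamma)

    # E-step: posterior over latent attractiveness/examination per impression
    for query, docs, clicks in sessions:
        for r, (d, c) in enumerate(zip(docs, clicks)):
            a, g = alpha[(query, d)], gamma[r]
            if c:
                # A click implies the document was examined and attractive
                p_attr, p_exam = 1.0, 1.0
            else:
                # Posteriors given no click: P(A=1|C=0), P(E=1|C=0)
                denom = 1.0 - a * g
                p_attr = a * (1.0 - g) / denom
                p_exam = g * (1.0 - a) / denom
            attr_sum[(query, d)] += p_attr
            attr_cnt[(query, d)] += 1
            exam_sum[r] += p_exam
            exam_cnt[r] += 1

    # M-step: re-estimate parameters as means of the posteriors
    new_alpha = {k: attr_sum[k] / attr_cnt[k] for k in attr_cnt}
    new_gamma = [exam_sum[r] / exam_cnt[r] if exam_cnt[r] else gamma[r]
                 for r in range(len(gamma))]
    return new_alpha, new_gamma
```

The E-step is independent across sessions, which is what makes GPU kernels and distributed data partitioning effective here: each node or GPU accumulates the posterior sums for its partition, and only the compact sum/count statistics need to be combined before the M-step.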

Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-031-50684-0_18
Downloads
978-3-031-50684-0_18 (Final published version)