Are We Really Achieving Better Beyond-Accuracy Performance in Next Basket Recommendation?

M. Li; Y. Liu; S. Jullien; M. Ariannezhad; A. Yates; M. Aliannejadi; M. de Rijke

doi:https://doi.org/10.1145/3626772.3657835


Authors	M. Li; Y. Liu; S. Jullien; M. Ariannezhad; A. Yates; M. Aliannejadi; M. de Rijke
Publication date	2024
Book title	SIGIR '24
Book subtitle	Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval : July 14-18, 2024, Washington, DC, USA
ISBN (electronic)	9798400704314
Event	47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
Pages (from-to)	924-934
Number of pages	11
Publisher	New York, NY: Association for Computing Machinery
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Next basket recommendation (NBR) is a special type of sequential recommendation that is increasingly receiving attention. So far, most NBR studies have focused on optimizing the accuracy of the recommendation, whereas optimizing for beyond-accuracy metrics, e.g., item fairness and diversity, remains largely unexplored. Recent studies into NBR have found a substantial performance difference between recommending repeat items and explore items. Repeat items contribute most of the users' perceived accuracy compared with explore items. Informed by these findings, we identify a potential "short-cut" to optimize for beyond-accuracy metrics while maintaining high accuracy. To leverage and verify the existence of such short-cuts, we propose a plug-and-play two-step repetition-exploration (TREx) framework that treats repeat items and explore items separately, where we design a simple yet highly effective repetition module to ensure high accuracy, while two exploration modules target optimizing only beyond-accuracy metrics. Experiments are performed on two widely used datasets w.r.t. a range of beyond-accuracy metrics, viz. five fairness metrics and three diversity metrics. Our experimental results show that: (i) we can achieve state-of-the-art performance w.r.t. accuracy via the designed repetition module in TREx; and (ii) the simple TREx framework achieves "better" beyond-accuracy performance than existing sophisticated methods. Prima facie, this appears to be good news: we can achieve high accuracy and improved beyond-accuracy metrics at the same time. However, we argue that the real-world value of our algorithmic solution, TREx, is likely to be limited and reflect on the reasonableness of the evaluation setup. We end up challenging existing evaluation paradigms, particularly in the context of beyond-accuracy metrics, and provide insights for researchers to navigate potential pitfalls and determine reasonable metrics to consider when optimizing for accuracy and beyond-accuracy metrics.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1145/3626772.3657835
Downloads	3626772.3657835 (Final published version)
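
The two-step idea described in the abstract (fill part of the basket with repeat items, then fill the rest with explore items chosen for a beyond-accuracy goal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the repetition or exploration modules work, so the frequency-based repeat ranking, the least-exposure explore rule, and every name and parameter here (`two_step_basket`, `exposure`, `repeat_slots`) are assumptions made purely for illustration.

```python
from collections import Counter


def two_step_basket(history, catalog, exposure, basket_size=10, repeat_slots=7):
    """Build one recommended basket in two steps (illustrative only).

    history      -- list of past baskets (lists of item ids) for one user
    catalog      -- iterable of all item ids
    exposure     -- dict: item id -> how often the item was already recommended
    repeat_slots -- hypothetical cap on slots given to repeat items
    """
    # Step 1 (repetition, assumed rule): rank the user's previously bought
    # items by purchase frequency, breaking ties by item id.
    freq = Counter(item for basket in history for item in basket)
    ranked_repeats = sorted(freq, key=lambda item: (-freq[item], item))
    basket = ranked_repeats[:repeat_slots]

    # Step 2 (exploration, assumed rule): fill the remaining slots with items
    # the user has never bought, preferring the least-exposed ones as a crude
    # stand-in for an item-fairness objective.
    explore_pool = [item for item in catalog if item not in freq]
    explore_pool.sort(key=lambda item: (exposure.get(item, 0), item))
    basket += explore_pool[: basket_size - len(basket)]
    return basket


example = two_step_basket(
    history=[[1, 2], [2, 3], [2, 1]],
    catalog=[1, 2, 3, 4, 5, 6, 7],
    exposure={4: 5, 5: 0, 6: 2, 7: 1},
    basket_size=5,
    repeat_slots=3,
)
print(example)
```

In this toy setup the explore slots are chosen without any relevance signal at all, which mirrors the "short-cut" the abstract warns about: beyond-accuracy scores can improve while the explore items contribute little to the user.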

UvA-DARE

Digital Academic Repository
