On the Reusability of Personalized Test Collections

Open Access
Authors
Publication date 2017
Book title UMAP'17
Book subtitle adjunct publication of the 25th Conference on User Modeling, Adaptation and Personalization : July 9-12, 2017, Bratislava, Slovakia
ISBN
  • 9781450346351
ISBN (electronic)
  • 9781450350679
Event UMAP '17: 25th Conference on User Modeling, Adaptation and Personalization
Pages (from-to) 185-189
Publisher New York, NY: The Association for Computing Machinery
Organisations
  • Faculty of Humanities (FGw)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
Test collections for offline evaluation remain crucial for information retrieval research and industrial practice, yet their reusability is threatened by factors such as the dynamic nature of data collections and new trends in building retrieval systems. Specifically, building reusable test collections that last for years is very challenging, because retrieval approaches change considerably from year to year as new trends emerge among Information Retrieval researchers. We experiment with a novel temporal reusability test that evaluates the reusability of a test collection over a year by leaving mutual topics in the experiment: we borrow some judged topics from previous years and include them in the new set of topics used in the current year. In effect, we test whether a new set of retrieval systems can be evaluated and comparatively ranked using an old test collection. Our experiments are based on two sets of runs from the Text REtrieval Conference (TREC) 2015 and 2016 Contextual Suggestion Track, a personalized venue recommendation task. The experiments show that the TREC 2015 test collection is not temporally reusable. The test collection should be used with extreme care with early precision metrics and with slightly less care with NDCG, bpref and MAP. Our approach offers a very precise way to test the temporal reusability of test collections over a year, and it is well suited to tracks whose setup is similar to that of their previous years.
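The question of whether a new set of systems is "comparatively ranked" the same way by an old test collection can be illustrated with a rank correlation between two system orderings. The sketch below is a minimal illustration, not the authors' procedure: the abstract does not state which agreement statistic the paper uses, so Kendall's tau is an assumption here, and the run names and scores are invented for the example.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} mappings.

    Only systems present in both mappings are compared. Tied pairs
    count as neither concordant nor discordant.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")

    concordant = 0
    discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both collections order the pair the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores (invented, not from the paper): the same four runs
# scored with judgments from an old collection and from a new one.
old_scores = {"runA": 0.40, "runB": 0.35, "runC": 0.30, "runD": 0.20}
new_scores = {"runA": 0.38, "runB": 0.25, "runC": 0.33, "runD": 0.22}

tau = kendall_tau(old_scores, new_scores)
print(f"Kendall's tau between old and new rankings: {tau:.3f}")
```

In this invented example only one of the six run pairs (runB and runC) swaps order between the two collections. A value near 1 would suggest that the old collection ranks the new systems much as the new judgments do; lower values would point toward the kind of non-reusability the abstract reports.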
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3099023.3099044
Downloads
p185-hashemi (Final published version)