The Healing Power of Poison: Helpful Non-relevant Documents in Feedback

Open Access
Authors
Publication date 2016
Book title CIKM'16
Book subtitle proceedings of the 2016 ACM Conference on Information and Knowledge Management : October 24-28, 2016, Indianapolis, IN, USA
ISBN
  • 9781450340731
Event 25th ACM International Conference on Information and Knowledge Management
Pages (from-to) 2065-2068
Number of pages 4
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI)
Abstract
The use of feedback information is an effective approach to address the vocabulary gap between a user's query and the relevant documents. It has been shown that some relevant documents act like "poison pills," i.e., they hurt the performance of feedback systems despite the fact that they are relevant. In this paper, we study the positive counterpart of this phenomenon by investigating the helpfulness of non-relevant documents in feedback. In general, we find that although documents explicitly judged as non-relevant are normally assumed to be poisonous for feedback systems, treating high-scored non-relevant documents as positive feedback sometimes improves retrieval performance. In our experimental data, we observe a considerable fraction of non-relevant documents in the higher-ranked positions of the initial retrieval run for most of the topics. Hence, by ignoring the potential value of non-relevant documents, we may lose a lot of useful information. We investigate the potential contribution of non-relevant documents using existing state-of-the-art feedback methods. Our main findings are the following. First, we find that some non-relevant documents are exclusively helpful, meaning that they improve retrieval on their own, while others are complementarily helpful, meaning that they lead to further improvement when added to a set of relevant documents. Second, we discover that, on average, exclusively helpful non-relevant documents contribute more to the performance improvement than the complementary ones. Third, we show that non-relevant documents in topics with poor average precision in the initial retrieval are more likely to help in feedback.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/2983323.2983910
Downloads
p2065-dehghani (Final published version)