Technology assisted reviews: Finding the last few relevant documents by asking yes/no questions to reviewers

Open Access
Authors
Publication date 2018
Book title SIGIR #41 proceedings
Book subtitle Ann Arbor, Michigan, USA, 08-12, July 2018
ISBN (electronic)
  • 9781450356572
Event 41st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018
Pages (from-to) 949-952
Number of pages 4
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

The goal of a technology-assisted review is to achieve high recall with low human effort. Continuous active learning algorithms have demonstrated good performance in locating the majority of relevant documents in a collection; however, their performance reaches a plateau once 80%-90% of the relevant documents have been found. Finding the last few relevant documents typically requires exhaustively reviewing the collection. In this paper, we propose a novel method to identify these last few, but significant, documents efficiently. Our method rests on the hypothesis that entities carry vital information in documents, and that reviewers can answer questions about the presence or absence of an entity in the missing relevant documents. Based on this, we devise a sequential Bayesian search method that selects the optimal sequence of questions to ask. The experimental results show that our proposed method can greatly improve performance while requiring less reviewing effort.
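
The abstract does not spell out how questions are selected or how answers update the search, so the following is only a minimal sketch of one common form of sequential Bayesian search (generalized binary search), not the authors' implementation. It assumes a single missing relevant document, a uniform prior over the unreviewed documents, and a fixed answer-noise rate; the names `pick_entity`, `update`, `rank_after_questions`, and the `noise` parameter are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set


def pick_entity(belief: Dict[str, float],
                doc_entities: Dict[str, Set[str]],
                asked: Set[str]) -> Optional[str]:
    """Return the unasked entity whose 'yes' mass is closest to half the belief."""
    total = sum(belief.values())
    candidates = set().union(*doc_entities.values()) - asked
    if not candidates:
        return None

    def imbalance(entity: str) -> float:
        # Belief mass of the documents that contain this entity.
        yes_mass = sum(p for d, p in belief.items() if entity in doc_entities[d])
        return abs(yes_mass / total - 0.5)

    # Sort first so that ties are broken deterministically.
    return min(sorted(candidates), key=imbalance)


def update(belief: Dict[str, float],
           doc_entities: Dict[str, Set[str]],
           entity: str,
           answer: bool,
           noise: float = 0.1) -> Dict[str, float]:
    """Bayes update: documents consistent with the answer are weighted by
    1 - noise, inconsistent ones by noise (assumed reviewer error rate)."""
    posterior = {}
    for d, p in belief.items():
        consistent = (entity in doc_entities[d]) == answer
        posterior[d] = p * ((1.0 - noise) if consistent else noise)
    z = sum(posterior.values())
    return {d: p / z for d, p in posterior.items()}


def rank_after_questions(doc_entities: Dict[str, Set[str]],
                         answer_fn: Callable[[str], bool],
                         n_questions: int,
                         noise: float = 0.1) -> List[str]:
    """Ask up to n_questions yes/no entity questions, then rank documents
    by posterior belief (most likely missing document first)."""
    belief = {d: 1.0 / len(doc_entities) for d in doc_entities}  # uniform prior
    asked: Set[str] = set()
    for _ in range(n_questions):
        entity = pick_entity(belief, doc_entities, asked)
        if entity is None:
            break
        asked.add(entity)
        belief = update(belief, doc_entities, entity, answer_fn(entity), noise)
    return sorted(belief, key=belief.get, reverse=True)
```

Here `answer_fn` stands in for the reviewer: given an entity, it returns whether that entity appears in the missing relevant document. Choosing the entity that splits the belief mass most evenly is a greedy heuristic; the paper's own selection criterion may differ.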

Document type Conference contribution
Language English
Related publication Towards Question-based High-recall Information Retrieval
Published at https://doi.org/10.1145/3209978.3210102
Other links https://www.scopus.com/pages/publications/85051559451
Downloads
p949-zou (Final published version)