When to Stop Reviewing in Technology-Assisted Reviews

D. Li; E. Kanoulas

DOI: https://doi.org/10.1145/3411755

When to Stop Reviewing in Technology-Assisted Reviews: Sampling from an Adaptive Distribution to Estimate Residual Relevant Documents

Authors	D. Li; E. Kanoulas
Publication date	10-2020
Journal	ACM Transactions on Information Systems
Article number	41
Volume | Issue number	38 | 4
Number of pages	36
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Technology-Assisted Reviews (TAR) aim to expedite document reviewing (e.g., medical articles or legal documents) by iteratively incorporating machine learning algorithms and human feedback on document relevance. Continuous Active Learning (CAL) algorithms have demonstrated superior performance compared to other methods in efficiently identifying relevant documents. One of the key challenges for CAL algorithms is deciding when to stop displaying documents to reviewers. Existing work either lacks transparency (it provides an ad hoc stopping point without indicating how many relevant documents are still not found) or lacks efficiency (it pays an extra cost to estimate the total number of relevant documents in the collection prior to the actual review). In this article, we handle the problem of deciding the stopping point of TAR under the continuous active learning framework by jointly training a ranking model to rank documents and conducting a "greedy" sampling to estimate the total number of relevant documents in the collection. We prove the unbiasedness of the proposed estimators under a with-replacement sampling design, while experimental results demonstrate that the proposed approach, similar to CAL, effectively retrieves relevant documents, but it also provides a transparent, accurate, and effective stopping point.
Document type	Article
Language	English
Published at	https://doi.org/10.1145/3411755
Other links	https://www.scopus.com/pages/publications/85093981909
Downloads	3411755 (Final published version)

UvA-DARE

Digital Academic Repository
