How Deep is your Learning: the DL-HARD Annotated Deep Learning Dataset
| Authors | |
|---|---|
| Publication date | 2021 |
| Book title | SIGIR '21 |
| Book subtitle | Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval: July 11-15, 2021, virtual event, Canada |
| ISBN (electronic) | |
| Event | 44th International ACM SIGIR Conference on Research and Development in Information Retrieval |
| Pages (from-to) | 2335–2341 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract | Deep Learning Hard (DL-HARD) is a new annotated dataset designed to more effectively evaluate neural ranking models on complex topics. It builds on TREC Deep Learning (DL) topics by extensively annotating them with question intent categories, answer types, wikified entities, topic categories, and result type metadata from a commercial web search engine. Based on this data, we introduce a framework for identifying challenging queries. DL-HARD contains fifty topics from the official DL 2019/2020 evaluation benchmark, half of which are newly and independently assessed. We perform experiments on DL-HARD using the official runs submitted to DL and find substantial differences in metrics and in the ranking of participating systems. Overall, DL-HARD is a new resource that promotes research on neural ranking methods by focusing on challenging and complex topics. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/3404835.3463262 |