Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study

Open Access
Authors
Publication date 2023
Host editors
  • D. Hupkes
  • V. Dankers
  • K. Batsuren
  • K. Sinha
  • A. Kazemnejad
  • C. Christodoulopoulos
  • R. Cotterell
  • E. Bruni
Book title GenBench: The first workshop on generalisation (benchmarking) in NLP
Book subtitle GenBench 2023 : Proceedings of the Workshop : December 6, 2023
ISBN (electronic)
  • 9798891760424
Event 1st Workshop on Generalisation (Benchmarking) in NLP, GenBench 2023
Pages (from-to) 112–129
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

With the ever-growing presence of social media platforms comes the increased spread of harmful content and the need for robust hate speech detection systems. Such systems easily overfit to specific targets and keywords, and evaluating them without considering distribution shifts that might occur between train and test data overestimates their benefit. We challenge hate speech models via new train-test splits of existing datasets that rely on the clustering of models' hidden representations. We present two split variants (SUBSET-SUM-SPLIT and CLOSEST-SPLIT) that, when applied to two datasets using four pretrained models, reveal how models catastrophically fail on blind spots in the latent space. This result generalises when developing a split with one model and evaluating it on another. Our analysis suggests that there is no clear surface-level property of the data split that correlates with the decreased performance, which underscores that task difficulty is not always humanly interpretable. We recommend incorporating latent feature-based splits in model development and release two splits via the GenBench benchmark.
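
As an illustration of the general idea of a latent feature-based split, the following is a minimal sketch and not the authors' implementation. It assumes that each example has already been given a cluster label by clustering a model's hidden representations (for instance with k-means), and it holds out whole clusters as the test set, using a subset-sum search to get close to a target test size. The function name `cluster_holdout_split` and the parameter `test_fraction` are hypothetical and do not come from the paper.

```python
from collections import Counter


def cluster_holdout_split(cluster_ids, test_fraction=0.2):
    """Hold out whole clusters so the test set size is close to a target.

    cluster_ids: one cluster label per example. These are ASSUMED to come
    from clustering a model's hidden representations beforehand.
    Returns (train_indices, test_indices).
    """
    sizes = Counter(cluster_ids)
    target = round(test_fraction * len(cluster_ids))

    # Subset-sum dynamic programme: reachable[s] stores one set of clusters
    # whose sizes add up to exactly s examples.
    reachable = {0: frozenset()}
    for cluster, size in sizes.items():
        # Iterate over a snapshot so each cluster is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + size
            if new_total not in reachable:
                reachable[new_total] = chosen | {cluster}

    # Choose the reachable total closest to the target test size
    # (ties are broken in favour of the smaller test set).
    best_total = min(reachable, key=lambda s: (abs(s - target), s))
    test_clusters = reachable[best_total]

    train_idx = [i for i, c in enumerate(cluster_ids) if c not in test_clusters]
    test_idx = [i for i, c in enumerate(cluster_ids) if c in test_clusters]
    return train_idx, test_idx


# Toy usage: ten examples in four clusters, aiming for roughly 30% test data.
toy_cluster_ids = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
train_idx, test_idx = cluster_holdout_split(toy_cluster_ids, test_fraction=0.3)
```

Because no cluster is shared between the two sides, the test examples come from regions of the latent space that are absent from the training data, which is the kind of distribution shift the abstract describes. The published split variants may choose clusters differently and may also control for properties such as class balance, which this sketch ignores.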

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2023.genbench-1.9
Other links https://www.scopus.com/pages/publications/85184516910
Downloads
2023.genbench-1.9 (Final published version)