Contextualized word embeddings expose ethnic biases in news

Open Access
Authors
Publication date 2024
Book title WEBSCI '24 : Reflecting on the Web, AI, and Society
Book subtitle Proceedings of the 16th ACM Web Science Conference 2024 : May 21-24, 2024 : University of Stuttgart, Germany
ISBN (electronic)
  • 9798400703348
Event 16th ACM Web Science Conference 2024
Pages (from-to) 290-295
Publisher New York, New York: Association for Computing Machinery
Organisations
  • Faculty of Social and Behavioural Sciences (FMG) - Amsterdam School of Communication Research (ASCoR)
Abstract The web is a major source of news and information. Yet news can perpetuate and amplify biases and stereotypes. Prior work has shown that training static word embeddings on a corpus can expose such biases. In this short paper, we apply both a conventional Word2Vec approach and a more modern BERT-based approach to a large corpus of Dutch news. We demonstrate that both methods expose ethnic biases in the news corpus. We also show that the biases in the news corpus are considerably stronger than the biases in the transformer model itself.
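The record does not include the paper's exact measurement code, but bias in static word embeddings is commonly quantified with a WEAT-style association test (Caliskan et al., 2017): two target word sets are compared by how strongly their vectors associate with two attribute word sets. The sketch below illustrates that metric on toy, randomly generated vectors; the word lists and embeddings are placeholders, not the paper's Dutch news data.

```python
import math
import random

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def mean(xs):
    return sum(xs) / len(xs)

def association(w, A, B, emb):
    # s(w, A, B): how much closer word w sits to attribute set A than to B.
    return (mean([cosine(emb[w], emb[a]) for a in A])
            - mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect(X, Y, A, B, emb):
    # WEAT-style effect size: difference of the mean associations of the two
    # target sets X and Y, normalized by the pooled standard deviation over
    # all targets; the result is bounded in [-2, 2].
    s = [association(w, A, B, emb) for w in X + Y]
    sx, sy = s[:len(X)], s[len(X):]
    mu = mean(s)
    std = math.sqrt(mean([(v - mu) ** 2 for v in s]))
    return (mean(sx) - mean(sy)) / std

# Toy embeddings: hypothetical 3-d vectors standing in for real
# Word2Vec/BERT-derived vectors (illustration only).
random.seed(0)
emb = {w: [random.gauss(0, 1) for _ in range(3)]
       for w in ["x1", "x2", "y1", "y2", "a1", "a2", "b1", "b2"]}
score = weat_effect(["x1", "x2"], ["y1", "y2"], ["a1", "a2"], ["b1", "b2"], emb)
```

In practice the target sets would hold names or group labels and the attribute sets valence words, with vectors taken from embeddings trained on the news corpus versus the pretrained model, so the two effect sizes can be compared.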
Document type Conference contribution
Note With supplemental material
Language English
Published at https://doi.org/10.1145/3614419.3643994