CAMsterdam at SemEval-2019 Task 6: Neural and graph-based feature extraction for the identification of offensive tweets

Open Access
Authors
  • G. Aglionby
  • C. Davis
  • P. Mishra
  • A. Caines
  • H. Yannakoudakis
  • M. Rei
  • E. Shutova
  • P. Buttery
Publication date 2019
Host editors
  • J. May
  • E. Shutova
  • A. Herbelot
  • X. Zhu
  • M. Apidianaki
  • S.M. Mohammad
Book title The International Workshop on Semantic Evaluation : Proceedings of the Thirteenth Workshop
Book subtitle NAACL HLT 2019 : June 6-June 7, 2019, Minneapolis, Minnesota, USA
ISBN (electronic)
  • 9781950737062
Event 13th International Workshop on Semantic Evaluation, SemEval 2019, co-located with the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
Pages (from-to) 556-563
Number of pages 8
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

We describe the CAMsterdam team entry to the SemEval-2019 Shared Task 6 on offensive language identification in Twitter data. Our proposed model learns to extract textual features using a multi-layer recurrent network, and then performs text classification using gradient-boosted decision trees (GBDT). A self-attention architecture enables the model to focus on the most relevant areas in the text. We additionally learn globally optimised embeddings for hashtags using node2vec, which are given as additional tweet features to the GBDT classifier. Our best model obtains 78.79% macro F1-score on detecting offensive language (subtask A), 66.32% on categorising offence types (targeted/untargeted; subtask B), and 55.36% on identifying the target of offence (subtask C).
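
The scores above are macro-averaged F1, meaning per-class F1 is computed first and then averaged with equal weight per class, so a rare class counts as much as a frequent one. A minimal sketch of that metric follows. The function name `macro_f1` and the example labels are illustrative assumptions, not taken from the authors' code or the task's official scorer.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred.

    Illustrative helper (hypothetical name), not the official task scorer.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with placeholder labels (not real task data)
gold = ["OFF", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

In this toy example the two classes score 0.5 and about 0.667, and the macro average is their plain mean regardless of how many examples each class has.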

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/S19-2100
Other links https://www.scopus.com/pages/publications/85093413882
Downloads
S19-2100 (Final published version)