Beyond counting words: Assessing performance of dictionaries, supervised machine learning, and embeddings in topic and frame classification

Open Access
Authors
Publication date 10-2022
Journal Computational Communication Research
Volume | Issue number 4 | 2
Pages (from-to) 528-570
Organisations
  • Faculty of Social and Behavioural Sciences (FMG) - Amsterdam School of Communication Research (ASCoR)
Abstract
Topics and frames are at the heart of various theories in communication science and other social sciences, making their measurement of key interest to many scholars. The current study compares and contrasts two main deductive computational approaches to measuring policy topics and frames: dictionary (lexicon) based identification and supervised machine learning. Additionally, we introduce domain-specific word embeddings to these classification tasks. Drawing on a manually coded dataset of Dutch news articles and parliamentary questions, our results indicate that supervised machine learning outperforms dictionary-based classification for both tasks. Furthermore, results show that word embeddings may boost performance at relatively low cost by introducing relevant, domain-specific semantic information to the classification model.
Document type Article
Language English
Published at https://doi.org/10.5117/CCr2022.2.006.Kroo