ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning

Open Access
Authors
Publication date 13-04-2023
Edition v1
Number of pages 5
Publisher ArXiv
Organisations
  • Faculty of Social and Behavioural Sciences (FMG) - Amsterdam Institute for Social Science Research (AISSR)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
This paper assesses the accuracy, reliability, and bias of the Large Language Model (LLM) ChatGPT-4 on the text analysis task of classifying the political affiliation of a Twitter poster based on the content of a tweet. The LLM is compared against manual annotation by both expert classifiers and crowd workers, which are generally considered the gold standard for such tasks. We use Twitter messages from United States politicians during the 2020 election, providing a ground truth against which to measure accuracy. The paper finds that ChatGPT-4 achieves higher accuracy, higher reliability, and equal or lower bias than the human classifiers. The LLM is able to correctly annotate messages that require reasoning on the basis of contextual knowledge and inferences about the author’s intentions—traditionally seen as uniquely human abilities. These findings suggest that LLMs will have a substantial impact on the use of textual data in the social sciences, by enabling interpretive research at scale.
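The zero-shot setup described in the abstract—prompting a chat model with a single tweet and asking for the author's likely party, with no labeled examples in the prompt—can be sketched as follows. The prompt wording and the `parse_label` helper are illustrative assumptions, not the authors' exact prompt or parsing procedure.

```python
def build_messages(tweet: str) -> list[dict]:
    """Build a zero-shot chat prompt: the task is described in natural
    language and no labeled examples are provided."""
    return [
        {"role": "system",
         "content": "You classify the political affiliation of tweet authors."},
        {"role": "user",
         "content": ("Based only on this tweet from a US politician during the "
                     "2020 election, is the author a Democrat or a Republican? "
                     f"Answer with one word.\n\nTweet: {tweet}")},
    ]

def parse_label(answer: str) -> str:
    """Map a free-text model reply onto one of the two target labels."""
    text = answer.lower()
    if "democrat" in text:
        return "Democrat"
    if "republican" in text:
        return "Republican"
    return "Unknown"

# In an actual annotation run, the messages would be sent to a chat model
# (e.g. via an API client) and each reply passed through parse_label; the
# resulting labels can then be scored against the politicians' known parties.
```

Accuracy, reliability, and bias as studied in the paper would then be computed by comparing these parsed labels against the ground-truth party of each politician, and against the expert and crowd-worker annotations of the same tweets.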
Document type Preprint
Language English
Published at https://doi.org/10.48550/arXiv.2304.06588