Linguistic variation in online communities: A computational perspective

Open Access
Award date 06-11-2020
ISBN
  • 9789464210637
Series ILLC dissertation series, DS-2020-11
Number of pages 166
Publisher Amsterdam: Institute for Logic, Language and Computation
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI)
Abstract
The same word can be used by different people to mean different things. This variation in meaning is not random, but determined by the social characteristics of the speakers who use the word. In particular, a crucial factor is the community that individuals belong to. This thesis investigates meaning variation in online communities of speakers with a twofold goal: providing an empirical account of the phenomenon in online settings, and leveraging it to improve the performance of NLP models.
I build on theoretical frameworks introduced in Linguistics and Sociolinguistics which describe meaning variation in offline communities. To investigate variation using digital data from online communities, I leverage the tools and methodologies developed in the fields of Natural Language Processing and Computational Linguistics.
The thesis consists of two parts. The first part focuses on a general research question: how can meaning variation in online communities of speakers be identified and represented? The second part takes a task-oriented approach and addresses the question: how can social information be used to improve the performance of NLP models?
Overall, this dissertation presents an extensive study of meaning variation in online communities of speakers and makes two main contributions. First, it empirically confirms the findings of traditional sociolinguistic studies and provides new theoretical insights about meaning variation in online communities of speakers. Second, it introduces new methodologies that, by leveraging information about the social context in which language is produced, help to improve the performance of NLP systems for text classification.
Document type PhD thesis
Language English