Applying automatically parsed corpora to the study of language variation
| Authors | |
|---|---|
| Publication date | 2014 |
| Host editors | |
| Book title | COLING 2014: the 25th International Conference on Computational Linguistics |
| Book subtitle | proceedings of COLING 2014 : technical papers: August 23-29, 2014, Dublin, Ireland |
| ISBN | |
| Event | COLING 2014 |
| Pages (from-to) | 1974-1984 |
| Publisher | Stroudsburg, PA: Association for Computational Linguistics |
| Organisations | |
| Abstract | In this work, we discuss the benefits of using automatically parsed corpora to study language variation. The study of language variation is an area of linguistics in which quantitative methods have been particularly successful. We argue that the large datasets that can be obtained using automatic annotation can help drive further research in this direction, providing sufficient data for the increasingly complex models used to describe variation. We demonstrate this by replicating and extending a previous quantitative variation study that used manually and semi-automatically annotated data. We show that while the study cannot be replicated completely due to limitations of the existing automatic annotation, we can draw at least the same conclusions as the original study. In addition, we demonstrate the flexibility of this method by extending the findings to related linguistic constructions and to another domain of text, using additional data. |
| Document type | Conference contribution |
| Language | English |
| Published at | http://www.aclweb.org/anthology/C14-1186 |
| Downloads | C14-1186 (Final published version) |
