Text Genre Classification Based on Linguistic Complexity Contours Using A Recurrent Neural Network

Open Access
Authors
Publication date 2018
Host editors
  • J. Cassens
  • R. Wegener
  • A. Kofod-Petersen
Book title Proceedings of the Tenth International Workshop Modelling and Reasoning in Context
Book subtitle co-located with the 27th International Joint Conference on Artificial Intelligence (IJCAI 2018) and the 23rd European Conference on Artificial Intelligence (ECAI 2018) : Stockholm, Sweden, July 13, 2018
Series CEUR Workshop Proceedings
Event 10th International Workshop Modelling and Reasoning in Context
Pages (from-to) 56-63
Publisher Aachen: CEUR-WS
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
In recent years, there has been increased interest in combining natural language processing techniques with machine learning algorithms to automatically classify texts on the basis of a wide range of features. One class of features that has been successfully employed for a wide range of classification tasks, including native language identification, readability assessment and text genre categorization, pertains to the construct of ‘linguistic complexity’. This paper presents a novel approach to the use of linguistic complexity features in text categorization: rather than representing text complexity ‘globally’ in terms of summary statistics, this approach assesses text complexity ‘locally’ and captures the progression of complexity within a text as a sequence of complexity scores, generating what is referred to here as ‘complexity contours’. We demonstrate the utility of the approach in an automatic text classification task for five genres (academic, newspaper, fiction, magazine and spoken) of the Corpus of Contemporary American English (COCA) [Davies, 2008] using a recurrent neural network.
Document type Conference contribution
Language English
Published at http://ceur-ws.org/Vol-2134/paper12.pdf
Other links http://ceur-ws.org/Vol-2134/
Downloads
paper12 (Final published version)
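Illustrative sketch
The abstract contrasts a single 'global' summary statistic with a 'local' sequence of complexity scores. The Python sketch below shows one way such a contour could be computed. It is not the authors' implementation: the function name complexity_contour, the sentence-based sliding window, the window and step parameters, and the use of mean word length as the complexity measure are all assumptions made for illustration. The record does not say which measures or window sizes the paper uses.

```python
import re
from typing import List


def complexity_contour(text: str, window: int = 3, step: int = 1) -> List[float]:
    """Return a sequence of local complexity scores for `text`.

    Assumption: complexity is approximated by mean word length over a
    sliding window of `window` sentences, moved forward `step` sentences
    at a time. This is a toy stand-in for real complexity measures.
    """
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []

    contour: List[float] = []
    # Texts shorter than one window still yield a single score.
    last_start = max(len(sentences) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = " ".join(sentences[start:start + window])
        words = re.findall(r"[A-Za-z']+", chunk)
        score = sum(len(w) for w in words) / len(words) if words else 0.0
        contour.append(score)
    return contour


sample = "A b. Cc dd. Eee fff. Gggg hhhh."
print(complexity_contour(sample, window=2))  # [1.5, 2.5, 3.5]
```

A global mean word length for the sample text would be the single value 2.5, whereas the contour records how the score rises across the text. A sequence of this kind is the sort of input a recurrent neural network classifier could consume one step at a time.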