Hierarchical multi-label classification of social text streams

Authors
Publication date 2014
Book title SIGIR '14
Book subtitle proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval: July 6-11 2014, Gold Coast, Queensland, Australia
ISBN
  • 9781450322577
ISBN (electronic)
  • 9781450322591
Event SIGIR '14: 37th international ACM SIGIR conference on Research and development in information retrieval
Pages (from-to) 213-222
Publisher New York, NY: ACM
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Hierarchical multi-label classification assigns a document to multiple hierarchical classes. In this paper we focus on hierarchical multi-label classification of social text streams. Concept drift, complicated relations among classes, and the limited length of documents in social text streams make this a challenging problem. Our approach includes three core ingredients: short document expansion, time-aware topic tracking, and chunk-based structural learning. We extend each short document in social text streams to a more comprehensive representation via state-of-the-art entity linking and sentence ranking strategies. From documents extended in this manner, we infer dynamic probabilistic distributions over topics by dividing topics into dynamic "global" topics and "local" topics. For the third and final phase we propose a chunk-based structural optimization strategy to classify each document into multiple classes. Extensive experiments conducted on a large real-world dataset show the effectiveness of our proposed method for hierarchical multi-label classification of social text streams.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/2600428.2609595