Time-aware authorship attribution for short text streams
| Authors | |
|---|---|
| Publication date | 2015 |
| Book title | SIGIR 2015 |
| Book subtitle | proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval: August 9-13, 2015, Santiago, Chile |
| ISBN (electronic) | |
| Event | 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2015 |
| Pages (from-to) | 727-730 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract |
Identifying the authors of short texts on the Internet or in social media based communication systems is an important tool against fraud and cybercrime. Besides the challenges raised by the limited length of these short messages, the evolving language and writing styles of their authors make authorship attribution difficult. Most current short text authorship attribution approaches address only the challenge of limited text length. However, neglecting the second challenge may lead to poor attribution performance for authors who change their writing styles.
In this paper, we analyse the temporal changes in word usage by authors of tweets and emails and, based on this analysis, propose an approach to estimate the dynamicity of authors' word usage. The proposed approach is inspired by time-aware language models and can be employed in any time-unaware authorship attribution method. Our experiments on tweets and the Enron email dataset show that the proposed time-aware authorship attribution approach significantly outperforms baselines that neglect the dynamicity of authors. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/2766462.2767799 |
| Downloads | p727-azarbonyad (Final published version) |
