Significant Words Representations of Entities

M. Dehghani

doi:https://doi.org/10.1145/2911451.2911474

Authors	M. Dehghani
Publication date	2016
Book title	SIGIR'16
Book subtitle	The 39th International ACM SIGIR Conference on Research and Development in Information Retrieval: Pisa, Italy, July 17-21, 2016
ISBN (electronic)	9781450340694
Event	SIGIR 2016: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages (from-to)	1183
Number of pages	1
Publisher	New York, NY: Association for Computing Machinery
Organisations	Faculty of Science (FNWI) Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	Transforming the data into a suitable representation is the first key step of data analysis, and the performance of any data-oriented method depends heavily on it. We study how best to learn representations for textual entities that are: 1) precise, 2) robust against noisy terms, 3) transferable over time, and 4) interpretable by human inspection. Inspired by the early work of Luhn [1], we propose significant words language models of a set of documents that capture all, and only, the significant terms the documents share. We adjust the weights of common terms that are already well explained by the document collection, as well as the weights of incidental rare terms that are explained only by specific documents, so that eventually only the significant terms remain in the model.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1145/2911451.2911474 (Final published version)
Downloads	p1183-dehghani (Final published version)
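
The idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a three-component mixture (significant, general, specific) with fixed, equal mixing weights, fitted by a simple EM loop. The function name `significant_words_model`, the toy data, and in particular the document-frequency heuristic used for the "specific" model are hypothetical choices made only for this illustration.

```python
from collections import Counter


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights.values())
    return {t: (w / total if total else 0.0) for t, w in weights.items()}


def significant_words_model(docs, background, weights=(1 / 3, 1 / 3, 1 / 3),
                            iterations=50):
    """Estimate a 'significant words' distribution for a set of documents.

    Assumption: each term occurrence comes from a mixture of three models
    with fixed mixing weights (significant, general, specific).
    """
    lam_sw, lam_general, lam_specific = weights
    doc_counts = [Counter(doc) for doc in docs]
    total_counts = sum(doc_counts, Counter())
    vocab = list(total_counts)

    # General model: term distribution of the background collection.
    # The documents' own counts are added so no vocabulary term has zero mass.
    general = normalize(Counter(background) + total_counts)

    # Specific model (stand-in heuristic, an assumption of this sketch):
    # a term is weighted by its probability in the documents, discounted by
    # the fraction of documents that contain it.
    n_docs = len(docs)
    doc_models = [normalize(c) for c in doc_counts]
    doc_freq = Counter(t for c in doc_counts for t in c)
    specific = normalize({
        t: sum(m.get(t, 0.0) for m in doc_models) * (1 - doc_freq[t] / n_docs)
        for t in vocab
    })

    # EM: start from the pooled document model, then repeatedly keep only the
    # share of each term's count that the significant component explains.
    sw = normalize(total_counts)
    for _ in range(iterations):
        expected = {}
        for t in vocab:
            p_sw = lam_sw * sw[t]
            p_other = lam_general * general[t] + lam_specific * specific[t]
            expected[t] = total_counts[t] * p_sw / (p_sw + p_other)  # E-step
        sw = normalize(expected)                                     # M-step
    return sw


# Toy data: "the" is common in the background, "pisa" occurs in one document.
docs = [
    "the retrieval ranking query pisa pisa".split(),
    "the retrieval ranking query".split(),
    "the retrieval ranking query".split(),
]
background = ["the"] * 30 + "of and a in to is for on with by".split()

model = significant_words_model(docs, background)
for term, prob in sorted(model.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {prob:.3f}")
```

In this toy run the probability mass of "the" (explained by the general model) and "pisa" (explained by the specific model) shrinks toward zero, while the terms shared by all documents keep the remaining mass. Fixing the mixing weights keeps the sketch short; a fuller treatment would estimate them as well.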