- Semantic Entities
- Workshop on Exploiting Semantic Annotations in Information Retrieval
- Book/source title
- ESAIR’15: proceedings of the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval
- Pages (from-to)
- New York, NY: Association for Computing Machinery
- Document type
- Conference contribution
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
Entity retrieval has seen a lot of interest from the research community over the past decade. Ten years ago, the expertise retrieval task gained popularity in the research community during the TREC Enterprise Track . It has remained relevant ever since, while broadening to social media, to tracking the dynamics of expertise [1-5, 8, 11], and, more generally, to a range of entity retrieval tasks.
In the talk, which will be given by the second author, we will point out that existing methods to entity or expert retrieval fail to address key challenges: (1) Queries and expert documents use different representations to describe the same concepts [6, 7]. Term mismatches between entities and experts  occur due to the inability of widely used maximum-likelihood language models to make use of semantic similarities between words . (2) As the amount of available data increases, the need for more powerful approaches with greater learning capabilities than smoothed maximum-likelihood language models is obvious . (3) Supervised methods for entity or expertise retrieval [5, 8] were introduced at the turn of the last decade. However, the acceleration of data availability has the major disadvantage that, in the case of supervised methods, manual annotation efforts need to sustain a similar order of growth. This calls for the further development of unsupervised methods. (4) According to some entity or expertise retrieval methods, a language model is constructed for every document in the collection. These methods lack efficient query capabilities for large document collections, as each query term needs to be matched against every document .
In the talk we will discuss a recently proposed solution  that has a strong emphasis on unsupervised model construction, efficient query capabilities and, most importantly, semantic matching between query terms and candidate entities. We show that the proposed approach improves retrieval performance compared to generative language models mainly due to its ability to perform semantic matching . The proposed method does not require any annotations or supervised relevance judgments and is able to learn from raw textual evidence and document-candidate associations alone.
The purpose of the proposal is to provide insight in how we avoid explicit annotations and feature engineering and still obtain semantically meaningful retrieval results. In the talk we will provide a comparative error analysis between the proposed semantic entity retrieval model and traditional generative language models that perform exact matching, which yields important insights in the relative strengths of semantic matching and exact matching for the expert retrieval task in particular and entity retrieval in general.
We will also discuss extensions of the proposed model that are meant to deal with scalability and dynamic aspects of entity and expert retrieval.
- go to publisher's site
- Keynote paper
If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library, or send a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.