Entity retrieval has attracted considerable interest from the research community over the past decade. Ten years ago, the expertise
retrieval task gained popularity during the TREC Enterprise Track. It has remained relevant
ever since, while broadening to social media, to tracking the dynamics of expertise [1-5, 8, 11], and, more generally, to
a range of entity retrieval tasks.
In the talk, which will be given by the second author, we will point out that
existing methods for entity or expert retrieval fail to address key challenges: (1) Queries and expert documents use different
representations to describe the same concepts [6, 7]. Term mismatches between queries and experts occur because
the widely used maximum-likelihood language models are unable to exploit semantic similarities between words. (2) As the amount
of available data increases, the need for approaches with greater learning capability than smoothed maximum-likelihood
language models becomes obvious. (3) Supervised methods for entity or expertise retrieval [5, 8] were introduced at the turn
of the last decade. However, accelerating data availability has the major disadvantage that, in the case of supervised
methods, manual annotation efforts need to grow at a similar rate. This calls for the further development of unsupervised
methods. (4) In some entity or expertise retrieval methods, a language model is constructed for every document in
the collection. These methods lack efficient query capabilities for large document collections, as each query term needs to
be matched against every document.
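Challenge (1) can be made concrete with a small sketch. The Dirichlet-smoothed maximum-likelihood language model below is a standard exact-matching baseline, not any specific method from the talk; the documents and vocabulary are invented for illustration only.

```python
# Illustrative sketch of exact matching: a smoothed maximum-likelihood
# language model credits a query term only through exact term overlap,
# so a synonym of a document term contributes nothing document-specific.
from collections import Counter
import math

def lm_score(query_terms, doc_terms, collection_terms, mu=10.0):
    """Dirichlet-smoothed query log-likelihood, log P(q | d)."""
    doc_tf = Counter(doc_terms)
    coll_tf = Counter(collection_terms)
    doc_len, coll_len = len(doc_terms), len(collection_terms)
    score = 0.0
    for t in query_terms:
        p_coll = coll_tf[t] / coll_len
        p = (doc_tf[t] + mu * p_coll) / (doc_len + mu)
        score += math.log(p) if p > 0 else float("-inf")
    return score

collection = "the physician treated patients the doctor saw patients".split()
doc = "the physician treated patients".split()

# "doctor" never occurs in doc; only collection smoothing keeps its
# probability nonzero, and the model has no notion that "doctor" and
# "physician" describe the same concept.
print(lm_score(["doctor"], doc, collection))
print(lm_score(["physician"], doc, collection))
```

The gap between the two scores is exactly the term-mismatch problem: the semantically equivalent query term is scored as if it were unrelated to the document.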
In the talk we will discuss a recently proposed solution that has a strong
emphasis on unsupervised model construction, efficient query capabilities and, most importantly, semantic matching between
query terms and candidate entities. We show that the proposed approach improves retrieval performance compared to generative
language models, mainly due to its ability to perform semantic matching. The proposed method does not require any annotations
or supervised relevance judgments and is able to learn from raw textual evidence and document-candidate associations alone.
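To illustrate the kind of semantic matching described above, here is a deliberately toy sketch: each candidate is represented by the centroid of word vectors taken from its associated documents, and candidates are ranked by cosine similarity to the query. The two-dimensional "embeddings" are hand-crafted placeholders for representations that would, in an actual system, be learned from raw text and document-candidate associations; this is not the talk's model itself.

```python
# Toy semantic-matching sketch (illustrative only): hand-crafted 2-d
# vectors stand in for embeddings learned without supervision.
import math

emb = {
    "doctor":    (0.9, 0.1),
    "physician": (0.8, 0.2),
    "finance":   (0.1, 0.9),
    "budget":    (0.2, 0.8),
}

def centroid(words):
    vecs = [emb[w] for w in words if w in emb]
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Candidate experts represented by terms from their associated documents.
experts = {
    "e1": ["physician", "physician", "budget"],
    "e2": ["finance", "budget"],
}

query = ["doctor"]  # this term occurs in no candidate's documents
q_vec = centroid(query)
scores = {e: cosine(q_vec, centroid(ws)) for e, ws in experts.items()}
print(max(scores, key=scores.get))
```

Even though the query term never appears in any candidate's documents, the medical expert is ranked first, which is the behavior exact-matching language models cannot provide.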
The purpose of the proposal is to provide insight into how we avoid explicit annotations and feature engineering and still obtain
semantically meaningful retrieval results. In the talk we will provide a comparative error analysis between the proposed semantic
entity retrieval model and traditional generative language models that perform exact matching, which yields important insights
into the relative strengths of semantic matching and exact matching for the expert retrieval task in particular and entity retrieval
in general. We will also discuss extensions of the proposed model that are meant to deal with scalability and dynamic
aspects of entity and expert retrieval.