Much of what is discussed in social media is inspired by events in the news and, conversely, social media offer
a handle on the impact of news events. We address the following linking task: given a news article, find social media utterances
that implicitly reference it. We follow a three-step approach: we derive multiple query models from a given source news article,
which are then used to retrieve utterances from a target social media index, resulting in multiple ranked lists that we then
merge using data fusion techniques. Query models are created by exploiting the structure of the source article and by using
explicitly linked social media utterances that discuss the source article. To combat query drift resulting from the large
volume of text, either in the source news article itself or in social media utterances explicitly linked to it, we introduce
a graph-based method for selecting discriminative terms. For our experimental evaluation, we use data from Twitter,
Digg, Delicious, the New York Times Community, Wikipedia, and the blogosphere to generate query models. We show that
query models based on different data sources provide complementary information and retrieve different social media
utterances from our target index. As a consequence, data fusion methods significantly improve retrieval performance
over individual approaches. Our graph-based term selection method improves both effectiveness and efficiency.
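
To make the fusion step concrete, the following is a minimal sketch of merging several ranked lists into one. The abstract does not say which data fusion technique is used, so CombSUM with min-max score normalization is assumed here purely for illustration. The function names, the utterance identifiers, and the scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale the scores of one ranked list into [0, 1]."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores are equal; treat every utterance as equally relevant.
        return {utt: 1.0 for utt in run}
    return {utt: (score - lo) / (hi - lo) for utt, score in run.items()}


def comb_sum(runs):
    """Fuse ranked lists by summing normalized scores (CombSUM, an assumed choice).

    Each run maps an utterance id to the retrieval score it received
    under one query model. Utterances missing from a run contribute 0.
    """
    fused = defaultdict(float)
    for run in runs:
        for utt, score in min_max_normalize(run).items():
            fused[utt] += score
    # Highest fused score first; ties are broken by utterance id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical ranked lists from two query models of the same article.
title_run = {"u1": 4.0, "u2": 3.0, "u4": 2.0}
comments_run = {"u2": 9.0, "u3": 6.0, "u1": 5.0}

fused = comb_sum([title_run, comments_run])
for utt, score in fused:
    print(utt, score)
```

In this toy example, the utterance `u2` is ranked first after fusion because both query models score it well, even though only one of them ranks it at the top. This is the kind of complementary evidence that fusion is meant to exploit.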