Link detection with Wikipedia

Authors
Publication date 2009
Host editors
  • S. Geva
  • J. Kamps
  • A. Trotman
Book title Advances in Focused Retrieval
Book subtitle 7th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2008, Dagstuhl Castle, Germany, December 15-18, 2008: revised and selected papers
ISBN
  • 9783642037603
ISBN (electronic)
  • 9783642037610
Series Lecture Notes in Computer Science
Event 7th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2008), Dagstuhl Castle, Germany
Pages (from-to) 366-373
Publisher Berlin: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper describes our participation in the INEX 2008 Link the Wiki track. We focused on the file-to-file task and submitted three runs designed to compare the impact of different features on link generation. For outgoing links, we introduce the anchor likelihood ratio as an indicator for anchor detection and explore two types of evidence for target identification: title field evidence and topic article content evidence. We find that the anchor likelihood ratio is a useful indicator for anchor detection and that, in addition to title field evidence, re-ranking with topic article content evidence improves target identification. For incoming links, we use exact match and a retrieval method based on a language modeling approach, and find that exact match works best. In addition, our experiment shows that the semantic relatedness between Wikipedia articles also provides some evidence for indicating links.
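The abstract does not define the anchor likelihood ratio, so the following is only an illustrative sketch under an assumed reading: for each phrase, the number of articles that use it as link anchor text divided by the number of times the phrase occurs in the collection. The function name `anchor_ratios`, the `(text, anchors)` input format, and the toy articles are hypothetical and are not taken from the paper, whose actual formulation may differ.

```python
from collections import Counter


def anchor_ratios(articles):
    """Assumed ratio: (articles linking a phrase) / (occurrences of the phrase).

    `articles` is a list of (text, anchors) pairs, where `anchors` lists the
    phrases used as link anchors in that text. This input format is a
    hypothetical simplification, not the INEX collection format.
    """
    linked = Counter()
    total = Counter()
    # Candidate phrases: everything used as an anchor somewhere.
    phrases = {a.lower() for _, anchors in articles for a in anchors}
    for text, anchors in articles:
        lowered = text.lower()
        for anchor in set(a.lower() for a in anchors):
            linked[anchor] += 1
        for phrase in phrases:
            # Plain substring counting; a real system would tokenize.
            total[phrase] += lowered.count(phrase)
    return {p: linked[p] / total[p] for p in phrases if total[p] > 0}


# Toy collection (invented for illustration only).
articles = [
    ("The XML retrieval task uses XML documents.", ["XML retrieval"]),
    ("XML retrieval is evaluated at INEX.", ["XML retrieval", "INEX"]),
    ("INEX started in 2002.", []),
]
ratios = anchor_ratios(articles)
```

Under this assumed definition, a phrase that is linked nearly every time it occurs scores close to 1 and would be a strong anchor candidate, while a phrase that is rarely linked scores close to 0.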
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-642-03761-0_37