An exploration of learning to link with Wikipedia: features, methods and training collection

Authors
Publication date 2010
Host editors
  • S. Geva
  • J. Kamps
  • A. Trotman
Book title Focused Retrieval and Evaluation
Book subtitle 8th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2009, Brisbane, Australia, December 7-9, 2009 : revised and selected papers
ISBN
  • 9783642145551
ISBN (electronic)
  • 9783642145568
Series Lecture Notes in Computer Science
Event 8th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2009), Brisbane, Australia
Pages (from-to) 324-330
Publisher Berlin: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
We describe our participation in the Link-the-Wiki track at INEX 2009. We apply machine learning methods to the anchor-to-best-entry-point task and explore the impact of three aspects of our approach: the features, the learning methods, and the collection used for training the models. We find that a learning-to-rank approach and a binary classification approach differ little. The new Wikipedia collection, which is larger and contains more links than the collection used previously, provides better training material for our models. In addition, a heuristic run that combines the two intuitively most useful features outperforms the machine-learning-based runs, which suggests that further analysis and selection of features is necessary.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-642-14556-8_32