An exploration of learning to link with Wikipedia: features, methods and training collection
| Authors | |
|---|---|
| Publication date | 2010 |
| Host editors | |
| Book title | Focused Retrieval and Evaluation |
| Book subtitle | 8th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2009, Brisbane, Australia, December 7-9, 2009 : revised and selected papers |
| ISBN | |
| ISBN (electronic) | |
| Series | Lecture Notes in Computer Science |
| Event | 8th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2009), Brisbane, Australia |
| Pages (from-to) | 324-330 |
| Publisher | Berlin: Springer |
| Organisations | |
| Abstract | We describe our participation in the Link-the-Wiki track at INEX 2009. We apply machine learning methods to the anchor-to-best-entry-point task and explore the impact of three aspects of our approach: the features, the learning methods, and the collection used for training the models. We find that a learning-to-rank approach and a binary classification approach differ little. The new Wikipedia collection, which is larger and has more links than the collection previously used, provides better training material for our models. In addition, a heuristic run that combines the two intuitively most useful features outperforms the machine-learning-based runs, which suggests that further analysis and selection of features is necessary. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1007/978-3-642-14556-8_32 |
