- Heuristic ranking and diversification of web documents
- NIST Special Publication
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
We describe the participation of the University of Amsterdam’s Intelligent Systems Lab in the Web track at TREC 2009. We participated in the ad hoc and diversity tasks. For the ad hoc task, we find that spam is an important issue and that Wikipedia-based heuristic optimization approaches boost retrieval performance, presumably by reducing spam in the top-ranked results. For the diversity task, we explored several methods. Clustering and a topic model-based approach perform similarly, and both outperform a query log-based approach.
- Proceedings title: Proceedings of the Eighteenth Text REtrieval Conference (TREC 2009)
- Place of publication: Gaithersburg, MD
- Editors: E.M. Voorhees, L.P. Buckland