Linking archives using document enrichment and term selection

Authors
Publication date 2011
Host editors
  • S. Gradmann
  • F. Borri
  • C. Meghini
  • H. Schuldt
Book title Research and Advanced Technology for Digital Libraries
Book subtitle International Conference on Theory and Practice of Digital Libraries, TPDL 2011, Berlin, Germany, September 26-28, 2011: proceedings
ISBN
  • 9783642244681
ISBN (electronic)
  • 9783642244698
Series Lecture Notes in Computer Science
Event TPDL 2011: International Conference on Theory and Practice of Digital Libraries 2011
Pages (from-to) 360-371
Publisher Heidelberg: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
News, multimedia and cultural heritage archives are increasingly offering opportunities to create connections between their collections. We consider the task of linking archives: connecting an item in one archive to one or more items in other, often complementary archives. We focus on a specific instance of the task: linking items with a rich textual representation in a news archive to items with sparse annotations in a multimedia archive, where items should be linked if they describe the same or a related event. We find that the difference in textual richness of annotations presents a challenge and investigate two approaches: (i) to enrich sparsely annotated items with textually rich content; and (ii) to reduce rich news archive items using term selection. We demonstrate the positive impact of both approaches on linking to same events and linking to related events.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-642-24469-8_37