Effective Translation, Tokenization and Combination for Cross-Lingual Retrieval

Open Access
Authors
Publication date 2005
Host editors
  • C. Peters
  • P.D. Clough
  • G.J.F. Jones
  • J. Gonzalo
  • M. Kluck
  • B. Magnini
Book title Multilingual information access for text, speech and images: 5th workshop of the cross-language evaluation forum, CLEF 2004, Bath, UK, September 15-17, 2004: revised selected papers
ISBN
  • 9783540274209
Series LNCS, 3491
Pages (from-to) 123-135
Number of pages 13
Publisher Berlin: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
Our approach to cross-lingual document retrieval starts from the assumption that effective monolingual retrieval is at the core of any cross-language retrieval system. We devote particular attention to three crucial ingredients of our approach to cross-lingual retrieval. First, effective tokenization techniques are essential to cope with morphological variations common in many European languages. Second, effective combination methods allow us to combine the best of different strategies. Finally, effective translation methods for translating queries or documents turn a monolingual retrieval system into a cross-lingual retrieval system proper. The viability of our approach is shown by a series of experiments in monolingual, bilingual, and multilingual retrieval.
Document type Chapter