Focused retrieval and result aggregation with political data

Open Access
Authors
Publication date 2010
Journal Information Retrieval
Volume | Issue number 13 | 5
Pages (from-to) 412-433
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper presents a case study in which we use a large semi-structured data set consisting of official transcripts of meetings of the Dutch parliament for focused retrieval and result aggregation. Transcripts of meetings are a document genre characterized by a complex narrative structure. The essence is not only what is said, but also by whom and to whom. Our data set contains the notes of more than 40 years of Dutch parliamentary debates, and we exploit this structure to generate semantic annotations automatically. These annotations yield numerous new ways of searching, browsing, mining and summarizing these documents. Concerning result aggregation, we summarise and visualise the structure of meetings in tables of contents and interruption graphs. The contents of meetings or parts of meetings are condensed into word clouds that are created using a parsimonious language model. Furthermore, we have developed a search engine that exploits the structure and annotations of our data, making it possible to provide entry points, to group search results, and to use faceted search techniques for data exploration. Evaluation shows that our content and structure summarization tools provide a good first impression of a debate. Users reported that, compared to a standard document retrieval system, our search engine gives a better overview of the data. Search tasks were performed faster, and users felt more certain of their answers.
Document type Article
Note In: Special Issue on Focused Retrieval and Result Aggregation.
Language English
Published at https://doi.org/10.1007/s10791-010-9130-z
Downloads
332671.pdf (Final published version)