Fast, Explainable View Detection to Characterize Exploration Queries

Authors
Publication date 2016
Host editors
  • P. Baumann
  • I. Manolescu-Goujot
  • L. Trani
  • Y. Ioannidis
  • G.G. Barnaföldi
  • L. Dobos
  • E. Bányai
Book title Scientific and Statistical Database Management
Book subtitle 28th International Conference, SSDBM 2016 : Budapest, Hungary, July 2016 : proceedings
ISBN (electronic)
  • 9781450342155
Event 28th International Conference on Scientific and Statistical Database Management
Article number 20
Number of pages 12
Publisher New York, NY: The Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
The aim of data exploration is to get acquainted with an unfamiliar database. Typically, explorers operate by trial and error: they submit a query, study the result, and then refine the query. In this paper, we investigate how to help them understand their query results. In particular, we focus on medium- to high-dimensional spaces: if the database contains dozens or hundreds of columns, which variables should they inspect? We propose to detect subspaces in which the users' selection differs from the rest of the database. From this idea, we built Ziggy, a tuple description engine. Ziggy can detect informative subspaces, and it can explain why it recommends them, using visualizations and natural language. It copes with mixed data and missing values, and it penalizes redundancy. Our experiments reveal that it is up to an order of magnitude faster than state-of-the-art feature selection algorithms, at minimal cost in accuracy.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/2949689.2949692
Other links https://ivi.fnwi.uva.nl/isis/publications/2016/SellamSSDBM2016