- Automatic assistants for database exploration
- Award date: 3 November 2016
- Number of pages
- Document type: PhD thesis
- Faculty of Science (FNWI)
- Informatics Institute (IVI)
Data explorers interrogate a database to discover its content. Their aim is to get an overview of the data and discover interesting new facts. They have little to no knowledge of the data, and their requirements are often vague and abstract. How can such users write database queries? This thesis presents four assistants to help them with this task: Claude, Blaeu, Ziggy and Raimond. Each assistant focuses on a specific exploration task. Claude helps users analyze data warehouses by highlighting the combinations of variables that influence a predefined measure of interest. Blaeu helps users build and refine queries by allowing them to select and project clusters of tuples. Ziggy is a tuple characterization engine: it shows what makes a selection of tuples unique by highlighting the differences between those tuples and the rest of the database. Finally, Raimond is an attempt to generalize semi-automatic exploration to text data, inspired by an industrial use case. For each system, we present a user model, that is, a formalized set of assumptions about the users’ goals. We then present practical methods to make recommendations, either adapting existing algorithms from the machine learning literature or presenting our own. Finally, we validate our approaches with experiments: we present use cases in which our systems led to discoveries, and we benchmark their speed, quality and robustness.
- Subtitle on the back cover: Introducing Claude, Blaeu, Ziggy and Raimond, four assistants for data exploration.
Research conducted at: Universiteit van Amsterdam
Series: SIKS dissertation series 2016-44