Generating scientific documentation for computational experiments using provenance

Authors
Publication date 2015
Host editors
  • B. Ludäscher
  • B. Plale
Book title Provenance and Annotation of Data and Processes
Book subtitle 5th International Provenance and Annotation Workshop, IPAW 2014, Cologne, Germany, June 9-13, 2014 : revised selected papers
ISBN
  • 9783319164618
ISBN (electronic)
  • 9783319164625
Series Lecture Notes in Computer Science
Event 5th International Provenance and Annotation Workshop
Pages (from-to) 168-179
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Electronic notebooks are a common mechanism for scientists to document and investigate their work. With the advent of tools such as IPython Notebooks and Knitr, these notebooks allow code and data to be mixed together and published online. However, these approaches assume that all work is done in the same notebook environment. In this work, we look at generating notebook documentation from multi-environment workflows by using provenance represented in the W3C PROV model.

Specifically, using PROV generated from the Ducktape workflow system, we are able to generate IPython notebooks that include results tables, provenance visualizations, and references to the software and datasets used. The notebooks are interactive and editable, so that the user can explore and analyze the results of the experiment without re-running the workflow.

We identify specific extensions to PROV necessary for facilitating documentation generation. To evaluate our approach, we recreate the documentation website for a paper that won the Open Science Award at the ECML/PKDD 2013 machine learning conference. We show that the documentation produced automatically by our system provides more detail and greater experimental insight than the original hand-crafted documentation. Our approach bridges the gap between user-friendly notebook documentation and provenance generated by distributed, heterogeneous components.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-319-16462-5_13