We address the challenges arising from the exchange of large data volumes between web services in scientific applications by proposing a data pipeline model that reduces workflow execution times and the demand for storage and network resources.
We also propose a technology-agnostic data resource federation architecture that offers a unified view of the data resources, providing an abstraction layer under which independent data storage resources are coordinated. To address scalability and performance issues, we have extended our initial data management architecture by introducing modules that can be deployed on multiple, heterogeneous infrastructures. Next, we investigated how programmable networks can reduce the execution time of data- and I/O-intensive workflows.
To demonstrate the usage of the proposed methods and tools, we have applied them to real-world applications. The performance and usability of our data pipeline model for web services are evaluated with two workflows. Next, we applied our storage federation approach to a well-known data-intensive workflow based on Montage. Finally, we analyze usage data for our storage federation approach collected from the VPH-Share project infrastructure, which is used for executing medical applications.