University of Southampton Institutional Repository

Connecting Scientific Data to Scientific Experiments with Provenance


Miles, Simon, Deelman, Ewa, Groth, Paul, Vahi, Karan, Mehta, Gaurang and Moreau, Luc (2007) Connecting Scientific Data to Scientific Experiments with Provenance. Proceedings of the third IEEE International Conference on e-Science and Grid Computing (e-Science'07). pp. 179-186.

Record type: Conference or Workshop Item (Paper)

Abstract

As scientific workflows, and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources they should use, where those resources should be in preparation for processing etc.) becomes proportionally more difficult. While 'workflow compilers', such as Pegasus, aid greatly in reducing this burden, a further problem arises: as specifying the details of execution is now automatic, a workflow's results are harder to interpret, as they are in part due to the specifics of execution. By automating the steps between the original experiment design and its results, we lose the connection between them, making results harder to interpret. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, input data and intermediary results, but also the abstract experiment, refined into a concrete execution by the 'workflow compiler'. In this paper, we describe our preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.

Text: escience07.pdf - Accepted Manuscript (298kB)

More information

Published date: December 2007
Venue - Dates: Proceedings of the third IEEE International Conference on e-Science and Grid Computing (e-Science'07), 2007-12-01
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 271188
URI: http://eprints.soton.ac.uk/id/eprint/271188
PURE UUID: 2e5f85e4-c44c-457f-9ca6-99f1558a761f
ORCID for Luc Moreau: orcid.org/0000-0002-3494-120X

Catalogue record

Date deposited: 27 May 2010 10:35
Last modified: 14 Mar 2024 09:25

Contributors

Author: Simon Miles
Author: Ewa Deelman
Author: Paul Groth
Author: Karan Vahi
Author: Gaurang Mehta
Author: Luc Moreau

